Discussion: OutOfMemory


OutOfMemory

From
postgres@nitwit.de
Date:
Hi!

I'm not sure whether this is a Java or a DB issue. I select a lot of data
from the database, more than would fit into memory - and get an
OutOfMemoryException.

Well, why is this? Isn't this exactly what the idea of an Iterator is about,
that the data is fetched progressively and not put into memory all at once?

For now I have to run the query multiple times using LIMIT/OFFSET
(manually adapted to the amount of RAM of the host machine...).

Timo

Re: OutOfMemory

From
"Wagner,Harry"
Date:
I'm seeing the following message in my postgres log:

pq_recvbuf: unexpected EOF on client connection

I haven't been able to identify what is causing it, or how to recreate it.
Does anyone know where to start looking? I'm running 7.3.3. The application
opens and closes its own connections (no connection pooling) and runs under
Tomcat 5.0.18.

Thanks... harry

Re: OutOfMemory

From
Alexander Staubo
Date:
Earlier versions of the PostgreSQL JDBC driver do not stream data from
the back end: they fetch everything in one go.

The PostgreSQL 7.4 JDBC driver supports JDBC's setFetchSize()
operation, and will use PostgreSQL cursors internally. If you set the
fetch size to something >0, it will correctly [*] and transparently
stream data on demand. The driver works perfectly with earlier
versions of PostgreSQL.
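
As a concrete illustration, here is a minimal sketch (mine, not from the
original mail) of that usage; the table name "bar" and the batch size of 200
are placeholders, and setAutoCommit(false) is needed because, as noted later
in this thread, the driver only uses a cursor inside a transaction:

   conn.setAutoCommit(false);           // cursor fetching only works inside a transaction
   Statement stmt = conn.createStatement();
   stmt.setFetchSize(200);              // >0 enables cursor-backed, on-demand fetching
   ResultSet rs = stmt.executeQuery("select * from bar");
   while (rs.next()) {
       // ... process one row at a time; rows arrive in batches of 200 ...
   }
   conn.commit();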

With earlier versions of the driver, you can emulate the behaviour by
first doing this:

   stmt.executeUpdate("declare foo cursor for select * from bar");

and then for each batch, as an executeQuery():

   rs = stmt.executeQuery("fetch forward 200 from foo");

and when you're done with the cursor,

   stmt.executeUpdate("close foo");
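
Putting those three pieces together, a hedged sketch (mine, not from the
original mail) of the complete emulation loop; it assumes autocommit is off
so the cursor stays open between fetches:

   conn.setAutoCommit(false);
   Statement stmt = conn.createStatement();
   stmt.executeUpdate("declare foo cursor for select * from bar");
   while (true) {
       ResultSet rs = stmt.executeQuery("fetch forward 200 from foo");
       int rows = 0;
       while (rs.next()) {
           rows++;
           // ... process the row ...
       }
       if (rows == 0) break;            // an empty batch means the cursor is exhausted
   }
   stmt.executeUpdate("close foo");
   conn.commit();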

[*] Beware of transactions with many queries. The JDBC driver never
explicitly closes its cursors, and instead relies on the back end to
close them when the transaction is committed or aborted. In my
testing, the back end consistently runs out of memory in such cases.
We are, however, talking about thousands of queries. For all I know
this may have been fixed after the 7.4.1 release.

Alexander.

on 2004-03-29 17:42 postgres@nitwit.de wrote:

> Hi!
>
> I'm not sure whether this is a Java or a DB issue. I select a lot of data
> from the database, more than would fit into memory - and get an
> OutOfMemoryException.
>
> Well, why is this? Isn't this exactly what the idea of an Iterator is about,
> that the data is fetched progressively and not put into memory all at once?
>
> For now I have to run the query multiple times using LIMIT/OFFSET
> (manually adapted to the amount of RAM of the host machine...).
>
> Timo


Re: OutOfMemory

From
Guido Fiala
Date:
On Monday, 29 March 2004 18:30, Alexander Staubo wrote:
> Earlier versions of the PostgreSQL JDBC driver do not stream data from
> the back end: they fetch everything in one go.
>
> The PostgreSQL 7.4 JDBC driver supports JDBC's setFetchSize()
> operation, and will use PostgreSQL cursors internally. If you set the
> fetch size to something >0, it will correctly [*] and transparently
> stream data on demand. The driver works perfectly with earlier
> versions of PostgreSQL.
>
> [...]

By chance I'm currently at the same point; unfortunately I don't get it
working as expected.

- after calling ResultSet.last(), getRow() still reflects the fetch size (how
do I get the number of records, e.g. for a progress bar?)

- calling ResultSet.next() at the "last" fetched record does not fetch more
results automatically, and Statement.getMoreResults() gives me a null pointer -
how do I actually get the next fetch?

- according to the documentation, only FETCH_FORWARD is supported, which is
not always suitable

Is it significant that it only works if
Connection.setAutoCommit(false) is used?
I had a quick look through the sources and found that the term
"server-prepared-statement" is used under certain conditions - what is this
for?

I was also thinking about using "SELECT ... LIMIT x OFFSET y" instead, but
this might lead to unexpected side effects if multiple users are changing
data - the user then gets only a momentary snapshot, and if the order changes
in between, some records will never be seen, others twice, and so on.

Any good ideas for solving this?

(I'm using postgres 7.3 + a recent snapshot of the JDBC driver from CVS)

Guido


Re: OutOfMemory

From
Dave Cramer
Date:
Guido,

One thing of note: for the driver to use a cursor-based fetch, it must be
inside a transaction - in other words, setAutoCommit(false).

Dave
On Tue, 2004-03-30 at 07:16, Guido Fiala wrote:
> [...]
>
> Is it significant that it only works if
> Connection.setAutoCommit(false) is used?
>
> [...]
--
Dave Cramer
519 939 0336
ICQ # 14675561


Re: OutOfMemory

From
Oliver Jowett
Date:
Guido Fiala wrote:
> By chance I'm currently at the same point; unfortunately I don't get it
> working as expected.
>
> - after calling ResultSet.last(), getRow() still reflects the fetch size (how
> do I get the number of records, e.g. for a progress bar?)

That sounds like a bug; last() should take you to the very last record
of the resultset independent of the fetchsize. Can you submit a testcase
showing this?

(... later ...) Hold on -- doesn't last() require a scrollable
resultset, which means you're not using a cursor behind the scenes? I
think the driver does not throw an exception if you try to use
last()/absolute()/etc. with TYPE_FORWARD_ONLY (it should!) but instead
just gives you the wrong results... not great, but this is not only the
fault of the driver :)

> - calling ResultSet.next() at the "last" fetched record does not fetch more
> results automatically, and Statement.getMoreResults() gives me a null pointer -
> how do I actually get the next fetch?

It should be transparent; i.e., last() should take you to the *very end*
of the resultset, fetching forward until there are no more results. You
shouldn't ever see the boundaries between fetches (other than as a
slight delay on next()).

> - according to the documentation, only FETCH_FORWARD is supported, which is
> not always suitable

This is simply because no-one has implemented the logic to support
scrollable resultsets backed by a cursor yet. Patches are welcome!

> Is it significant that it only works if
> Connection.setAutoCommit(false) is used?

Yes -- otherwise we'd need to use a cursor WITH HOLD and manage it more
carefully. Currently we rely on end-of-transaction closing the cursors,
and that scheme doesn't work with autocommit on so we don't use cursors
in that case.

This is actually a tradeoff between storing the resultset on the backend
and storing it in the Java process -- WITH HOLD is not free; someone has
to store the data once the transaction is gone. I think the backend does
a better job of paging out results etc., though.
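
For comparison, a hedged sketch (mine, not current driver behaviour) of what
the WITH HOLD alternative would look like if managed by hand; "foo" and "bar"
are placeholder names:

   conn.setAutoCommit(false);
   Statement s = conn.createStatement();
   s.executeUpdate("declare foo cursor with hold for select * from bar");
   conn.commit();                       // the cursor survives the commit
   ResultSet rs = s.executeQuery("fetch forward 200 from foo");
   // ... the backend now has to keep the remaining rows around ...
   s.executeUpdate("close foo");        // with hold, it must be closed explicitly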

> I had a quick look through the sources and found that the term
> "server-prepared-statement" is used under certain conditions - what is this
> for?

That's unrelated to cursor use; this is to do with transforming
repeatedly executed queries into a PREPARE/EXECUTE form. It's not on by
default.
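
For illustration only, a hedged sketch (my code, not the driver's internals)
of the PREPARE/EXECUTE form such a transformation produces; the statement
name "getbar" is made up:

   Statement s = conn.createStatement();
   s.executeUpdate("prepare getbar(bigint) as select * from tst_fetchsize where id = $1");
   ResultSet r1 = s.executeQuery("execute getbar(1)"); // parsed and planned once,
   ResultSet r2 = s.executeQuery("execute getbar(2)"); // executed many times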

> I was also thinking about using "SELECT ... LIMIT x OFFSET y" instead, but
> this might lead to unexpected side effects if multiple users are changing
> data - the user then gets only a momentary snapshot, and if the order changes
> in between, some records will never be seen, others twice, and so on.
>
> Any good ideas for solving this?

Wrap your queries in a transaction if you want transactional isolation.
Also you really want an ORDER BY if you're using LIMIT/OFFSET.
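
A hedged sketch of that advice (mine, not from the mail), assuming a table
"bar" with a unique "id" column to order by; running all pages inside one
SERIALIZABLE transaction keeps them on a single snapshot:

   conn.setAutoCommit(false);
   conn.setTransactionIsolation(Connection.TRANSACTION_SERIALIZABLE);
   Statement st = conn.createStatement();
   for (int offset = 0; ; offset += 200) {
       ResultSet rs = st.executeQuery(
           "select * from bar order by id limit 200 offset " + offset);
       int rows = 0;
       while (rs.next()) {
           rows++;
           // ... process the row ...
       }
       if (rows < 200) break;           // a short page is the last page
   }
   conn.commit();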

-O

Re: OutOfMemory

From
Guido Fiala
Date:
Oliver Jowett wrote:
>That sounds like a bug; last() should take you to the very last record
>of the resultset independent of the fetchsize. Can you submit a testcase
>showing this?

>(... later ...) Hold on -- doesn't last() require a scrollable
>resultset, which means you're not using a cursor behind the scenes? I
>think the driver does not throw an exception if you try to use
>last()/absolute()/etc. with TYPE_FORWARD_ONLY (it should!) but instead
>just gives you the wrong results... not great, but this is not only the
>fault of the driver :)

Yeah - at first I simply called setFetchSize(20) after creating a Statement
with conn.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE, ...).
It turned out to make no difference; the driver still fetched all results
into memory, which isn't easy to notice unless one uses the debugger or fills
in so much data that the OutOfMemory comes when the fetch size is not working.

So I walked through the source tree and saw that it only uses the
fetchsize (via "FETCH FORWARD ...") if the ResultSet type is set to
TYPE_FORWARD_ONLY.
This does indeed fetch only fetchsize rows, but then I was stuck... so here
is my test case:

---
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class TestFetchSize {
    // to be run with standard java memory settings: (-Xmx64m)
    public static void main(String[] args) {
        try {
            Class.forName("org.postgresql.Driver");
        } catch (ClassNotFoundException e1) {
            e1.printStackTrace();
        }
        try {
            // set-up:
            Connection conn = DriverManager.getConnection(
                    "jdbc:postgresql://<server>:5432/<db>", "<user>", "<password>");
            Statement ct = conn.createStatement();
            try { ct.execute("DROP TABLE tst_fetchsize"); } catch (SQLException x) {}
            ct.execute("CREATE TABLE tst_fetchsize ( id bigserial primary key, a varchar );");
            Statement cf = conn.createStatement(
                    ResultSet.TYPE_SCROLL_INSENSITIVE, ResultSet.CONCUR_UPDATABLE);
            ResultSet ins = cf.executeQuery("SELECT * from tst_fetchsize LIMIT 0");
            String data = new String(new byte[3000000]);
            data = data.replace('\0', ' '); // very long and empty String
            System.out.println("string-length=" + data.length());
            final int count = 10;
            for (int i = 0; i < count; i++) {
                ins.moveToInsertRow();
                ins.updateString(2, data); // create very large records!
                ins.insertRow();
            }

            // test forward-only
            System.out.println("test: TYPE_FORWARD_ONLY");
            conn.setAutoCommit(false); // to allow cursors below
            Statement st = conn.createStatement(
                    ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
            st.setFetchSize(1);
            ResultSet rs = st.executeQuery("Select * from tst_fetchsize");
            rs.last(); // now we should be at "count"
            System.out.println("getRow=" + rs.getRow() + " should be=" + count);
            rs.beforeFirst();
            while (rs.next()) System.out.println("this is row " + rs.getString(1));
            System.out.println("now backward:");
            while (rs.previous()) System.out.println("this is row " + rs.getString(1));

            // test scroll*
            System.out.println("test: TYPE_SCROLL_SENSITIVE, it should not give an"
                    + " OutOfMemory exception if we fetch indeed just one record as above!");
            conn.setAutoCommit(false); // to allow cursors below
            st = conn.createStatement(
                    ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_READ_ONLY);
            st.setFetchSize(1);
            rs = st.executeQuery("Select * from tst_fetchsize");
            rs.last(); // now we should be at "count"
            System.out.println("getRow=" + rs.getRow() + " should be=" + count
                    + " and is, however, the driver already fetched all rows here!!!");
            rs.beforeFirst();
            while (rs.next()) System.out.println("this is row " + rs.getString(1));
            System.out.println("now backward:");
            while (rs.previous()) System.out.println("this is row " + rs.getString(1));

            // cleanup:
            conn.setAutoCommit(true);
            ct.execute("DROP TABLE tst_fetchsize");
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}

---

>Yes -- otherwise we'd need to use a cursor WITH HOLD and manage it more
>carefully. Currently we rely on end-of-transaction closing the cursors,
>and that scheme doesn't work with autocommit on so we don't use cursors
>in that case.

OK, understood - the database already holds all the result data of a query
in memory.

>This is actually a tradeoff between storing the resultset on the backend
>and storing it in the Java process -- WITH HOLD is not free; someone has
>to store the data once the transaction is gone. I think the backend does
>a better job of paging out results etc., though.

So it is - especially with blobs, as JDBC needs 5 to 8 times the size of
each blob because:

1. the whole result is stored in a Unicode stream buffer (including the bytea
part, which is only ASCII)
2. quoting half the characters of the blob with one (') to three (\)
characters is required.

Without using a fetchsize, the Java memory is soon exhausted...

Wouldn't it be possible to stream the bytea parts at least directly into
byte[] arrays as they come in over the socket, somehow? (The problem here is
that we don't know the size of the bytea, so we would need a "ByteBuffer"
which can grow...)
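
A minimal sketch of that growable-buffer idea (mine, not actual driver code):
in plain Java, a ByteArrayOutputStream from java.io already grows as bytes of
unknown total length arrive:

   static byte[] readAll(InputStream in) throws IOException {
       ByteArrayOutputStream buf = new ByteArrayOutputStream(); // grows as needed
       byte[] chunk = new byte[8192];
       int n;
       while ((n = in.read(chunk)) != -1)
           buf.write(chunk, 0, n);      // append only the bytes actually read
       return buf.toByteArray();        // exact-size byte[] at the end
   }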

>> I was also thinking about using "SELECT ... LIMIT x OFFSET y" instead, but
>> this might lead to unexpected side effects if multiple users are changing
>> data - the user then gets only a momentary snapshot, and if the order changes
>> in between, some records will never be seen, others twice, and so on.
>>
>> Any good ideas for solving this?

>Wrap your queries in a transaction if you want transactional isolation.

Mmm, I need to test this - but if that works, why does the driver use CURSORS
(which also work only within a transaction) and not LIMIT/OFFSET instead?
In that case, neither the frontend nor the backend would have to copy data
(into memory) that may eventually never be fully read.

>Also you really want an ORDER BY if you're using LIMIT/OFFSET.

That was just an example to point out that, between multiple SELECTs,
something could change the OFFSET of the next data to be fetched... but
snapshots are just snapshots, and nothing more than "READ COMMITTED" can be
done here.


Re: OutOfMemory

From
Oliver Jowett
Date:
Guido Fiala wrote:
> Oliver Jowett wrote:
>
>>(... later ...) Hold on -- doesn't last() require a scrollable
>>resultset, which means you're not using a cursor behind the scenes? I
>>think the driver does not throw an exception if you try to use
>>last()/absolute()/etc. with TYPE_FORWARD_ONLY (it should!) but instead
>>just gives you the wrong results... not great, but this is not only the
>>fault of the driver :)

[...]

> So I walked through the source tree and saw that it only uses the
> fetchsize (via "FETCH FORWARD ...") if the ResultSet type is set to
> TYPE_FORWARD_ONLY.
> This does indeed fetch only fetchsize rows, but then I was stuck... so here
> is my test case:

[...]

>             // test forward-only
>             System.out.println("test: TYPE_FORWARD_ONLY");
>             conn.setAutoCommit(false); // to allow cursors below
>             Statement st = conn.createStatement(
>                     ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
>             st.setFetchSize(1);
>             ResultSet rs = st.executeQuery("Select * from tst_fetchsize");
>             rs.last(); // now we should be at "count"

Indeed, here's your problem -- you're not allowed to call last() on a
TYPE_FORWARD_ONLY resultset. The driver should throw an exception at
this point (but currently doesn't). See the ResultSet.last() javadoc for
details.

-O

Re: OutOfMemory

From
Guido Fiala
Date:
On Thursday, 1 April 2004 09:14, Oliver Jowett wrote:
[...]
> > st = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
> > st.setFetchSize(1);
> > ResultSet rs = st.executeQuery("Select * from tst_fetchsize");
> > rs.last(); // now we should be at "count"
>
> Indeed, here's your problem -- you're not allowed to call last() on a
> TYPE_FORWARD_ONLY resultset. The driver should throw an exception at
> this point (but currently doesn't). See the ResultSet.last() javadoc for
> details.

OK - that's one part, but why does the second part of my test case not work
as expected - meaning the fetchsize has no effect?

Re: OutOfMemory

From
Oliver Jowett
Date:
Guido Fiala wrote:
> On Thursday, 1 April 2004 09:14, Oliver Jowett wrote:
> [...]
>
>>> st = conn.createStatement(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
>>> st.setFetchSize(1);
>>> ResultSet rs = st.executeQuery("Select * from tst_fetchsize");
>>> rs.last(); // now we should be at "count"
>>
>>Indeed, here's your problem -- you're not allowed to call last() on a
>>TYPE_FORWARD_ONLY resultset. The driver should throw an exception at
>>this point (but currently doesn't). See the ResultSet.last() javadoc for
>>details.
>
>
> OK - that's one part, but why does the second part of my test case not work
> as expected - meaning the fetchsize has no effect?

This is the part where you use TYPE_SCROLL_SENSITIVE + setFetchSize()
and see the driver pulling the whole resultset into memory at once?

setFetchSize() is a hint. As no-one has implemented cursor-backed
support for scrollable resultsets yet, the driver ignores the hint and
fetches the whole result set anyway.

Patches to improve this behaviour are, of course, welcome :)

-O