Thread: JDBC Streaming large objects


JDBC Streaming large objects

From: "Kevin Schroeder"
Date:
Hello,
    I was wondering if the PostgreSQL JDBC driver has the ability to stream
an SQL query to PostgreSQL.  Looking at the archives, it seems that
functionality is not there.  I'm writing a program that needs to generate
the SQL query on the fly, because there will be occasions where the INSERT
statements are larger than the available memory.  If this functionality is
not yet available, it means I'll have to make some modifications to the
JDBC driver, but I'd rather not do that if a method of streaming the query
is already out there.  I also probably don't know what I'm getting into if
I were to try rewriting portions of the driver.

    So, if anyone knows of a way to stream data via an INSERT statement
without running into an OutOfMemoryError, I'd love to hear it.

Thanks
Kevin Schroeder


Re: JDBC Streaming large objects

From: Barry Lind
Date:
Kevin,

Can you give an example of what you are trying to do?  Is it the text of
the query that is too large to hold in memory, or is it the values being
bound that make it too large?

I am not sure the JDBC spec works well for you here.  Even if you are
using PreparedStatements, you need to have all the values in memory when
you call setXXX(), because the SQL can't be sent to the server until
execute() is called, so those values can't be freed and garbage collected
in the meantime.

thanks,
--Barry
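[Editor's sketch of the pattern Barry describes. The `emails` table, its `body` column, and the connection details are hypothetical. Even though setBinaryStream() accepts an InputStream, the driver of that era still had to materialize the whole value before execute() could send the statement, so streaming into the bind does not by itself avoid the memory problem:]

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class InsertSpooledEmail {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details and schema -- adjust to your setup.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mail", "user", "password");
        File spool = new File(args[0]);  // the email, already spooled to disk
        try (InputStream in = new FileInputStream(spool);
             PreparedStatement ps = conn.prepareStatement(
                     "INSERT INTO emails (body) VALUES (?)")) {
            // The value is bound as a stream, but the driver may still read
            // the entire stream into memory before sending it on execute().
            ps.setBinaryStream(1, in, (int) spool.length());
            ps.executeUpdate();
        }
        conn.close();
    }
}
```

[Binding from a spooled file removes the second in-memory copy, but as long as the driver buffers the bound value, one copy the size of the message still has to fit on the heap.]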





Re: JDBC Streaming large objects

From: "Kevin Schroeder"
Date:
It is the values themselves that are too large.  The query inserts an
email into the database.  99.999999% of the time it's not an issue, since
most emails aren't that large, but if someone sends a large email (20+ MB)
you have a copy of it in memory, and when you add it to the SQL statement
you have it there again, so it's using 40 MB worth of memory (roughly, of
course).  I was able to free up half of the memory by swapping the email
to the hard drive and using the setXXX() function, but there will still be
cases where Java will run out of memory.  I can increase the heap size,
which solves the problem for me, but most people probably won't think of
memory issues when they install the software, or they may have other
applications that require high memory usage on the same machine.

Since I posted my first message I have put in a workaround.  It doesn't do
what I'd like, but at least it notifies people that there was a problem, by
catching OutOfMemoryError.
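[Editor's sketch of this stop-gap. `tryMaterialize` and the byte counts are hypothetical stand-ins for the real message-handling code; note that OutOfMemoryError is an Error, not an Exception, so it has to be caught explicitly:]

```java
public class OomGuard {
    // Attempt the allocation the driver would effectively perform when the
    // whole message is bound as a value; report failure instead of crashing.
    static boolean tryMaterialize(long bytes) {
        try {
            byte[] body = new byte[(int) Math.min(bytes, (long) Integer.MAX_VALUE)];
            return body.length >= 0;  // allocation succeeded
        } catch (OutOfMemoryError e) {
            // Stop-gap: tell the caller there was a problem rather than dying.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryMaterialize(1024));              // true: a small message fits
        System.out.println(tryMaterialize(Integer.MAX_VALUE)); // false: allocation fails, error is caught
    }
}
```

[This only reports the failure; after an OutOfMemoryError the JVM may be left in a degraded state, so it is a notification mechanism rather than a fix, matching Kevin's description of it as a workaround.]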

If the current JDBC spec can't handle the kind of query I'm doing, perhaps
someone can point me to the code in the PostgreSQL driver that I could use
to build a custom interface that creates the SQL statement on the fly,
pulling the data from the hard drive while sending it to PostgreSQL.  And
perhaps at some point in the future that feature, or something like it,
could be added to the JDBC driver.

Kevin

----- Original Message -----
From: "Barry Lind" <blind@xythos.com>
To: "Kevin Schroeder" <kschroeder@mirageworks.com>
Cc: <pgsql-jdbc@postgresql.org>
Sent: Friday, September 12, 2003 8:39 PM
Subject: Re: [JDBC] JDBC Streaming large objects


> Kevin,
>
> Can you give an example of what you are trying to do?  Is it the text of
> the query that is too large to hold in memory, or is the the values that
> are being bound that make it too large?
>
> I am not sure that the jdbc spec works well for you.  Since even if you
> are using PreparedStatements, you need to have all the values in memory
> when you call setXXX(), since until execute() is called the SQL can't be
> sent to the server, and these values can't be freed and garbage collected.
>
> thanks,
> --Barry
>
>
>
>
> Kevin Schroeder wrote:
> > Hello,
> >     I was wondering if the PostgreSQL JDBC driver has the ability to
stream
> > an SQL query to PostgreSQL.  Looking at the archive it seems as though
that
> > functionality is not there.  I'm writing a program that needs to have
the
> > ability to generate the SQL query on the fly because there will be
occasions
> > where the INSERT statements will be larger than the available memory.
If
> > this functionality is not yet available it means that I'll have to make
some
> > modifications to the JDBC driver but I'd rather not do that if there is
a
> > method of streaming the query already out there.  I also probably don't
know
> > what I'm getting into if I were to try rewriting portions of the driver.
> >
> >     So, if anyone knows if there is a way to stream data via an INSERT
> > statement without running into the OutOfMemoryError I'd love to hear it.
> >
> > Thanks
> > Kevin Schroeder
> >
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 9: the planner will ignore your desire to choose an index scan if
your
> >       joining column's datatypes do not match
> >
>
>
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org


Re: JDBC Streaming large objects

From: Andreas Prohaska
Date:
If the values are too large, have you tried storing them as large
objects?  You can then use Postgres' own Large Object API, which allows
you to stream binary data to the backend without having to keep a copy
of the data in memory.

I had the same problem/question just a few days ago.  See the thread
"Difference between Blob and Large Object API".
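[Editor's sketch of the approach Andreas describes. The connection details and the 8 KB chunk size are hypothetical; the class and method names (`PGConnection.getLargeObjectAPI()`, `LargeObjectManager.createLO()`, `LargeObject.getOutputStream()`) come from the pgjdbc `org.postgresql.largeobject` package, and the exact API in the 2003 driver differed slightly:]

```java
import java.io.FileInputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.sql.Connection;
import java.sql.DriverManager;

import org.postgresql.PGConnection;
import org.postgresql.largeobject.LargeObject;
import org.postgresql.largeobject.LargeObjectManager;

public class StoreEmailAsLargeObject {
    // Copy in fixed-size chunks so only one small buffer is in memory at a time.
    static void copy(InputStream in, OutputStream out) throws Exception {
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) > 0) {
            out.write(buf, 0, n);
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical connection details -- adjust to your setup.
        Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/mail", "user", "password");
        conn.setAutoCommit(false);  // large object calls must run in a transaction
        LargeObjectManager lom = ((PGConnection) conn).getLargeObjectAPI();
        long oid = lom.createLO(LargeObjectManager.READ | LargeObjectManager.WRITE);
        LargeObject obj = lom.open(oid, LargeObjectManager.WRITE);
        try (InputStream in = new FileInputStream(args[0])) {
            OutputStream out = obj.getOutputStream();
            copy(in, out);  // streams to the backend chunk by chunk
            out.close();
        }
        obj.close();
        // Store `oid` in an oid column of your table so the email can be found again.
        conn.commit();
        conn.close();
    }
}
```

[The large object's oid then goes into an ordinary table column, and the message body never has to exist in the client's heap all at once.]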


> ----- Original Message -----
> From: Kevin Schroeder [mailto:kschroeder@mirageworks.com]
> Sent: Saturday, September 13, 2003 13:55
> To: pgsql-jdbc@postgresql.org
> Subject: Re: [JDBC] JDBC Streaming large objects

Re: JDBC Streaming large objects

From: "Kevin Schroeder"
Date:
I was wondering about the Large Object API.  I saw it in the JDBC source
code and found some documentation, but the code I saw seemed to handle it
in a similar fashion to the JDBC method, except that the data was written
into a buffer beforehand.  I'll take a closer look and see if I can kick
it hard enough to make it work.  ;-)

Thanks
Kevin

----- Original Message -----
From: "Andreas Prohaska" <ap@apeiron.de>
To: "'Kevin Schroeder'" <kschroeder@mirageworks.com>;
<pgsql-jdbc@postgresql.org>
Sent: Saturday, September 13, 2003 8:42 AM
Subject: Re: [JDBC] JDBC Streaming large objects

