ResultSet construction and byte arrays

From: Jan Oksfeldt Jonasen
Subject: ResultSet construction and byte arrays
Date:
Msg-id: 5.2.1.1.0.20040206112237.02b99a60@mail.northmann.com
Responses: Re: ResultSet construction and byte arrays
           Re: ResultSet construction and byte arrays
List: pgsql-jdbc
I can see from the archives that this has come up more than once, but I'd
still like to take a stab at it.
Constructing ResultSets from searches with many results, or from large
tables, has some issues, the most noticeable being that everything found is
loaded into memory as byte arrays.

I'm currently using PostgreSQL in the build provided by dbExperts (7.3.2)
with a Win2k database server and a Java client application. I ran into
serious performance problems when running such searches with somewhat
large result sets, so I decided to use a profiler and the latest driver
source from CVS. The testing showed two major hotspots: garbage collection
and data loading from the server.

Looking at the source, the problems turn out to come from two PGStream
methods:
public byte[][] ReceiveTupleV2(int nf, boolean bin)
and
public byte[] Receive(int siz)
The first creates two-dimensional byte arrays, the second one-dimensional
ones. A huge number of them are created: in a test on a table with 67
columns and ~15,000 tuples, nearly 2 million byte arrays are allocated (the
operation takes 20+ seconds). All of them are put into a Vector which in
turn ends up in a ResultSet implementation, which I believe is an
AbstractJdbc1ResultSet.
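To make the scale concrete, the allocation pattern can be sketched as below. This is an illustration of the pattern described above, not the driver's actual code; it accounts for roughly R*(C+1) arrays (one byte[] per field plus one byte[][] per row), so the nearly 2 million observed presumably also includes copies the driver makes on top.

```java
import java.util.ArrayList;
import java.util.List;

public class TupleAllocDemo {
    static long arraysAllocated = 0;

    // Mimics the ReceiveTupleV2 pattern: one byte[][] per tuple,
    // plus one byte[] per field value.
    static byte[][] receiveTuple(int nf, int fieldSize) {
        byte[][] tuple = new byte[nf][];      // 1 row array
        arraysAllocated++;
        for (int i = 0; i < nf; i++) {
            tuple[i] = new byte[fieldSize];   // nf field arrays
            arraysAllocated++;
        }
        return tuple;
    }

    public static void main(String[] args) {
        // Stands in for the Vector the driver fills and holds until close().
        List<byte[][]> rows = new ArrayList<>();
        int nRows = 15000, nCols = 67;
        for (int r = 0; r < nRows; r++) {
            rows.add(receiveTuple(nCols, 16));
        }
        // 15000 * (67 + 1) = 1,020,000 arrays, all live at once
        System.out.println(arraysAllocated);
    }
}
```

All of those arrays stay reachable through the list until it is dropped, which matches the memory behaviour described above.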

This is a lot of data to keep in memory. Since it's possible to have
multiple search results open (e.g. in a Swing client view), these arrays
become a major problem until ResultSet.close() is called, which nulls out
the vector. Once that happens the garbage collector kicks in and again
lowers performance, albeit for a short while.

Based on the source I've tried a few things to improve this, for instance
adding a byte-array pool for both the one- and two-dimensional arrays.
While that certainly speeds up loading (reuse beats the new operator, and
there is less GC activity), it doesn't help with memory consumption, which
eventually crashes the VM.

Are there any ideas on how to fix this? Are there plans to look into it?

On Oracle (same client, slightly different code) I had a major problem
using ResultSet.absolute(): since Oracle natively has only forward cursors,
the cursor would be rolled forward behind the scenes, flipping over and
scrolling forward again if you used absolute() to go backwards. For
instance, being at row 10,000 and going back to 9,999 would make the cursor
scroll forward, flip, and scroll forward again. This created a huge amount
of temporary object creation, besides the operation just being painfully
slow.

To fix this I created wrappers for several java.sql interfaces, making sure
I had full control over which ResultSet implementation was being used. The
idea was to utilize an expression similar to this (taken from an article at
OTN):
select *
from (
  select name.*, rownum rnum
  from ( <everything in a query, including order by clause> ) name
  where rownum <= MAX_ROWS
)
where rnum >= MIN_ROWS
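The wrapping step can be sketched as a small helper that splices an arbitrary ordered query into the template above. The class and method names here are my own invention, purely for illustration:

```java
public class RownumPager {
    // Wraps an arbitrary ordered query in the OTN-style rownum window,
    // returning rows [minRow, maxRow] (1-based, inclusive).
    static String paginate(String innerQuery, int minRow, int maxRow) {
        return "select * from ( select name.*, rownum rnum from ( "
             + innerQuery
             + " ) name where rownum <= " + maxRow
             + " ) where rnum >= " + minRow;
    }

    public static void main(String[] args) {
        System.out.println(
            paginate("select id from emp order by id", 9999, 10098));
    }
}
```

In real code the bounds would be bound parameters rather than concatenated, but the shape of the generated statement is the same.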

With the wrappers I could make sure only a certain number of rows were
fetched, by maintaining some state about the current min and max rows. The
downside to this solution is the additional queries, but it's still a lot
faster than the cursor movement.

If it's possible to write an expression like the above in Postgres, I think
that could also be the solution to not loading everything in a result set:
the result set would re-query the database once a row above its current max
was requested. So, how would I write such an expression?

As a last alternative, are there any commercial drivers available that
aren't just wrappers around the standard one? The ones I can find at Sun
all seem to be.


Best regards
Jan Oksfeldt Jonasen



In pgsql-jdbc by date:

Previous
From: "Marcus Andree S. Magalhaes"
Date:
Message: Re: killing idle postgres instances
Next
From: Andrew Sullivan
Date:
Message: Re: killing idle postgres instances