Re: setBlob loop performance?
From: David Wall
Subject: Re: setBlob loop performance?
Date:
Msg-id: 004201c24e4a$c191cee0$3201a8c0@expertrade.com
In reply to: setBlob loop performance? ("David Wall" <dwall@Yozons.com>)
List: pgsql-jdbc
> Both of these issues look like bugs to me. Please submit a patch. I
> agree that using a 4 or 8k buffer for reading/writing between the
> streams is much better than how the code is currently implemented.
> Also, setBlob should be calling close on the input stream when it is
> done.

Oddly enough, I don't have the capability to test this just yet, but I
thought I'd share the code that I think would work better for
org.postgresql.jdbc2.PreparedStatement. Of course, if anybody sees anything
screwy, please let me know.

The main changes are the closing of the two streams in the finally block, so
that the streams are closed even if an exception is thrown, and the primary
loop now reads/writes 4k blocks.

I hope to be able to test shortly, but we're in the process of converting
the database to 7.2.2 and the sources from getBytes/setBytes to
getBlob/setBlob (we thought this would be easier than converting existing
OIDs to bytea in the db).

David

    public void setBlob(int i, Blob x) throws SQLException
    {
        InputStream l_inStream = x.getBinaryStream();
        LargeObjectManager lom = connection.getLargeObjectAPI();
        int oid = lom.create();
        LargeObject lob = lom.open(oid);
        OutputStream los = lob.getOutputStream();
        byte[] buf = new byte[4096];
        try
        {
            // The input could be wrapped in a BufferedInputStream, but the
            // OutputStream returned by LargeObject is buffered internally
            // anyhow, so there would be no performance boost gained -- if
            // anything it would be worse!
            int bytesRemaining = (int)x.length();
            int numRead = l_inStream.read(buf, 0, Math.min(buf.length, bytesRemaining));
            while (numRead != -1 && bytesRemaining > 0)
            {
                bytesRemaining -= numRead;
                los.write(buf, 0, numRead);
                numRead = l_inStream.read(buf, 0, Math.min(buf.length, bytesRemaining));
            }
        }
        catch (IOException se)
        {
            throw new PSQLException("postgresql.unusual", se);
        }
        finally
        {
            // Close each stream in its own try so that a failure closing
            // one doesn't prevent closing the other.
            try { los.close(); } catch (Exception e) {}
            try { l_inStream.close(); } catch (Exception e) {}
        }
        // lob is closed by the stream so don't call lob.close()
        setInt(i, oid);
    }

> Is there anything else that can be done to improve this? I have always
> found the jdbc spec for Blobs to be limited in the blob creation area.
> Is there anything you would like to see in the driver to make this
> easier for you?

I don't think so, but it's interesting to see the API and wonder if anybody
actually coded around it. For example, assume you have a nice InputStream to
a blob on disk. Getting that into a Blob field would be a pain, since you'd
have to wrap the InputStream in your own object that implements Blob and
then have that interface drive the blob insertion. If you already have a
byte array, you still have to wrap it to insert it as a blob (a rough sketch
of such a wrapper is below). I suppose a really smart JDBC driver could use
setBytes()/getBytes() and simply handle the conversions internally if the
data field was actually a blob and not a longvarbinary.

It's also interesting that the interface returns all numbers as a long,
including the length, but getBytes() requires the length to be an int,
meaning you can't necessarily suck in an entire blob in a single call to
getBytes().
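For illustration, here is a minimal sketch of the kind of wrapper I mean,
assuming the JDBC 2.0 Blob interface (ByteArrayBlob is just a made-up name;
JDBC 3.0 adds a few more methods that would need similar stubs). The setBlob()
above only ever calls length() and getBinaryStream(), so the position()
methods can simply punt:

    import java.io.ByteArrayInputStream;
    import java.io.InputStream;
    import java.sql.Blob;
    import java.sql.SQLException;

    // Read-only Blob backed by an in-memory byte array.
    public class ByteArrayBlob implements Blob
    {
        private byte[] data;

        public ByteArrayBlob(byte[] data)
        {
            this.data = data;
        }

        public long length()
        {
            return data.length;
        }

        public InputStream getBinaryStream()
        {
            return new ByteArrayInputStream(data);
        }

        public byte[] getBytes(long pos, int length)
        {
            // JDBC blob positions are 1-based.
            byte[] out = new byte[length];
            System.arraycopy(data, (int)(pos - 1), out, 0, length);
            return out;
        }

        public long position(byte[] pattern, long start) throws SQLException
        {
            throw new SQLException("position() not implemented");
        }

        public long position(Blob pattern, long start) throws SQLException
        {
            throw new SQLException("position() not implemented");
        }
    }

Inserting a byte array then becomes a one-liner:

    ps.setBlob(1, new ByteArrayBlob(myBytes));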
Oh well...

David

> David Wall wrote:
>
> >In the 7.2.2 codeset, PreparedStatement.setBlob() shows a loop as it reads
> >a byte from the input stream (the blob) and writes it to the output stream
> >(PG's LO routines).
> >
> >This seems highly inefficient since most large objects are, well, large...
> >So if I want to insert a 1MB image, this will loop a million times. Is
> >there a reason it's not read in chunks (even a 4096-sized array would
> >reduce such a loop down to 250 iterations)?
> >
> >This is much worse than the 7.1 code, which simply took my byte array and
> >wrote it all to the LargeObject stream in one call.
> >
> >+++
> >    public void setBlob(int i, Blob x) throws SQLException
> >    {
> >        InputStream l_inStream = x.getBinaryStream();
> >        int l_length = (int) x.length();
> >        LargeObjectManager lom = connection.getLargeObjectAPI();
> >        int oid = lom.create();
> >        LargeObject lob = lom.open(oid);
> >        OutputStream los = lob.getOutputStream();
> >        try
> >        {
> >            // could be buffered, but then the OutputStream returned by
> >            // LargeObject is buffered internally anyhow, so there would
> >            // be no performance boost gained, if anything it would be
> >            // worse!
> >            int c = l_inStream.read();
> >            int p = 0;
> >            while (c > -1 && p < l_length)
> >            {
> >                los.write(c);
> >                c = l_inStream.read();
> >                p++;
> >            }
> >            los.close();
> >        }
> >        catch (IOException se)
> >        {
> >            throw new PSQLException("postgresql.unusual", se);
> >        }
> >        // lob is closed by the stream so don't call lob.close()
> >        setInt(i, oid);
> >    }
> >+++
> >
> >Since getBinaryStream() returns an InputStream, should this routine close
> >that InputStream once it's done, or does the Blob itself have to somehow
> >know that a stream it creates can be closed and discarded (and if so,
> >how?)?