Batches of single-insert statements vs batches of multi-insert statements
I tried all sorts of query rewrites to speed up a huge sequence of insert statements, and here are my findings. Everything was run on a local Postgres 9.5.3, on a 64-bit Windows box with 32GB RAM and an i7 @ 3.6GHz.
1. Batches of single insert statements (original case):
Batch of: INSERT INTO SomeTable (Col1, Col2) VALUES (?, ?)
-> 52 seconds
2. Batches of single insert statements to an unlogged table, then copy:
CREATE UNLOGGED TABLE SomeUnloggedTable AS SELECT * FROM SomeTable WHERE 1 = 0
Batch of: INSERT INTO SomeUnloggedTable (Col1, Col2) VALUES (?, ?)
INSERT INTO SomeTable SELECT * FROM SomeUnloggedTable
DROP TABLE SomeUnloggedTable
-> 43 seconds (15 seconds in the copy from unlogged to target table)
3. Batches of 100 multi-insert (followed by batches of single inserts for remainder):
Batch of: INSERT INTO SomeTable (Col1, Col2) VALUES (?, ?), (?, ?), (?, ?)...
-> 30 seconds
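The case-3 rewrite above can be sketched in plain Java. This is a minimal sketch, not the benchmark code itself: the table and column names are the placeholders from the cases above, and the helper only builds the multi-row statement text that would then be prepared and bound 100 rows at a time.

```java
// Sketch of case 3: one INSERT statement carrying many value groups.
// SomeTable/Col1/Col2 are the placeholder names used in the benchmark above.
import java.util.StringJoiner;

public class MultiInsertSketch {
    /** Builds "INSERT INTO SomeTable (Col1, Col2) VALUES (?, ?), (?, ?), ..." for rowCount rows. */
    static String buildMultiInsert(int rowCount) {
        StringJoiner groups = new StringJoiner(", ");
        for (int i = 0; i < rowCount; i++) {
            groups.add("(?, ?)");
        }
        return "INSERT INTO SomeTable (Col1, Col2) VALUES " + groups;
    }

    public static void main(String[] args) {
        // A chunk of 100 rows; the remainder would use the single-row form.
        System.out.println(buildMultiInsert(3));
    }
}
```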
So, rewriting the batch of single insert statements to batches of 100 multi-insert statements almost doubled the speed.
Could someone educate me as to what happens internally to the JDBC single-insert batch that makes it so different from a multi-insert batch?
Rewriting our code to use multi-value inserts is more cumbersome and not applicable everywhere. It also makes the code database-specific, since not all the engines we support handle this syntax.
These considerations actually lead me to this old thread:
https://www.postgresql.org/message-id/55130DC8.2070508%40redhat.com
Anyway, these tests definitely show that the performance of the standard JDBC batch API could potentially be improved in Postgres internally with a bit of magic :)
Isn't it possible (either in the driver or in the engine) to aggregate the single-insert statements of a batch to turn them into a multi-insert equivalent? It would not have to handle all cases: simply handling basic commands like "INSERT INTO X (a, b, c, ...) VALUES (?, ?, ?, ...)" would help a lot.
If this should not be done in the JDBC driver, then in which area could such handling be done?
Or have I missed something?
If needed, I am ready to offer my assistance with testing (I can tweak parameters, install dev drivers, modify my code to run alternate statements, experiment, etc.).
Re: Batches of single-insert statements vs batches of multi-insert statements
>Or have I missed something?
Re: Batches of single-insert statements vs batches of multi-insert statements
> Believe it or not, pgjdbc already has that feature implemented.

Next exception: "VALUES lists must all be the same length".
Funny enough, if I change the name of the table, it seems to work. I used the naming convention that we have in our production code.
Attachments
Re: Batches of single-insert statements vs batches of multi-insert statements
Vladimir Sitnikov wrote on 05.06.2016 at 20:33:
> The feature is disabled by default (so everybody can give it a try before we make it default), and we plan to activate it by default.
> To activate multi-inserts, you should add the reWriteBatchedInserts=true connection parameter.
> You need the 9.4.1209-SNAPSHOT version (you can grab it from Maven Central or build from sources).

Is there a reason why that build is not available on https://jdbc.postgresql.org/download.html ?
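For reference, the connection parameter named in the quoted message can be supplied either in the JDBC URL or via Properties. A minimal sketch; the host, database, and credentials are placeholders, only reWriteBatchedInserts comes from the message above:

```java
// Sketch: enabling pgjdbc's batch rewrite via URL or Properties.
// reWriteBatchedInserts is the parameter named above; everything else is a placeholder.
import java.util.Properties;

public class RewriteParamSketch {
    /** Appends reWriteBatchedInserts=true, respecting an existing query string. */
    static String withRewrite(String jdbcUrl) {
        String sep = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + sep + "reWriteBatchedInserts=true";
    }

    public static void main(String[] args) {
        System.out.println(withRewrite("jdbc:postgresql://localhost:5432/mydb"));

        // Equivalent Properties form, e.g. for DriverManager.getConnection(url, props):
        Properties props = new Properties();
        props.setProperty("user", "app"); // placeholder credentials
        props.setProperty("reWriteBatchedInserts", "true");
    }
}
```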
--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc
PrepareStatement.execute() blocked because of long time 'create index' operation, connection leak
Hi,
I used JDBC to create an index on a table; the 'create index' SQL took about 30 minutes. Eventually I found that the JDBC client thread was blocked at PreparedStatement.execute() and never finished. On the server side, querying pg_stat_activity showed the connection state was already 'idle' (the index creation had finished).
Is there any suggestion? How should I deal with the client thread blocking (stuck reading a response message?) and the connection leak?
Thanks a lot.
Re: PrepareStatement.execute() blocked because of long time 'create index' operation, connection leak
Dave>You may have to execute the execute() in a background thread to not block the main thread
Vladimir,
Thanks :) I think there is no firewall in our test environments. However, there is network control software on the client machine, and it occasionally drops the connection.
My question is, if the network connection is lost, should the JDBC client return an IOException immediately, or block forever until the TCP connection is killed by the operating system?
On 7 June 2016, at 19:37, Vladimir Sitnikov <sitnikov.vladimir@gmail.com> wrote:

Jingzhi> I use jdbc to create index on a table, the sql of 'create index' cost about 30 minutes.
Jingzhi> Finally I found, the jdbc client thread blocked at PreparedStatement.execute(), not finished.
Jingzhi> At server side, query pg_stat_activity, the connection state already be 'idle' (index creation finished).

Jingzhi, do you have a firewall between the app and the database? Can it terminate the connection while the index is being built?
Dave> You may have to execute the execute() in a background thread to not block the main thread

Dave, I'm afraid it looks like we need keep-alive messages for such long-running transactions. There is tcpKeepAlive, however it has no way to set a specific timeout value; it could help to detect a "broken connection" on the Java side, though. It boils down to a plug-and-pray kind of solution, so everybody should ensure there is no firewall that could kill long-lasting TCP connections.
What if we could teach pgjdbc to send a dummy command once in a while? The drawback is that it could fill up the buffer on the backend side, however sending a sync message once every 10-15 minutes sounds quite innocent.
Vladimir
> My question is, if the network connection is lost, should the JDBC client return an IOException immediately, or block forever until the TCP connection is killed by the operating system?

Well, the problem is we don't know that the connection has failed until the TCP connection has been killed by the O/S. Vladimir is proposing keep-alive messages so that either it won't fail, or we will know about it sooner if it does.
Re: PrepareStatement.execute() blocked because of long time 'create index' operation, connection leak
Hi.
But do we already use SO_KEEPALIVE on the socket layer?
Also, I am not really sure the socket was closed on the server end. The thread starter said he could see the server backend in the idle state. And the backend server did not fail, so unless there was some network glitch we should have received a TCP reset.
BTW: are there any stateful firewalls or NATs between server and client?
Best regards, Vitalii Tymchyshyn
Re: PrepareStatement.execute() blocked because of long time 'create index' operation, connection leak
On 7 June 2016, at 19:21, Dave Cramer <pg@fastcrypt.com> wrote:
There is a cancel statement, but I presume you want the create index to succeed. You may have to execute the execute() in a background thread to not block the main thread.
Re: PrepareStatement.execute() blocked because of long time 'create index' operation, connection leak
On 7 June 2016, at 20:31, Sehrope Sarkuni <sehrope@jackdb.com> wrote:

> Well, the problem is we don't know that the connection has failed until the TCP connection has been killed by the O/S. Vladimir is proposing keep-alive messages so that either it won't fail or we will know about it sooner if it does.

I don't think it's a good idea to flood the server with dummy messages for the purpose of keeping the connection alive. Besides the usual problem of filling up the send buffer, you'd also have to do it in a separate thread, as the usual sender is already blocked waiting for the command response.
This is an OS issue, not a driver issue. The correct thing to do is to enable TCP keep-alives at the OS level. Here is some info for doing so on Linux: http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html
The defaults for TCP keep-alives are very high (I think two hours before the first probe), so it's a good idea to reduce them if you're dealing with long-running WAN connections. Also, from personal experience, it's usually residential networking devices (i.e. your WiFi router) that drop inactive NATed connections; this is less of an issue in a server-to-server environment in a data center. Either way, tuning the TCP keep-alives works in both situations.
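The driver-side counterpart of the OS-level advice above can be sketched as follows. Assumptions are hedged: tcpKeepAlive is the pgjdbc connection property mentioned earlier in the thread, and the raw-socket demo only shows that SO_KEEPALIVE is a per-socket switch while the probe timing itself comes from OS settings (e.g. net.ipv4.tcp_keepalive_time on Linux).

```java
// Sketch: SO_KEEPALIVE is a socket option (pgjdbc exposes it as the
// tcpKeepAlive connection property); probe timing is configured in the OS.
import java.net.Socket;
import java.net.SocketException;
import java.util.Properties;

public class KeepAliveSketch {
    /** Properties asking pgjdbc to set SO_KEEPALIVE on its connection socket. */
    static Properties pgjdbcKeepAliveProps() {
        Properties props = new Properties();
        props.setProperty("tcpKeepAlive", "true");
        return props;
    }

    public static void main(String[] args) throws SocketException {
        // The same option on a raw socket, set before connecting:
        Socket s = new Socket();
        s.setKeepAlive(true);
        System.out.println(s.getKeepAlive());
    }
}
```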
Re: Batches of single-insert statements vs batches of multi-insert statements
I suggest you avoid hard-coding data values in the statement SQL. Instead, use parameter placeholders (?) and use the API to bind the values. You will find it works much better for you. Like this:
String insert = "INSERT INTO T_AbCd (T_AbCd_ID, SomeDate, ASmallInt) VALUES (?, ?, ?)";
PreparedStatement pst = conn.prepareStatement(insert, ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
pst.setLong(1, 24202712L);
pst.setDate(2, java.sql.Date.valueOf("2014-03-21")); // avoids the deprecated Date(int, int, int) constructor
pst.setInt(3, 0);
pst.addBatch();
pst.setLong(1, 24202713L);
pst.setDate(2, java.sql.Date.valueOf("2014-03-22"));
pst.setInt(3, 0);
pst.addBatch();
pst.executeBatch();
The reason is that the batched re-write feature dynamically builds the SQL issued to the back end, using parameter placeholders. So you can see that hard-coded data values aren't going to work without some horribly complex statement parsing added to the driver: this time parsing data and recognizing all data values and type permutations. Yikes.
I'd say you have these options if you want to enable batched re-writing:
a) Change your code to use the same style as the example above, re-deploying to production.
b) If a re-deploy to production isn't possible in the short term, try using Byteman [1]. Byteman will unobtrusively mend your existing code. See the attached Byteman rule; it fixes your test class.
c) Provide a patch to implement SQL statement parsing support for hard-coded data. I see you have already started to try this [2].
Regards,
Jeremy
[1] http://byteman.jboss.org/
[2] https://github.com/pgjdbc/pgjdbc/pull/580
On 05/06/16 21:39, Christopher Deckers wrote:
Hi Vladimir,
Thanks for your answer! It explains the current situation well.

> Believe it or not, pgjdbc already has that feature implemented.

Oh good! But it definitely needs more testing and fixing :) I got: "Batch entry 0 INSERT INTO XXXX". Next exception: "VALUES lists must all be the same length". I narrowed the issue down to a simple test case, see the attached file.
Funny enough, if I change the name of the table, it seems to work. I used the naming convention that we have in our production code. Note that this test case does not produce any error if I remove the "reWriteBatchedInserts=true" parameter.
Please let me know if I can be of any help!
-Christopher
--
Jeremy Whiting
Senior Software Engineer, Middleware Performance Team
Red Hat
------------------------------------------------------------
Registered Address: Red Hat UK Ltd, 64 Baker Street, 4th Floor, London. W1U 7DF. United Kingdom.
Registered in England and Wales under Company Registration No. 03798903.
Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Eric Shander (US)
Attachments
Re: Batches of single-insert statements vs batches of multi-insert statements
I suggest you avoid trying to hard coding data values in the statement sql.
Note that, from a theoretical point of view, if I were to add parameters for fixed values, then I would send that parameter over the wire that many times, whereas if it is part of the statement the driver can decide to send only the dynamic parameters. Also, we do have external multi-insert logic for that specific part of the code, and it also improves speed with jTDS; but SQL Server is limited to 2000 dynamic parameters, so if we waste those parameters on static values, fewer rows can go through in each batch.
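The parameter-budget arithmetic in the paragraph above can be made concrete with a small sketch. The 2000-parameter figure is the SQL Server limit cited above; the per-row counts are illustrative:

```java
// Sketch of the parameter-budget arithmetic: with a fixed per-statement
// parameter limit, every placeholder spent on a static value reduces how
// many rows fit in one rewritten multi-insert statement.
public class ParamBudgetSketch {
    /** Rows per statement given a parameter limit and placeholders per row. */
    static int rowsPerStatement(int paramLimit, int paramsPerRow) {
        return paramLimit / paramsPerRow;
    }

    public static void main(String[] args) {
        int limit = 2000; // SQL Server limit cited above
        System.out.println(rowsPerStatement(limit, 2)); // two dynamic values per row
        System.out.println(rowsPerStatement(limit, 4)); // two dynamic plus two now-parameterized static values
    }
}
```

So parameterizing two previously static values per row would halve the rows each batch statement can carry.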
c) Provide a patch to add implement sql statement parsing support for hard coded data. I see you have already started to try this [2].