Discussion: Batches of single-insert statements vs batches of multi-insert statements


Batches of single-insert statements vs batches of multi-insert statements

From: Christopher Deckers
Date:
Hi all,

I tried all sorts of query rewrites to speed up a huge sequence of insert statements, and here are my findings. Everything was run on a local Postgres 9.5.3, on a 64-bit Windows box with 32 GB RAM and an i7 @ 3.6 GHz.

1. Batches of single insert statements (original case):
Batch of: INSERT INTO SomeTable (Col1, Col2) VALUES (?, ?)
-> 52 seconds

2. Batches of single insert statements to an unlogged table, then copy:
CREATE UNLOGGED TABLE SomeUnloggedTable AS SELECT * FROM SomeTable WHERE 1 = 0
Batch of: INSERT INTO SomeUnloggedTable (Col1, Col2) VALUES (?, ?)
INSERT INTO SomeTable SELECT * FROM SomeUnloggedTable
DROP TABLE SomeUnloggedTable
-> 43 seconds (15 seconds in the copy from unlogged to target table)

3. Batches of 100-row multi-insert statements (followed by batches of single inserts for the remainder):
Batch of: INSERT INTO SomeTable (Col1, Col2) VALUES (?, ?), (?, ?), (?, ?)...
-> 30 seconds

So, rewriting the batch of single insert statements to batches of 100 multi-insert statements almost doubled the speed.
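For reference, here is a minimal sketch of what cases 1 and 3 look like on our side (the connection URL, credentials, table and row counts below are placeholders, not our actual setup):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class BatchInsertComparison {
    static final int ROWS = 100_000;      // placeholder row count
    static final int ROWS_PER_STMT = 100; // case 3: rows per multi-insert

    public static void main(String[] args) throws SQLException {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/test", "user", "secret")) {
            // Case 1: one row per statement, sent as a JDBC batch.
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO SomeTable (Col1, Col2) VALUES (?, ?)")) {
                for (int i = 0; i < ROWS; i++) {
                    ps.setInt(1, i);
                    ps.setInt(2, i);
                    ps.addBatch();
                }
                ps.executeBatch();
            }
            // Case 3: 100 rows per statement; a remainder of
            // ROWS % ROWS_PER_STMT rows would fall back to case 1.
            StringBuilder sql = new StringBuilder(
                    "INSERT INTO SomeTable (Col1, Col2) VALUES (?, ?)");
            for (int i = 1; i < ROWS_PER_STMT; i++) {
                sql.append(", (?, ?)");
            }
            try (PreparedStatement ps = conn.prepareStatement(sql.toString())) {
                for (int row = 0; row + ROWS_PER_STMT <= ROWS; ) {
                    for (int p = 1; p <= 2 * ROWS_PER_STMT; p += 2, row++) {
                        ps.setInt(p, row);
                        ps.setInt(p + 1, row);
                    }
                    ps.addBatch();
                }
                ps.executeBatch();
            }
        }
    }
}

The only structural difference is the width of the VALUES list; everything else (prepared statement, addBatch/executeBatch) stays the same.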

Could someone educate me as to what happens internally to the JDBC single-insert batch that makes it so different from a multi-insert batch?

Rewriting our code to use multi-value inserts is more cumbersome and not applicable everywhere. It also makes the code database-specific, since not all of the engines we support handle this syntax.
These considerations actually led me to this old thread:
https://www.postgresql.org/message-id/55130DC8.2070508%40redhat.com

Anyway, these tests definitely show that the performance of the standard JDBC batch API could be improved in Postgres internally with a bit of magic :)
Isn't it possible (either in the driver or in the engine) to aggregate the single-insert statements of a batch to turn them into a multi-insert equivalent? It would not have to handle all cases: simply handling basic commands like "INSERT INTO X (a, b, c, ...) VALUES (?, ?, ?, ...)" would help a lot.
If this should not be done in the JDBC driver, then in which area could such handling be done?
Or have I missed something?

If needed, I am ready to offer my assistance with testing (I can tweak parameters, install dev drivers, modify my code to run alternate statements, experiment, etc.).

Cheers,
-Christopher

Re: Batches of single-insert statements vs batches of multi-insert statements

From: Vladimir Sitnikov
Date:
Hi Christopher,

>Could someone educate me as to what happens internally to the JDBC single-insert batch that makes it so different from a multi-insert batch?

There is significant overhead at the database level itself to initialize and clean up the resources required for an "insert values" statement.
This is a known issue for backend developers; however, it does not seem to have an easy solution.

>If this should not be done in the JDBC driver, then in which area could such handling be done?
>Or have I missed something?

Believe it or not, pgjdbc already has that feature implemented.

The feature is disabled by default (so everybody can give it a try before we make it the default), and we plan to activate it by default.
To activate multi-inserts, you should add the reWriteBatchedInserts=true connection parameter.
You need the 9.4.1209-SNAPSHOT version (you can grab it from Maven Central or build from sources).
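For example, something like this should enable it (host, database and credentials are placeholders):

// Only the reWriteBatchedInserts parameter matters here.
String url = "jdbc:postgresql://localhost:5432/test?reWriteBatchedInserts=true";
Connection conn = DriverManager.getConnection(url, "user", "secret");
// Batches of single-row INSERTs on this connection are then
// rewritten by the driver into multi-row INSERTs.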

We all would be grateful if you could test it and let us know if it works for you.

Vladimir 

Re: Batches of single-insert statements vs batches of multi-insert statements

From: Christopher Deckers
Date:
Hi Vladimir,

Thanks for your answer! It explains the current situation well.

> Believe it or not, pgjdbc already has that feature implemented.

Oh good! But it definitely needs more testing and fixing :)
I got: "Batch entry 0 INSERT INTO XXXX".
Next exception: "VALUES lists must all be the same length".

I narrowed down the issue to a simple test case, see attached file.

Funnily enough, if I change the name of the table, it seems to work. I used the naming convention that we have in our production code.
Note that this test case does not produce any error if I remove the "reWriteBatchedInserts=true" parameter.

Please let me know if I can be of any help!
-Christopher

Attachments

Re: Batches of single-insert statements vs batches of multi-insert statements

From: Thomas Kellerer
Date:
Vladimir Sitnikov wrote on 05.06.2016 at 20:33:
> The feature is disabled by default (so everybody can give it a try before we make it the default), and we plan to activate it by default.
> To activate multi-inserts, you should add the reWriteBatchedInserts=true connection parameter.
> You need the 9.4.1209-SNAPSHOT version (you can grab it from Maven Central or build from sources).

Is there a reason why that build is not available on https://jdbc.postgresql.org/download.html ?




Re: Batches of single-insert statements vs batches of multi-insert statements

From: Dave Cramer
Date:
It's a snapshot, not a release.


On 5 June 2016 at 16:48, Thomas Kellerer <spam_eater@gmx.net> wrote:
> Is there a reason why that build is not available on https://jdbc.postgresql.org/download.html ?

PrepareStatement.execute() blocked because of long time 'create index' operation, connection leak

From: "jingzhi.zhang@outlook.com"
Date:
Hi,


I use JDBC to create an index on a table; the 'create index' SQL takes about 30 minutes.
Eventually I found that the JDBC client thread is blocked at PreparedStatement.execute() and never finishes.
On the server side, querying pg_stat_activity shows the connection state is already 'idle' (index creation finished).

Any suggestions? How should we deal with the client thread blocking (waiting to read a response message?) and the connection leak?


Thanks a lot.


Re: PrepareStatement.execute() blocked because of long time 'create index' operation, connection leak

From: Dave Cramer
Date:
There is a cancel statement, but I presume you want the create index to succeed. You may have to run execute() in a background thread so as not to block the main thread.
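A minimal sketch of that approach (the connection details and the index DDL are placeholders):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BackgroundIndexBuild {
    public static void main(String[] args) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<?> indexBuild = executor.submit(() -> {
            try (Connection conn = DriverManager.getConnection(
                     "jdbc:postgresql://localhost/test", "user", "secret");
                 Statement st = conn.createStatement()) {
                // The long-running DDL runs here without blocking the caller.
                st.execute("CREATE INDEX idx_some_col ON some_table (some_col)");
                return null;
            }
        });
        // The main thread is free to do other work; get() rethrows any
        // SQLException (wrapped in an ExecutionException) when the DDL ends.
        indexBuild.get();
        executor.shutdown();
    }
}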


Re: PrepareStatement.execute() blocked because of long time 'create index' operation, connection leak

From: Vladimir Sitnikov
Date:
Jingzhi> I use JDBC to create an index on a table; the 'create index' SQL takes about 30 minutes.
Jingzhi> Eventually I found that the JDBC client thread is blocked at PreparedStatement.execute() and never finishes.
Jingzhi> On the server side, querying pg_stat_activity shows the connection state is already 'idle' (index creation finished).

Jingzhi, do you have a firewall between the app and the database?
Could it terminate the connection while the index is being built?

Dave> You may have to run execute() in a background thread so as not to block the main thread.

Dave, I'm afraid it looks like we need keep-alive messages for such long-running transactions.
There's tcpKeepAlive; however, it has no way to set a specific timeout value.
It could help to detect a "broken connection" on the Java side, though.

It boils down to a plug-and-pray kind of solution, so everybody should make sure there's no firewall that could kill long-lasting TCP connections.

What if we could teach pgjdbc to send a dummy command once in a while?
The drawback is that it could fill up the buffer on the backend side; however, sending a sync message once every 10-15 minutes sounds quite innocent.

Vladimir



Re: PrepareStatement.execute() blocked because of long time 'create index' operation, connection leak

From: Dave Cramer
Date:
On 7 June 2016 at 08:01, jingzhi.zhang@outlook.com <jingzhi.zhang@outlook.com> wrote:
> I think there's no firewall in our test environments. However, there is network control software on the client machine.
> The network control software occasionally loses the connection.
>
> My question is: if the network connection is lost, should the JDBC client return an IOException immediately, or block forever until the TCP connection is killed by the operating system?

Well, the problem is that we don't know the connection has failed until the TCP connection has been killed by the OS.

Vladimir is proposing keep-alive messages so that either it won't fail, or we will know about it sooner if it does.

Re: PrepareStatement.execute() blocked because of long time 'create index' operation, connection leak

From: Sehrope Sarkuni
Date:
On Tue, Jun 7, 2016 at 8:17 AM, Dave Cramer <pg@fastcrypt.com> wrote:
> Well, the problem is that we don't know the connection has failed until the TCP connection has been killed by the OS.
> Vladimir is proposing keep-alive messages so that either it won't fail, or we will know about it sooner if it does.

I don't think it's a good idea to flood the server with dummy messages for the purpose of keeping the connection alive. Besides the usual problems of filling up the send buffer, you'd also have to do it in a separate thread, as the usual sender is already blocked waiting for the command response.

This is an OS issue, not a driver issue. The correct thing to do is to enable TCP keep-alives at the OS level. Here is some info on doing so on Linux: http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html

The defaults for TCP keep-alives are very high (I think two hours before the first attempt), so it's a good idea to reduce them if you're dealing with long-running WAN connections. Also, from personal experience, it's usually residential networking devices (i.e. your WiFi router) that drop inactive NATed connections. This is less of an issue in a server-to-server environment in a data center. Either way, tuning the TCP keep-alives works in both situations.
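For example, reducing them on Linux would look something like this (illustrative values; run as root, and see the HOWTO above for the details):

# Start probing after 5 minutes of idle instead of 2 hours,
# probe every 60 seconds, and give up after 5 failed probes.
sysctl -w net.ipv4.tcp_keepalive_time=300
sysctl -w net.ipv4.tcp_keepalive_intvl=60
sysctl -w net.ipv4.tcp_keepalive_probes=5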

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/

Re: PrepareStatement.execute() blocked because of long time 'create index' operation, connection leak

From: Vitalii Tymchyshyn
Date:

Hi.

But do we already use SO_KEEPALIVE at the socket layer?
Also, I am not really sure the socket was closed on the server end. The OP said he could see the server backend in the idle state. Also, the backend server did not fail, so unless there was some network glitch we should have received a TCP reset.
BTW: are there any stateful firewalls or NATs between server and client?

Best regards, Vitalii Tymchyshyn

On Tue, 7 Jun 2016 at 08:17, Dave Cramer <pg@fastcrypt.com> wrote:
> Well, the problem is that we don't know the connection has failed until the TCP connection has been killed by the OS.
> Vladimir is proposing keep-alive messages so that either it won't fail, or we will know about it sooner if it does.

Re: PrepareStatement.execute() blocked because of long time 'create index' operation, connection leak

From: Vladimir Sitnikov
Date:
Vitalii> But do we already use SO_KEEPALIVE at the socket layer?
Vitalii> Also, I am not really sure the socket was closed on the server end.
Vitalii> The OP said he could see the server backend in the idle state. Also, the backend server did not fail, so unless there was some network glitch we should have received a TCP reset.

The tcpKeepAlive connection setting is false by default. I think it's better to make "true" the default value.
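For now one has to opt in explicitly, e.g. (the URL and credentials are placeholders):

Properties props = new Properties();
props.setProperty("user", "user");
props.setProperty("password", "secret");
props.setProperty("tcpKeepAlive", "true"); // off unless explicitly enabled
Connection conn = DriverManager.getConnection(
        "jdbc:postgresql://localhost/test", props);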

Sehrope> I don't think it's a good idea to flood the server with dummy messages for the purpose of keeping the connection alive.
Sehrope> Besides the usual problems of filling up the send buffer, you'd also have to do it in a separate thread,
Sehrope> as the usual sender is already blocked waiting for the command response.

Sending a sync message once every 10 minutes will hardly fill up the buffers.
On the other hand, it could make the connection more resilient.

Having separate threads (e.g. one for reads and one for writes) would resolve the "deadlock when buffers get full" issue, so it is good to have regardless of whether we add a "ping" there.

Vladimir

Re: PrepareStatement.execute() blocked because of long time 'create index' operation, connection leak

From: "jingzhi.zhang@outlook.com"
Date:
Hi Dave,

Thanks :)
But I need to clarify the question:

Because the 'create index' SQL statement execution (via JDBC) lasted nearly 30 minutes, the execute() method blocked forever.
On the PG server side, about 30 minutes later, 'create index' successfully finished; then the connection state changed to 'idle'.
From the JDBC client side, it seems 'create index' has not finished, but the operation actually finished on the server side long ago.
This causes the client thread to block and the connection to leak.


On 7 June 2016 at 19:21, Dave Cramer <pg@fastcrypt.com> wrote:
> There is a cancel statement, but I presume you want the create index to succeed. You may have to run execute() in a background thread so as not to block the main thread.





Re: PrepareStatement.execute() blocked because of long time 'create index' operation, connection leak

From: "jingzhi.zhang@outlook.com"
Date:
Vladimir,

Thanks :)

I think there's no firewall in our test environments. However, there is network control software on the client machine.
The network control software occasionally loses the connection.

My question is: if the network connection is lost, should the JDBC client return an IOException immediately,
or block forever until the TCP connection is killed by the operating system?



On 7 June 2016 at 19:37, Vladimir Sitnikov <sitnikov.vladimir@gmail.com> wrote:
> Jingzhi, do you have a firewall between the app and the database?
> Could it terminate the connection while the index is being built?




Re: PrepareStatement.execute() blocked because of long time 'create index' operation, connection leak

From: "jingzhi.zhang@outlook.com"
Date:
I learnt a lot. Many thanks.

Questions:
1. How long does it take for a (JDBC) connection to change from the 'alive' to the 'idle' state if there is no data exchange between client and server?
And how do we configure this at the OS level? (Does PG use the OS-level connection state to determine the JDBC connection state?)
2. Are there any solutions on the JDBC client side, such as setting a socket timeout?


On 7 June 2016 at 20:31, Sehrope Sarkuni <sehrope@jackdb.com> wrote:
> This is an OS issue, not a driver issue. The correct thing to do is to enable TCP keep-alives at the OS level.
> The defaults for TCP keep-alives are very high (I think two hours before the first attempt), so it's a good idea to reduce them if you're dealing with long-running WAN connections.

Re: Batches of single-insert statements vs batches of multi-insert statements

From: Jeremy Whiting
Date:
Hi Christopher,
 I suggest you avoid hard-coding data values in the statement SQL. Instead, use parameter placeholders (?) and use the API to bind the values. You will find it works much better for you. Like this:

String insert = "INSERT INTO T_AbCd (T_AbCd_ID, SomeDate, ASmallInt) VALUES (?, ?, ?)";
PreparedStatement pst = conn.prepareStatement(insert,
        ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY);
// Bind the first row and queue it.
pst.setLong(1, 24202712L);
pst.setDate(2, java.sql.Date.valueOf("2014-03-21"));
pst.setInt(3, 0);
pst.addBatch();
// Bind the second row and queue it.
pst.setLong(1, 24202713L);
pst.setDate(2, java.sql.Date.valueOf("2014-03-22"));
pst.setInt(3, 0);
pst.addBatch();
// Send both rows in a single batch.
pst.executeBatch();



 The reason is that the batched re-write feature dynamically builds the SQL issued to the back end, using parameter placeholders. So you can see that hard-coding data values isn't going to work without some horribly complex statement parsing added to the driver: this time parsing data and recognizing all data values and type permutations. Yikes.
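 For example, for a two-row batch of the statement above, the driver would issue something like this (a sketch of the generated SQL):

INSERT INTO T_AbCd (T_AbCd_ID, SomeDate, ASmallInt) VALUES (?, ?, ?), (?, ?, ?)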

 I'd say you have these options if you want to enable batched re-writing:

a) Change your code to use the same style of coding as the above example, re-deploying to production.
b) If a re-deploy to production isn't possible in the short term, try using Byteman [1]. Byteman will unobtrusively mend your existing code. See the attached Byteman rule; it fixes your test class.
c) Provide a patch implementing SQL statement parsing support for hard-coded data. I see you have already started on this [2].

Regards,
Jeremy

[1] http://byteman.jboss.org/
[2] https://github.com/pgjdbc/pgjdbc/pull/580

On 05/06/16 21:39, Christopher Deckers wrote:
> Oh good! But it definitely needs more testing and fixing :)
> I got: "Batch entry 0 INSERT INTO XXXX".
> Next exception: "VALUES lists must all be the same length".
> I narrowed down the issue to a simple test case, see attached file.





-- 
Jeremy Whiting
Senior Software Engineer, Middleware Performance Team
Red Hat

------------------------------------------------------------
Registered Address: Red Hat UK Ltd, 64 Baker Street, 4th Floor, London. W1U 7DF. United Kingdom.
Registered in England and Wales under Company Registration No. 03798903. Directors: Michael Cunningham (US), Michael O'Neill (Ireland), Eric Shander (US)
Attachments

Re: Batches of single-insert statements vs batches of multi-insert statements

From: Christopher Deckers
Date:
Hi Jeremy,

> I suggest you avoid hard-coding data values in the statement SQL.

The resulting SQL is not hand-crafted; it is built through all sorts of layers depending on various rules. The end result that I showed does not tell how it was built.

Note that, from a theoretical point of view, if I were to add parameters for fixed values, then I would send that parameter over the wire many times, whereas if it is part of the statement the driver can decide to send only the dynamic parameters. Also, we do have external multi-insert logic for that specific part of the code, and it also improves speed with jTDS; but SQL Server is limited to 2000 dynamic parameters, so if we waste those parameters on static values, fewer rows can go through at once.

In any case, it must be supported by the batch rewrite feature, whether people like the style or not :)
 
> c) Provide a patch implementing SQL statement parsing support for hard-coded data. I see you have already started on this [2].

In fact, the latest version of my patch hooks into the existing query parsing and just has to record the start and end of the VALUES block. That, combined with the positions of the parameters, made the latest version reliable.
That said, there is currently no safeguard if we send batches that overflow Postgres's limits, which is what I will try to work on when I get some free time.

Cheers,
-Christopher
