Re: BUG #15888: Bogus "idle in transaction" state for logicaldecoding client after creating a slot
От | Andres Freund |
---|---|
Тема | Re: BUG #15888: Bogus "idle in transaction" state for logicaldecoding client after creating a slot |
Дата | |
Msg-id | 20190709201557.y5yt57z2za2hu6l7@alap3.anarazel.de обсуждение исходный текст |
Ответ на | Re: BUG #15888: Bogus "idle in transaction" state for logicaldecoding client after creating a slot (Michael Paquier <michael@paquier.xyz>) |
Ответы |
Re: BUG #15888: Bogus "idle in transaction" state for logicaldecoding client after creating a slot
(Alvaro Herrera <alvherre@2ndquadrant.com>)
|
Список | pgsql-bugs |
Hi, On 2019-07-04 13:35:12 +0900, Michael Paquier wrote: > On Wed, Jul 03, 2019 at 08:03:24AM +0000, PG Bug reporting form wrote: > > Run: > > > > pg_recvlogical -S foo -f /dev/null -d dbname --create-slot --start > > > > and pg_stat_activity will show something like this until that session > > disconnects: > > > > which is quite obviously bogus. > > I think that's the transaction which is started when exporting the > snapshot at logical slot creation. Or in short: > #2 0x000055b5f8e5720f in StartTransactionCommand () at xact.c:2699 > #3 0x000055b5f90fab4e in SnapBuildExportSnapshot > (builder=0x55b5fa3427b0) at snapbuild.c:575 > #4 0x000055b5f90fe5df in CreateReplicationSlot (cmd=0x55b5fa2914e8) > at walsender.c:843 > > SnapBuildExportSnapshot() needs to keep a transaction context opened > to keep the wanted xmin around. In PG >= 10, this problem gets > "solved" by using NOEXPORT_SNAPSHOT in the CREATE_REPLICATION_SLOT > command. One idea would be to enforce a commit in this case from > CreateReplicationSlot/streamutil.c? We know that pg_recvlogical is > not going to use this snapshot anyway... The transaction ought to be finished by exec_replication_command() calling SnapBuildClearExportedSnapshot(), which then does AbortCurrentTransaction(). So something's wrong if that's not working anymore. I just tested this on a slightly older v12 checkout, and the general mechanism works: S1(replication): postgres[13464][1]=# CREATE_REPLICATION_SLOT foo LOGICAL test_decoding; S2(plain): postgres[13317][1]=# SELECT state, wait_event_type, wait_event, backend_xid, backend_xmin FROM pg_stat_activity WHERE pid= 13464; ┌─────────────────────┬─────────────────┬────────────┬─────────────┬──────────────┐ │ state │ wait_event_type │ wait_event │ backend_xid │ backend_xmin │ ├─────────────────────┼─────────────────┼────────────┼─────────────┼──────────────┤ │ idle in transaction │ Client │ ClientRead │ (null) │ 1494339463 │ └─────────────────────┴─────────────────┴────────────┴─────────────┴──────────────┘ (1 row) S1(replication): postgres[13464][1]=# IDENTIFY_SYSTEM; ┌─────────────────────┬──────────┬──────────────┬──────────┐ │ systemid │ timeline │ xlogpos │ dbname │ ├─────────────────────┼──────────┼──────────────┼──────────┤ │ 6704433050907079638 │ 1 │ 3AC/1A25B700 │ postgres │ └─────────────────────┴──────────┴──────────────┴──────────┘ S2(plain): postgres[13317][1]=# SELECT state, wait_event_type, wait_event, backend_xid, backend_xmin FROM pg_stat_activity WHERE pid= 13464; ┌───────┬─────────────────┬────────────┬─────────────┬──────────────┐ │ state │ wait_event_type │ wait_event │ backend_xid │ backend_xmin │ ├───────┼─────────────────┼────────────┼─────────────┼──────────────┤ │ idle │ Client │ ClientRead │ (null) │ (null) │ └───────┴─────────────────┴────────────┴─────────────┴──────────────┘ (1 row) I *can* reproduce with 9.6. It's worthwhile to note that the transaction *is* actually aborted, via the path above. As far as I can tell the problem is "solely" that we never reach a set_ps_display() that resets the 'idle in transaction' bit, because START_REPLICATION "takes over" the connection once started, and we don't go through PostgresMain() again. The problem doesn't exist in later versions, because there exec_replication_command() does an explicit /* Report to pgstat that this process is now idle */ pgstat_report_activity(STATE_IDLE, NULL); That's due to commit d02974e32e028fc078d8f5eca1d6a4516efb0aa6 Author: Magnus Hagander <magnus@hagander.net> Date: 2017-12-29 16:19:51 +0100 Properly set base backup backends to active in pg_stat_activity which is post 9.6. Not sure I really feel a huge appetite for whacking this around in the back branches. Greetings, Andres Freund
В списке pgsql-bugs по дате отправления:
Предыдущее
От: Tom LaneДата:
Сообщение: Re: ALTER TABLE results in "ERROR: could not open relation with OID 43707388"
Следующее
От: Andres FreundДата:
Сообщение: Re: BUG #15899: Valgrind detects errors on create gist index