Re: Excessive number of replication slots for 12->14 logical replication
From: Masahiko Sawada
Subject: Re: Excessive number of replication slots for 12->14 logical replication
Msg-id: CAD21AoAw0Oofi4kiDpJBOwpYyBBBkJj=sLUOn4Gd2GjUAKG-fw@mail.gmail.com
In reply to: Re: Excessive number of replication slots for 12->14 logical replication (Amit Kapila <amit.kapila16@gmail.com>)
Responses: RE: Excessive number of replication slots for 12->14 logical replication
           Re: Excessive number of replication slots for 12->14 logical replication
List: pgsql-bugs
On Tue, Aug 30, 2022 at 3:44 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 26, 2022 at 7:04 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > Thanks for the testing. I'll push this sometime early next week (by
> > Tuesday) unless Sawada-San or someone else has any comments on it.
> >
>
> Pushed.

Tom reported buildfarm failures[1], and I've investigated the cause and concluded that this commit is relevant. In process_syncing_tables_for_sync(), we have the following code:

    UpdateSubscriptionRelState(MyLogicalRepWorker->subid,
                               MyLogicalRepWorker->relid,
                               MyLogicalRepWorker->relstate,
                               MyLogicalRepWorker->relstate_lsn);

    ReplicationOriginNameForTablesync(MyLogicalRepWorker->subid,
                                      MyLogicalRepWorker->relid,
                                      originname,
                                      sizeof(originname));
    replorigin_session_reset();
    replorigin_session_origin = InvalidRepOriginId;
    replorigin_session_origin_lsn = InvalidXLogRecPtr;
    replorigin_session_origin_timestamp = 0;

    /*
     * We expect that origin must be present. The concurrent operations
     * that remove origin like a refresh for the subscription take an
     * access exclusive lock on pg_subscription which prevent the previous
     * operation to update the rel state to SUBREL_STATE_SYNCDONE to
     * succeed.
     */
    replorigin_drop_by_name(originname, false, false);

    /*
     * End streaming so that LogRepWorkerWalRcvConn can be used to drop
     * the slot.
     */
    walrcv_endstreaming(LogRepWorkerWalRcvConn, &tli);

    /*
     * Cleanup the tablesync slot.
     *
     * This has to be done after the data changes because otherwise if
     * there is an error while doing the database operations we won't be
     * able to rollback dropped slot.
     */
    ReplicationSlotNameForTablesync(MyLogicalRepWorker->subid,
                                    MyLogicalRepWorker->relid,
                                    syncslotname,
                                    sizeof(syncslotname));

If the tablesync worker errored out at walrcv_endstreaming(), we assumed that both dropping the replication origin and updating the relstate would be rolled back, which however was wrong. Indeed, the replication origin is not dropped, but its in-memory state is reset. Therefore, after the tablesync worker restarts, it starts logical replication from starting point 0/0. Consequently, it ends up re-applying transactions that have already been applied.

Regards,

[1] https://www.postgresql.org/message-id/115136.1662733870%40sss.pgh.pa.us

--
Masahiko Sawada