Re: Improve pg_sync_replication_slots() to wait for primary to advance
От | Amit Kapila |
---|---|
Тема | Re: Improve pg_sync_replication_slots() to wait for primary to advance |
Дата | |
Msg-id | CAA4eK1JHnqdOaQ-Y4ALJQgeLyoJvg1XPvgL3hss+WOpwChbJmw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Improve pg_sync_replication_slots() to wait for primary to advance (shveta malik <shveta.malik@gmail.com>) |
Ответы |
Re: Improve pg_sync_replication_slots() to wait for primary to advance
|
Список | pgsql-hackers |
On Fri, Aug 1, 2025 at 2:50 PM shveta malik <shveta.malik@gmail.com> wrote: > > 5) > I tried a test where there were 4 slots on the publisher, where one > was getting used while the others were not. Initiated > pg_sync_replication_slots on standby. Forced unused slots to be > invalidated by setting idle_replication_slot_timeout=60 on primary, > due to which API finished but gave a warning: > > postgres=# SELECT pg_sync_replication_slots(); > WARNING: aborting initial sync for slot "failover_slot" > DETAIL: This slot was invalidated on the primary server. > WARNING: aborting initial sync for slot "failover_slot2" > DETAIL: This slot was invalidated on the primary server. > WARNING: aborting initial sync for slot "failover_slot3" > DETAIL: This slot was invalidated on the primary server. > pg_sync_replication_slots > --------------------------- > > (1 row) > > Do we need these warnings here? I think we can have it as a LOG rather > than having it on console. Thoughts? > What is the behaviour of a slotsync worker in the same case? I don't see any such WARNING messages in the code of slotsync worker, so why do we want a different behaviour here? Few other comments: ====================== 1. update_and_persist_local_synced_slot() { ... + /* + * For SQL API synchronization, we wait for the remote slot to catch up + * here, since we can't assume the SQL API will be called again soon. + * We will retry the sync once the slot catches up. + * + * Note: This will return false if a promotion is triggered on the + * standby while waiting, in which case we stop syncing and drop the + * temporary slot. + */ + if (!wait_for_primary_slot_catchup(wrconn, remote_slot)) + return false; Why is the wait added at this level? Shouldn't it be at API level aka in SyncReplicationSlots() or pg_sync_replication_slots() similar to what we do in ReplSlotSyncWorkerMain() for slotsync workers? 2. REPLICATION_SLOTSYNC_MAIN "Waiting in main loop of slot sync worker." ... +REPLICATION_SLOTSYNC_PRIMARY_CATCHUP "Waiting for the primary to catch-up." Can't we reuse existing waitevent REPLICATION_SLOTSYNC_MAIN? We may want to change the description. Is there a specific reason to add a new wait_event for this API? -- With Regards, Amit Kapila.
В списке pgsql-hackers по дате отправления: