RE: Synchronizing slots from primary to standby
From | Zhijie Hou (Fujitsu) |
---|---|
Subject | RE: Synchronizing slots from primary to standby |
Date | |
Msg-id | OS0PR01MB57169DD55EC8D9D1EDB7A0C2946A2@OS0PR01MB5716.jpnprd01.prod.outlook.com |
In reply to | Re: Synchronizing slots from primary to standby (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses | Re: Synchronizing slots from primary to standby |
List | pgsql-hackers |
On Monday, January 8, 2024 2:10 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> On Fri, Jan 5, 2024 at 5:45 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jan 5, 2024 at 4:25 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
> > >
> > > On Fri, Jan 5, 2024 at 8:59 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > I was going through the patch set again, I have a question. The below
> > > comments say that we keep the failover option as PENDING until we
> > > have done the initial table sync which seems fine. But what happens
> > > if we add a new table to the publication and refresh the
> > > subscription? In such a case does this go back to the PENDING state or
> > > something else?
> > >
> >
> > At this stage, such an operation is prohibited. Users need to disable
> > the failover option first, then perform the above operation, and after
> > that failover option can be re-enabled.
>
> Okay, that makes sense to me.

During the off-list discussion, Sawada-san proposed an idea that can lift the
restriction on table sync: instead of relying on the latest WAL position, we
can use the remote restart_lsn to reserve the WAL when creating a new synced
slot on the standby. This approach eliminates the need to wait for the primary
server to catch up, and so speeds up synced slot creation on the standby in
most scenarios.

With this approach, the limitation that prevents users from performing table
sync while failover is enabled can be removed. In previous versions, this
restriction existed because table sync slots were often incompletely
synchronized to the standby (the slots on the primary could not catch up with
the synced slot). With this approach, the table sync slots can be efficiently
synced to the standby in most cases.

However, there could still be rare cases where the WAL around the remote
restart_lsn has already been removed on the standby. In this case we will try
to reserve the last remaining WAL and mark the slot as temporary; these
temporary slots will be converted to persistent once the remote restart_lsn
catches up.

We think this idea is promising, and here is the V58 patch set which tries to
implement it. The summary of changes for each patch is as follows:

V58-0001
1) Enables failover for table sync slots.
2) Removes the restriction on table sync when failover is enabled.
3) Removes tristate handling for the failover state.
4) Renames failoverstate to failover.
5) Addresses Peter's comments[1].

V58-0002
1) Adds documentation about how to resume logical replication after failover.
2) Doesn't sync temporary slots from the primary server anymore.
3) Fixes one spinlock miss.
4) Fixes one CFbot warning.
5) Fixes a bug where last_update_time is not initialized.
6) Reserves WAL based on the remote restart_lsn.
7) Improves and adjusts the tests.
8) Removes the separate function wait_for_primary_slot_catchup() and
   integrates its logic of marking the slot as ready into the main loop.
9) Removes the 'i' state of sync_state. The slots that need to wait for the
   primary to catch up will be marked as TEMPORARY, and they will be converted
   to PERSISTENT once the remote restart_lsn catches up.

Thanks Shveta for working on 1) to 4).

V58-0003
Rebases the tests.

V58-0004
Addresses Bertrand's comments[2]. Thanks Shveta for working on this.

TODO: Add documentation to guide users on how to identify whether the table
sync slot and the main slot are READY, so that logical replication can be
resumed by subscribing to the new primary.

[1] https://www.postgresql.org/message-id/CAHut%2BPvbbPz1%3DT4bzY0_GotUK460Eih41Twjt%3DczJ1z2J8SGEw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/ZZa4pLFCe2mAks1m%40ip-10-97-1-34.eu-west-3.compute.internal

Best Regards,
Hou zj
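
The WAL reservation rule described in the message above can be illustrated with a small model. This is only a sketch under stated assumptions, not the patch's code (the patch itself is C inside PostgreSQL): LSNs are modelled as plain integers, and the names `SyncedSlot`, `create_synced_slot`, `sync_slot` and `oldest_local_wal_lsn` are hypothetical. It also assumes that "the remote restart_lsn catches up" means the remote restart_lsn has reached the position that was reserved locally.

```python
from dataclasses import dataclass
from enum import Enum


class Persistency(Enum):
    TEMPORARY = "temporary"
    PERSISTENT = "persistent"


@dataclass
class SyncedSlot:
    # Hypothetical model of a synced slot on the standby.
    name: str
    restart_lsn: int  # LSN modelled as an integer for illustration
    persistency: Persistency


def create_synced_slot(name, remote_restart_lsn, oldest_local_wal_lsn):
    """Create a synced slot, reserving WAL at the remote restart_lsn if possible.

    oldest_local_wal_lsn is an assumed stand-in for the oldest WAL position
    still available on the standby.
    """
    if remote_restart_lsn >= oldest_local_wal_lsn:
        # Common case: WAL around the remote restart_lsn still exists locally.
        return SyncedSlot(name, remote_restart_lsn, Persistency.PERSISTENT)
    # Rare case: that WAL is gone, so reserve the last remaining WAL and
    # keep the slot temporary until the primary's slot catches up.
    return SyncedSlot(name, oldest_local_wal_lsn, Persistency.TEMPORARY)


def sync_slot(slot, remote_restart_lsn):
    """One sync cycle: promote a temporary slot, or advance a persistent one."""
    if slot.persistency is Persistency.TEMPORARY:
        if remote_restart_lsn >= slot.restart_lsn:
            slot.restart_lsn = remote_restart_lsn
            slot.persistency = Persistency.PERSISTENT
    elif remote_restart_lsn > slot.restart_lsn:
        slot.restart_lsn = remote_restart_lsn
    return slot
```

In this model a slot created while the needed WAL is still present is persistent immediately, which corresponds to the "no need to wait for the primary to catch up" case; only the rare fallback path goes through the temporary state.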