Re: failover logical replication slots
From | Amit Kapila |
---|---|
Subject | Re: failover logical replication slots |
Date | |
Msg-id | CAA4eK1+wHTNZcODabt53e+1OExc5EoLzdLAWEfbAWPECJVBDFQ@mail.gmail.com |
List | pgsql-hackers |

On Wed, Jun 11, 2025 at 10:17 PM Fabrice Chapuis <fabrice636861@gmail.com> wrote:
>
> Thanks for your reply.
> The problem I see is that after creating a new subscription, we have:
>
> 1) if a failover occurs, on the new primary node, the failover and sync flags are both set to true, so there's no problem.
>
> 2) when the old node returns as a secondary in the cluster, the failover flag is set to true and the sync flag is set to false then
> the error message is generated: ERROR: exiting from slot synchronization because same name slot "sub_test" already exists on the standby
>
> Why not change the value of the synced flag when the standby is joining the cluster ? If the slot on the primary node has the same name as the slot on the secondary node and the failover flag is set to true,
>
> if ((slot = SearchNamedReplicationSlot(remote_slot->name, true))) {
> slot->data.synced = true
> ...

IIUC, Hou-san also mentioned the same idea, but it is not that straightforward because the user may have created a logical slot with the same name but with a few other different properties like two_phase, slot_type, etc. I think we can try to compare all such slot properties to ensure that we can overwrite the same name slot, but there is still a chance that we may overwrite a slot that the user has created for some other purpose.

Now, we may want to extend this functionality such that we give some knob to user which allows us to overwrite the existing slots with same name. Then user can use this knob (GUC or something else) when starting the node as standby after switchover and allow the overwrite for existing slots.

As mentioned by Hou-San and Dilip, I also think it is more important for the old node that comes as a standby to remove logical slots to avoid WAL accumulation.
For example, we can provide a function like pg_drop_all_slots() with a type parameter indicating logical or physical, and then utilities like patroni that provide switchover functionality can use that function to remove all existing slots (maybe keep the slots that are required for failover) when starting the node as a standby.

--
With Regards,
Amit Kapila.
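
To make the "compare all such slot properties, gated by a knob" idea concrete, here is a minimal standalone sketch in C. It is not PostgreSQL code: the `SlotProps` struct, the `OverwritePolicy` enum, and both functions are hypothetical names invented for illustration, and the set of properties compared is only an assumption about what such a check might cover. As noted in the message, even a full match cannot prove that the local slot was not created for some other purpose, which is why the overwrite is opt-in here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified view of a slot's properties.
 * This is NOT PostgreSQL's ReplicationSlot struct. */
typedef struct SlotProps
{
    const char *name;       /* slot name, e.g. "sub_test" */
    const char *plugin;     /* output plugin, e.g. "pgoutput" */
    const char *database;   /* database the logical slot belongs to */
    bool        is_logical; /* logical (true) or physical (false) */
    bool        two_phase;  /* two-phase decoding enabled */
    bool        failover;   /* slot is marked for failover */
} SlotProps;

/* Hypothetical stand-in for the "knob" (a GUC or something else). */
typedef enum OverwritePolicy
{
    OVERWRITE_NEVER,        /* keep today's behaviour: raise an error */
    OVERWRITE_IF_MATCHING   /* user opted in to overwriting matching slots */
} OverwritePolicy;

static bool
same_str(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* True only if every compared property of the local slot equals the
 * remote (primary) slot's and both are failover-enabled logical slots. */
bool
slot_props_match(const SlotProps *local, const SlotProps *remote)
{
    assert(local != NULL && remote != NULL);

    return same_str(local->name, remote->name)
        && local->is_logical && remote->is_logical
        && same_str(local->plugin, remote->plugin)
        && same_str(local->database, remote->database)
        && local->two_phase == remote->two_phase
        && local->failover && remote->failover;
}

/* The overwrite is allowed only when the user enabled the knob AND the
 * properties match; a same-name slot alone is never sufficient. */
bool
may_overwrite_local_slot(const SlotProps *local, const SlotProps *remote,
                         OverwritePolicy policy)
{
    if (policy != OVERWRITE_IF_MATCHING)
        return false;
    return slot_props_match(local, remote);
}
```

With the policy left at `OVERWRITE_NEVER`, the sketch corresponds to the current behaviour described in the thread, where a same-name slot on the standby leads to an error rather than being adopted.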