Re: Issue with logical replication slot during switchover

Поиск
Список
Период
Сортировка
От Fabrice Chapuis
Тема Re: Issue with logical replication slot during switchover
Дата
Msg-id CAA5-nLCVY=cyX9b27f0+-OB_K5DayDXyU93Y_xh6yw=AG956-Q@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Issue with logical replication slot during switchover  (Alexander Kukushkin <cyberdemn@gmail.com>)
Список pgsql-hackers
I propose to modify 

get_local_synced_slots() => to get the local logical slots (the name of the function must be renamed)
and 
local_sync_slot_required() => to get also the slots that have the flag synced = false

then in both our use cases the local slot will be dropped and recreated if it must be resynchronized.
Changes in synchronize_one_slot()  function are not necessary anymore.

static List *
get_local_synced_slots(void)
{
List    *local_slots = NIL;

LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);

for (int i = 0; i < max_replication_slots; i++)
{
ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];

/* Check if it is a synchronized slot */
if (s->in_use && (s->data.synced || s->data.failover))
{
Assert(SlotIsLogical(s));
local_slots = lappend(local_slots, s);
}
}

LWLockRelease(ReplicationSlotControlLock);

return local_slots;
}

static bool
local_sync_slot_required(ReplicationSlot *local_slot, List *remote_slots)
{
bool remote_exists = false;
bool locally_invalidated = false;
bool synced_slot = false;

foreach_ptr(RemoteSlot, remote_slot, remote_slots)
{
if (strcmp(remote_slot->name, NameStr(local_slot->data.name)) == 0)
{
remote_exists = true;

/*
* If remote slot is not invalidated but local slot is marked as
* invalidated, then set locally_invalidated flag.
*/
SpinLockAcquire(&local_slot->mutex);
locally_invalidated =
(remote_slot->invalidated == RS_INVAL_NONE) &&
(local_slot->data.invalidated != RS_INVAL_NONE);
synced_slot = local_slot->data.synced;
SpinLockRelease(&local_slot->mutex);

break;
}
}

return (remote_exists && !locally_invalidated && synced_slot);
}


I could propose a new patch in that sense.

Regards,

Fabrice


On Tue, Nov 4, 2025 at 2:01 PM Alexander Kukushkin <cyberdemn@gmail.com> wrote:


On Tue, 4 Nov 2025 at 12:15, Fabrice Chapuis <fabrice636861@gmail.com> wrote:
Hi Alexander,
I understand your use case but I think it won't work like this because of how is implemented get_local_synced_slots()
this function will return slots which synced flag to true only. 

Yeah. Sorry, my bad, you are absolutely right about get_local_synced_slots() function.
I didn't validate this part of the patch, because first I wanted to get feedback from hackers.
However, it looks like the people who build it never run Postgres in production and therefore don't understand the problem and the pain.
I will probably just add yet another workaround to Patroni and start dropping logical replication slots on standby with failover=true and synced=false.

--
Regards,
--
Alexander Kukushkin

В списке pgsql-hackers по дате отправления: