Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly

Поиск
Список
Период
Сортировка
От Alexander Korotkov
Тема Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly
Дата
Msg-id CAPpHfdvpcboCAxbj+xPmCWNRZxG=JwW3qRsz4No_OvP3j03ERg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly  ("Vitaly Davydov" <v.davydov@postgrespro.ru>)
Список pgsql-hackers
Hi Vitaly!

On Fri, May 2, 2025 at 8:47 PM Vitaly Davydov <v.davydov@postgrespro.ru> wrote:
> Thank you for the attention to the patch. I updated a patch with a better
> solution for the master branch which can be easily backported to the other
> branches as we agree on the final solution.
>
> Two tests are introduced which are based on Tomas Vondra's test for logical slots
> with injection points from the discussion (patches: [1], [2], [3]). Tests
> are implemented as module tests in src/test/modules/test_replslot_required_lsn
> directory. I slightly modified the original test for logical slots and created a
> new simpler test for physical slots without any additional injection points.

The patchset doesn't seem to build after 371f2db8b0, which adjusted
the signature of the INJECTION_POINT() macro.  Could you, please,
update the patchset accordingly.

> I prepared a new solution (patch [4]) which is also based on Tomas Vondra's
> proposal. With a fresh eye, I realized that it can fix the issue as well. It is
> easy and less invasive to implement. The new solution differs from my original
> solution: it is backward compatible (doesn't require any changes in ReplicationSlot
> structure). My original solution can be backward compatible as well if to
> allocate flushed_restart_lsn in a separate array in shmem, not in the
> ReplicationSlot structure, but I believe the new solution is the better one. If
> you still think that my previous solution is the better (I don't think so), I
> will prepare a backward compatible patch with my previous solution.

I see in 0004 patch we're calling XLogGetReplicationSlotMinimumLSN()
before slots synchronization then use it for WAL truncation.
Generally looks good.  But what about the "if
(InvalidateObsoleteReplicationSlots(...))" branch?  It calls
XLogGetReplicationSlotMinimumLSN() again.  Why would the value
obtained from the latter call reflect slots as they are synchronized
to the disk?

------
Regards,
Alexander Korotkov
Supabase



В списке pgsql-hackers по дате отправления: