RE: Newly created replication slot may be invalidated by checkpoint
From | Vitaly Davydov |
---|---|
Subject | RE: Newly created replication slot may be invalidated by checkpoint |
Date | |
Msg-id | 1596c1-68d40400-9-93b4080@17709609 |
In response to | RE: Newly created replication slot may be invalidated by checkpoint ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>) |
Responses | RE: Newly created replication slot may be invalidated by checkpoint |
List | pgsql-hackers |
Dear Amit, Hayato

On Wednesday, September 24, 2025 14:31 MSK, "Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com> wrote:

>> I was thinking some more about this solution. Won't it lead to the
>> same problem if ReplicationSlotReserveWal() calls
>> ReplicationSlotsComputeRequiredLSN() after the above calculation of
>> checkpointer?

> Exactly. I verified that in your patch, the invalidation can still happen if
> we cannot finish the LSN computation before the KeepLogSegments().

Yes. The WAL reservation actually takes effect at the call to ReplicationSlotsComputeRequiredLSN, which updates the oldest LSN required by the slots (XLogCtl->replicationSlotMinLSN). If that call happens between KeepLogSeg and RemoveOldXlogFiles, the reservation is not taken into account.

This behaviour seems to have existed before commit 2090edc6f32f652a2c, but the probability of the race condition was very low because of the short time between KeepLogSeg and RemoveOldXlogFiles. Commit 2090edc6f32f652a2c increased the probability of the race condition because CheckPointGuts can take a long time to execute.

The attached patch doesn't solve the original problem completely, but it reduces the probability of the race condition to what it was before the commit. I propose to apply this patch and then think about how to resolve the race condition, which seems to be present in 18 and master as well.

I updated the patch, improving some comments as suggested by Amit.

With best regards,
Vitaly
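The ordering problem described in the message can be illustrated with a small toy model. This is only a sketch, not PostgreSQL code: the class `ToyWal`, its methods, the segment size, and all LSN values are hypothetical stand-ins chosen for illustration. The names only loosely mirror KeepLogSeg, RemoveOldXlogFiles, and the update of XLogCtl->replicationSlotMinLSN. The model assumes the checkpointer computes a removal cutoff once and later removes everything older than that cutoff without looking at the slot minimum again.

```python
SEGMENT_SIZE = 16  # hypothetical WAL segment size, in toy LSN units


class ToyWal:
    """Toy stand-in for the WAL directory plus the shared slot minimum LSN."""

    def __init__(self, segments, redo_lsn):
        self.segments = set(segments)  # segment numbers still on disk
        self.redo_lsn = redo_lsn       # redo pointer of the running checkpoint
        self.slot_min_lsn = None       # models XLogCtl->replicationSlotMinLSN

    def keep_log_seg(self):
        """Compute the oldest segment to keep (loosely models KeepLogSeg)."""
        keep = self.redo_lsn
        if self.slot_min_lsn is not None:
            keep = min(keep, self.slot_min_lsn)
        return keep // SEGMENT_SIZE

    def remove_old_files(self, cutoff_seg):
        """Drop every segment older than the cutoff computed earlier."""
        self.segments = {s for s in self.segments if s >= cutoff_seg}

    def reserve_wal(self, restart_lsn):
        """Publish the slot's restart LSN (the moment the reservation counts)."""
        self.slot_min_lsn = restart_lsn

    def slot_is_valid(self):
        """The slot survives only if its restart segment is still on disk."""
        return self.slot_min_lsn // SEGMENT_SIZE in self.segments


def run(reserve_before_cutoff):
    wal = ToyWal(segments=range(6), redo_lsn=5 * SEGMENT_SIZE)
    restart_lsn = 2 * SEGMENT_SIZE + 3  # arbitrary LSN inside segment 2

    if reserve_before_cutoff:
        wal.reserve_wal(restart_lsn)    # reservation is visible to the cutoff
    cutoff = wal.keep_log_seg()
    if not reserve_before_cutoff:
        wal.reserve_wal(restart_lsn)    # lands in the window: cutoff is stale
    wal.remove_old_files(cutoff)
    return wal.slot_is_valid()


if __name__ == "__main__":
    print("reserved before cutoff:", run(True))
    print("reserved inside window:", run(False))
```

In this model the slot stays valid when the reservation is published before the cutoff is computed, and loses its segment when the reservation lands between the two steps. The longer the gap between those steps, the more likely a reservation falls into it, which is the effect the message attributes to commit 2090edc6f32f652a2c.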