Re: BUG #17903: There is a bug in the KeepLogSeg()
От | Junwang Zhao |
---|---|
Тема | Re: BUG #17903: There is a bug in the KeepLogSeg() |
Дата | |
Msg-id | CAEG8a3L2BB5k9USO4fC_wQjULGX-1rqaX2JSHn1_0vco8KB28Q@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: BUG #17903: There is a bug in the KeepLogSeg() (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Ответы |
Re: BUG #17903: There is a bug in the KeepLogSeg()
|
Список | pgsql-bugs |
If `segno` can be larger than `currSegNo`, your patch seems to miss the following branch, are you missing this for some particular reason? ``` /* but, keep at least wal_keep_size if that's set */ if (wal_keep_size_mb > 0) { uint64 keep_segs; keep_segs = ConvertToXSegs(wal_keep_size_mb, wal_segment_size); if (currSegNo - segno < keep_segs) <<<< see here { /* avoid underflow, don't go below 1 */ if (currSegNo <= keep_segs) segno = 1; else segno = currSegNo - keep_segs; } } ``` On Thu, Apr 20, 2023 at 11:04 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > At Wed, 19 Apr 2023 10:26:13 +0000, PG Bug reporting form <noreply@postgresql.org> wrote in > > I found that KeepLogSeg() has a piece of code that is not correctly. > > > > segno may be larger than currSegNo, since the slot_keep_segs variable is of > > type "uint64", in this case the code "if (currSegNo - segno > > > slot_keep_segs)" is incorrect. > > > > "if (currSegNo - segno < keep_segs)" is also the same. > > > > Checkpoint calls the KeepLogSeg function, and there are many operations > > between recptr and XLogGetReplicationSlotMinimumLSN, including updating the > > pg_control file, so segno may be larger than currSegNo. > > Correct. Thanks for the report. > > If checkpointer somehow takes a long time between inserting a > checkpoint record and removing WAL files, while replication advances a > certain distnace, it can actually happen. Although that behavior > doesn't directly affect max_slot_wal_keep_size, it does disrupt the > effect of wal_keep_size. > > The thinko was that we incorrectly assumed the slot minimum LSN can't > be larger than the checkpoint record LSN. We don't need to consider > max_slot_wal_keep_size if the slot minimum LSN is already larger than > currSegNo. > > The attached fix works. However, I can't come up with a reasonable > testing script. > > This dates back to 13, where max_slot_wal_keep_size was introduced. > > regards. > > > -- > Kyotaro Horiguchi > NTT Open Source Software Center -- Regards Junwang Zhao
В списке pgsql-bugs по дате отправления: