Re: Possible corruption by CreateRestartPoint at promotion
От | Kyotaro Horiguchi |
---|---|
Тема | Re: Possible corruption by CreateRestartPoint at promotion |
Дата | |
Msg-id | 20220427.104353.691219597727808197.horikyota.ntt@gmail.com обсуждение исходный текст |
Ответ на | Re: Possible corruption by CreateRestartPoint at promotion (Nathan Bossart <nathandbossart@gmail.com>) |
Ответы |
Re: Possible corruption by CreateRestartPoint at promotion
|
Список | pgsql-hackers |
At Tue, 26 Apr 2022 11:33:49 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in > On Wed, Mar 16, 2022 at 10:24:44AM +0900, Kyotaro Horiguchi wrote: > > While discussing on additional LSNs in checkpoint log message, > > Fujii-san pointed out [2] that there is a case where > > CreateRestartPoint leaves unrecoverable database when concurrent > > promotion happens. That corruption is "fixed" by the next checkpoint > > so it is not a severe corruption. > > I suspect we'll start seeing this problem more often once end-of-recovery > checkpoints are removed [0]. Would you mind creating a commitfest entry > for this thread? I didn't see one. I'm not sure the patch makes any change here, because restart points don't run while crash recovery, since no checkpoint records seen during a crash recovery. Anyway the patch doesn't apply anymore so rebased, but only the one for master for the lack of time for now. > > AFAICS since 9.5, no check(/restart)pionts won't run concurrently with > > restartpoint [3]. So I propose to remove the code path as attached. > > Yeah, this "quick hack" has been around for some time (2de48a8), and I > believe much has changed since then, so something like what you're > proposing is probably the right thing to do. Thanks for checking! > > /* Also update the info_lck-protected copy */ > > SpinLockAcquire(&XLogCtl->info_lck); > > - XLogCtl->RedoRecPtr = lastCheckPoint.redo; > > + XLogCtl->RedoRecPtr = RedoRecPtr; > > SpinLockRelease(&XLogCtl->info_lck); > > > > /* > > @@ -6984,7 +6987,10 @@ CreateRestartPoint(int flags) > > /* Update the process title */ > > update_checkpoint_display(flags, true, false); > > > > - CheckPointGuts(lastCheckPoint.redo, flags); > > + CheckPointGuts(RedoRecPtr, flags); > > I don't understand the purpose of these changes. Are these related to the > fix, or is this just tidying up? The latter, since the mixed use of two not-guaranteed-to-be-same variables at the same time for the same purpose made me perplexed (but I feel the change can hardly incorporated alone). However, you're right that it is irrelevant to the fix, so removed including other instances of the same. > [0] https://postgr.es/m/CA%2BTgmoY%2BSJLTjma4Hfn1sA7S6CZAgbihYd%3DKzO6srd7Ut%3DXVBQ%40mail.gmail.com regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Вложения
В списке pgsql-hackers по дате отправления: