Re: [BUG] non archived WAL removed during production crash recovery
От | Kyotaro Horiguchi |
---|---|
Тема | Re: [BUG] non archived WAL removed during production crash recovery |
Дата | |
Msg-id | 20200421.120925.76453535156648004.horikyota.ntt@gmail.com обсуждение исходный текст |
Ответ на | Re: [BUG] non archived WAL removed during production crash recovery (Michael Paquier <michael@paquier.xyz>) |
Ответы |
Re: [BUG] non archived WAL removed during production crash recovery
|
Список | pgsql-bugs |
At Tue, 21 Apr 2020 11:15:01 +0900, Michael Paquier <michael@paquier.xyz> wrote in > On Mon, Apr 20, 2020 at 02:22:35PM +0200, Jehan-Guillaume de Rorthais wrote: > > The problem is that we would have to read the controldata file each time we > > wonder if a segment should be archived/removed. Moreover, the controldata > > file might not be in sync quickly enough with the real state for some other > > code path or futur needs. > > I don't think that this is what Horiguchi-san meant here. What I got > from his previous message would be to be to copy the shared value from > the control file when necessary, and have the shared state use only a > subset of the existing values of DBState, aka: > - DB_IN_CRASH_RECOVERY > - DB_IN_ARCHIVE_RECOVERY > - DB_IN_PRODUCTION First I thought as above, but I thought that we could use ControlFile->state itself in this case, by regarding the symbols less than DB_IN_CRASH_RECOVERY as RECOVERY_STATE_CRASH. I don't think there's no problem if the update to DB_IN_ARCHIVE_RECOVERY reaches checkpointer with some delay. > Still, that sounds wrong to me because then somebody would be tempted > to change the shared value thinking that things like DB_SHUTDOWNING, > DB_SHUTDOWNED_* or DB_STARTUP are valid but we don't want that here. That is not an issue if we just use DBState to know whether we have started archive recovery. > Note that there may be a case for DB_STARTUP to be used in > XLOGShmemInit(), but I'd rather let the code use the safest default, > DB_IN_CRASH_RECOVERY to control that we won't remove .ready files by > default until the startup process sees fit to do the actual switch > depending on the checkpoint record lookup, if archive recovery was > actually requested, etc. I'm not sure I read this correctly. But I think I agree to this. + if (!XLogArchivingAlways() && + GetRecoveryState() == RECOVERY_STATE_ARCHIVE) Is rewritten as + if (!XLogArchivingAlways() && + GetDBState() > DB_IN_CRASH_RECOVERY) FWIW, what annoyed me is there are three variables that are quite similar but has different domains, ControlFile->state, XLogCtl->SharedRecoveryState, and LocalRecoveryInProgress. I didn't mind there were two, but three seems a bit too many to me. But it may be different issue. > > Indeed, Benoît Lobréau reported this behavior to me. > > Noted. Thanks for the information. I don't think that I have ever > met Benoît in person, do I? Tell him that I owe him one beer or a > beverage of his choice when we meet IRL, and that he had better use > this message-id to make me keep my promise :) regards. -- Kyotaro Horiguchi NTT Open Source Software Center
В списке pgsql-bugs по дате отправления: