Re: using an end-of-recovery record in all cases

Поиск
Список
Период
Сортировка
От Kyotaro Horiguchi
Тема Re: using an end-of-recovery record in all cases
Дата
Msg-id 20220420.104107.1281200824149550704.horikyota.ntt@gmail.com
обсуждение исходный текст
Ответ на Re: using an end-of-recovery record in all cases  (Nathan Bossart <nathandbossart@gmail.com>)
Список pgsql-hackers
At Tue, 19 Apr 2022 13:37:59 -0700, Nathan Bossart <nathandbossart@gmail.com> wrote in 
> On Mon, Apr 18, 2022 at 04:44:03PM -0400, Robert Haas wrote:
> > Here is a new version of the patch which, unlike v1, I think is
> > something we could seriously consider applying (not before v16, of
> > course). It now removes CHECKPOINT_END_OF_RECOVERY completely, and I
> > attach a second patch as well which nukes checkPoint.PrevTimeLineID as
> > well.
> 
> I'd like to add a big +1 for this change.  IIUC this should help with some
> of the problems I've noted elsewhere [0].

Agreed.

> > I mentioned two problems with $SUBJECT in the first email with this
> > subject line. One was a bug, which Noah has since fixed (thanks,
> > Noah). The other problem is that LogStandbySnapshot() and a bunch of
> > its friends expect latestCompletedXid to always be a normal XID, which
> > is a problem because (1) we currently set nextXid to 3 and (2) at
> > startup, we compute latestCompletedXid = nextXid - 1. The current code
> > dodges this kind of accidentally: the checkpoint that happens at
> > startup is a "shutdown checkpoint" and thus skips logging a standby
> > snapshot, since a shutdown checkpoint is a sure indicator that there
> > are no running transactions. With the changes, the checkpoint at
> > startup happens after we've started allowing write transactions, and
> > thus a standby snapshot needs to be logged also. In the cases where
> > the end-of-recovery record was already being used, the problem could
> > have happened already, except for the fact that those cases involve a
> > standby promotion, which doesn't happen during initdb. I explored a
> > few possible ways of solving this problem.
> 
> Shouldn't latestCompletedXid be set to MaxTransactionId in this case?  Or
> is this related to the logic in FullTransactionIdRetreat() that avoids
> skipping over the "actual" special transaction IDs?

As the result FullTransactionIdRetreat(FirstNormalFullTransactionId)
results in FrozenTransactionId, which looks odd.  It seems to me
rather should be InvalidFullTransactionId, or simply should assert-out
that case.  But incrmenting the very first xid avoid all that
complexity.  It is somewhat hacky but very smiple and understandable.

> > So ... I decided that the best approach here seems to be to just skip
> > FirstNormalTransactionId and use FirstNormalTransactionId + 1 for the
> > first write transaction that the cluster ever processes. That's very
> > simple and doesn't seem likely to break anything else. On the downside
> > it seems a bit grotty, but I don't see anything better, and on the
> > whole, I think with this approach we come out substantially ahead.
> > 0001 removes 3 times as many lines as it inserts, and 0002 saves a few
> > more lines of code.
> 
> This doesn't seem all that bad to me.  It's a little hacky, but it's very
> easy to understand and only happens once per initdb.  I don't think it's
> worth any extra complexity.

+1.

> > Now, I still don't really know that there isn't some theoretical
> > difficulty here that makes this whole approach a non-starter, but I
> > also can't think of what it might be. If the promotion case, which has
> > used the end-of-recovery record for many years, is basically safe,
> > despite the fact that it switches TLIs, then it seems to me that the
> > crash recovery case, which doesn't have that complication, ought to be
> > safe too. But I might well be missing something, so if you see a
> > problem, please speak up!
> 
> Your reasoning seems sound to me.
> 
> [0] https://postgr.es/m/C1EE64B0-D4DB-40F3-98C8-0CED324D34CB%40amazon.com

FWIW, I don't find a flaw in the reasoning, too.

By the way do we need to leave CheckPoint.PrevTimeLineID? It is always
the same value with ThisTimeLineID.  The most dubious part is
ApplyWalRecord but XLOG_CHECKPOINT_SHUTDOWN no longer induces timeline
switch.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: minor MERGE cleanups
Следующее
От: Peter Smith
Дата:
Сообщение: Re: Handle infinite recursion in logical replication setup