Re: Race condition in recovery?

From: Kyotaro Horiguchi
Subject: Re: Race condition in recovery?
Msg-id: 20210511.171157.600145309913652528.horikyota.ntt@gmail.com
In response to: Re: Race condition in recovery?  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses: Re: Race condition in recovery?  (Dilip Kumar <dilipbalaut@gmail.com>)
List: pgsql-hackers
At Mon, 10 May 2021 14:27:21 +0530, Dilip Kumar <dilipbalaut@gmail.com> wrote in 
> On Mon, May 10, 2021 at 2:05 PM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> 
> > I thought that the reason for using receiveTLI instead of
> > recoveryTargetTLI here is that there's a case where receiveTLI is the
> > future of recoveryTargetTLI, but I haven't managed to produce such a
> > situation.  If I set recoveryTargetTLI to a TLI that the standby doesn't
> > know but the primary knows, validateRecoveryParameters immediately
> > complains about that before reaching there.  Anyway, the attached
> > assumes receiveTLI may be the future of recoveryTargetTLI.
> 
> If you look at the note in this commit, it says "without the timeline
> history file", so is it trying to say that although receiveTLI is an
> ancestor of recoveryTargetTLI, that cannot be detected because the
> TL.history file is absent?

Yeah, it reads so to me and it works as described.  What I don't
understand is why the patch uses receiveTLI, not recoveryTargetTLI,
to load the timeline history in WaitForWALToBecomeAvailable.  The only
possible reason is that there could be a case where receiveTLI is the
future of recoveryTargetTLI.  However, AFAICS that case cannot happen.
At replication start, the requested TLI is either that of the last
checkpoint, which is the same as recoveryTargetTLI, or one somewhere in
the existing expectedTLEs, which must be in the past of
recoveryTargetTLI.  That seems to have been true already at the time
replication was made able to follow a timeline switch (abfd192b1b).

So I was tempted to just load the history for recoveryTargetTLI and then
confirm that receiveTLI is in that history.  Actually, that change
doesn't break any of the recovery TAP tests.  It is way simpler than
the last patch.  However, I'm not confident that it is right.. ;(
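
To make the idea concrete, here is a rough model of that check.  It is
only a sketch in Python, not the actual xlog.c change: the function
names mirror readTimeLineHistory and tliInHistory, but the
history_files dictionary, the TimelineHistoryEntry class and
choose_expected_tles are made up for illustration, and LSNs are plain
integers.  It assumes that a missing history file yields a one-entry
history, which is the situation the commit note quoted below describes.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class TimelineHistoryEntry:
    tli: int    # timeline ID
    begin: int  # LSN where this timeline starts
    end: int    # LSN where the next timeline forked off; 0 means still open


# Hypothetical stand-in for the .history files on disk:
# target TLI -> [(ancestor TLI, switch point LSN), ...], oldest first.
HistoryFiles = Dict[int, List[Tuple[int, int]]]


def read_timeline_history(target_tli: int,
                          history_files: HistoryFiles) -> List[TimelineHistoryEntry]:
    """Model of readTimeLineHistory: newest timeline first.

    If no history file exists for target_tli, the result contains only
    the target timeline itself.
    """
    entries = []
    prev_end = 0
    for ancestor_tli, switch_lsn in history_files.get(target_tli, []):
        entries.append(TimelineHistoryEntry(ancestor_tli, prev_end, switch_lsn))
        prev_end = switch_lsn
    entries.append(TimelineHistoryEntry(target_tli, prev_end, 0))
    entries.reverse()
    return entries


def tli_in_history(tli: int, history: List[TimelineHistoryEntry]) -> bool:
    """Model of tliInHistory: is tli the target or one of its ancestors?"""
    return any(entry.tli == tli for entry in history)


def choose_expected_tles(recovery_target_tli: int, receive_tli: int,
                         history_files: HistoryFiles) -> List[TimelineHistoryEntry]:
    """Load the history for recoveryTargetTLI, then confirm receiveTLI is in it."""
    history = read_timeline_history(recovery_target_tli, history_files)
    if not tli_in_history(receive_tli, history):
        raise ValueError(
            f"timeline {receive_tli} is not in the history of "
            f"timeline {recovery_target_tli}")
    return history


if __name__ == "__main__":
    # Timeline 3 forked from 2 at LSN 200, which forked from 1 at LSN 100.
    files = {3: [(1, 100), (2, 200)]}
    for entry in choose_expected_tles(3, 2, files):
        print(entry)
```

With the history file for timeline 3 present, receiving on timeline 2
passes the check.  With an empty history_files, the same call raises,
because timeline 2 cannot be recognized as an ancestor of timeline 3.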

> ee994272ca50f70b53074f0febaec97e28f83c4e
> Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>  2013-01-03 14:11:58
> Committer: Heikki Linnakangas <heikki.linnakangas@iki.fi>  2013-01-03 14:11:58
> .....
>   Without the timeline history file, recovering that file
>     will fail as the older timeline ID is not recognized to be an ancestor of
>     the target timeline. If you try to recover from such a backup, using only
>     streaming replication to fetch the WAL, this patch is required for that to
>     work.
> =====
> 
> >
> > I believe the 004_timeline_switch.pl detects your issue.  And the
> > attached change fixes it.
> 
> I think this fix looks better to me, but I will think more about it
> and give my feedback.  Thanks for quickly coming up with the
> reproducible test case.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


