Re: [PoC] pg_upgrade: allow to upgrade publisher node

From	Amit Kapila
Subject	Re: [PoC] pg_upgrade: allow to upgrade publisher node
Date	July 18, 2023, 09:06:51
Msg-id	CAA4eK1L6fmTAGS3pY1YHGHhreg424wH6QwYbxqyV_7OF2AXGjw@mail.gmail.com
In reply to	Re: [PoC] pg_upgrade: allow to upgrade publisher node (Amit Kapila <amit.kapila16@gmail.com>)
Responses	RE: [PoC] pg_upgrade: allow to upgrade publisher node
List	pgsql-hackers

On Mon, Jul 17, 2023 at 6:19 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jun 30, 2023 at 7:29 PM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > I have analyzed more, and concluded that there is no difference between a manual
> > and a shutdown checkpoint.
> >
> > The difference was whether the CHECKPOINT record has been decoded or not.
> > The overall workflow of this test was:
> >
> > 1. do INSERT
> > (2. do CHECKPOINT)
> > (3. decode CHECKPOINT record)
> > 4. receive feedback message from standby
> > 5. do shutdown CHECKPOINT
> >
> > At step 3, the walsender decoded that WAL and set candidate_xmin_lsn. The stack trace was:
> > standby_decode()->SnapBuildProcessRunningXacts()->LogicalIncreaseXminForSlot().
> >
> > At step 4, the confirmed_flush of the slot was updated, but ReplicationSlotSave()
> > was executed only when slot->candidate_xmin_lsn had a valid LSN. If steps 2 and
> > 3 are missed, the dirty flag is not set and the change stays only in memory.
> >
> > Finally, the CHECKPOINT was executed at step 5. If steps 2 and 3 are missed and
> > the patch from Julien is not applied, the updated value will be discarded. This
> > is what I observed. The patch forces the logical slot to be saved at the shutdown
> > checkpoint, so the confirmed_flush is saved to disk at step 5.
> >
>
> I see your point, but there are comments in walsender.c which indicate
> that we also wait for step-5 to get replicated. See [1] and comments
> atop walsender.c. If this is true then we don't need a special check
> as you have in patch 0003, or at least it doesn't seem to be required
> in all cases.
>

I have studied this a bit more, and it seems that this is true for physical
walsenders: we set the walsender state to WALSNDSTATE_STOPPING
in XLogSendPhysical, then the checkpointer finishes writing the checkpoint
record, and then the postmaster sends SIGUSR2 for the walsender to exit. IIUC,
this whole logic of different stop states was introduced in
commit c6c3334364 based on the discussion in the thread [1]. As per my
understanding, logical walsenders don't seem to wait for the
shutdown checkpoint record and finish even before we LOG that
record. It seems that the behavior of logical walsenders is different
from physical walsenders, where we wait for them to send even the final
shutdown checkpoint record before they finish. If so, then we won't be
able to switch over to logical subscribers even in case of a clean
shutdown. Am I missing something?
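
The ordering described above can be sketched as a toy model. This is not
PostgreSQL code: the Walsender class, the clean_shutdown() function, and the
use of plain integers as LSNs are all hypothetical, and the model only encodes
the behavior as understood in this mail (logical walsenders exit before the
shutdown checkpoint record is written, physical ones send it before exiting).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Walsender:
    """Hypothetical stand-in for a walsender; not a PostgreSQL structure."""
    kind: str           # "physical" or "logical"
    sent_upto: int = 0  # last WAL position sent to the peer (toy integer LSN)


def clean_shutdown(wal_end: int, senders: List[Walsender]) -> int:
    """Model the shutdown ordering described above.

    Returns the toy LSN of the shutdown checkpoint record.
    """
    # Assumption from this mail: logical walsenders send what exists now
    # and exit before the shutdown checkpoint record is logged.
    for s in senders:
        if s.kind == "logical":
            s.sent_upto = wal_end

    # The checkpointer then writes the shutdown checkpoint record.
    checkpoint_lsn = wal_end + 1

    # Physical walsenders are in the stopping state, so they still send
    # the final checkpoint record before being told to exit.
    for s in senders:
        if s.kind == "physical":
            s.sent_upto = checkpoint_lsn

    return checkpoint_lsn


senders = [Walsender("physical"), Walsender("logical")]
ckpt = clean_shutdown(100, senders)
for s in senders:
    print(s.kind, "caught up" if s.sent_upto >= ckpt else "behind")
```

In this model the logical peer always ends up one record behind after a clean
shutdown, which is the gap that would block a switchover to a logical
subscriber if the understanding above is correct.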

[1] - https://www.postgresql.org/message-id/CAHGQGwEsttg9P9LOOavoc9d6VB1zVmYgfBk%3DLjsk-UL9cEf-eA%40mail.gmail.com

--
With Regards,
Amit Kapila.
