Re: [PoC] pg_upgrade: allow to upgrade publisher node

Поиск
Список
Период
Сортировка
От vignesh C
Тема Re: [PoC] pg_upgrade: allow to upgrade publisher node
Дата
Msg-id CALDaNm1GOAtTQqxracVKx+C5QjBiJFuZ2+W+pbGy_bre-sf+Vw@mail.gmail.com
обсуждение исходный текст
Ответ на RE: [PoC] pg_upgrade: allow to upgrade publisher node  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Ответы RE: [PoC] pg_upgrade: allow to upgrade publisher node  ("Hayato Kuroda (Fujitsu)" <kuroda.hayato@fujitsu.com>)
Список pgsql-hackers
On Fri, 14 Apr 2023 at 16:00, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Julien,
>
> > Sorry for the delay, I didn't had time to come back to it until this afternoon.
>
> No issues, everyone is busy:-).
>
> > I don't think that your analysis is correct.  Slots are guaranteed to be
> > stopped after all the normal backends have been stopped, exactly to avoid such
> > extraneous records.
> >
> > What is happening here is that the slot's confirmed_flush_lsn is properly
> > updated in memory and ends up being the same as the current LSN before the
> > shutdown.  But as it's a logical slot and those records aren't decoded, the
> > slot isn't marked as dirty and therefore isn't saved to disk.  You don't see
> > that behavior when doing a manual checkpoint before (per your script comment),
> > as in that case the checkpoint also tries to save the slot to disk but then
> > finds a slot that was marked as dirty and therefore saves it.
> >
> > In your script's scenario, when you restart the server the previous slot data
> > is restored and the confirmed_flush_lsn goes backward, which explains those
> > extraneous records.
>
> So you meant to say that the key point was that some records which are not sent
> to subscriber do not mark slots as dirty, hence the updated confirmed_flush was
> not written into slot file. Is it right? LogicalConfirmReceivedLocation() is called
> by walsender when the process gets reply from worker process, so your analysis
> seems correct.
>
> > It's probably totally harmless to throw away that value for now (and probably
> > also doesn't lead to crazy amount of work after restart, I really don't know
> > much about the logical slot code), but clearly becomes problematic with your
> > usecase.  One easy way to fix this is to teach the checkpoint code to force
> > saving the logical slots to disk even if they're not marked as dirty during a
> > shutdown checkpoint, as done in the attached v1 patch (renamed as .txt to not
> > interfere with the cfbot).  With this patch applied I reliably only see a final
> > shutdown checkpoint record with your scenario.
> >
> > Now such a change will make shutdown a bit more expensive when using logical
> > replication, even if in 99% of cases you will not need to save the
> > confirmed_flush_lsn value, so I don't know if that's acceptable or not.
>
> In any case we these records must be advanced. IIUC, currently such records are
> read after rebooting but ingored, and this patch just skips them. I have not measured,
> but there is a possibility that is not additional overhead, just a trade-off.
>
> Currently I did not come up with another solution, so I have included your patch.
> Please see 0002.
>
> Additionally, I added a checking functions in 0003
> According to pg_resetwal and other functions, the length of CHECKPOINT_SHUTDOWN
> record seems (SizeOfXLogRecord + SizeOfXLogRecordDataHeaderShort + sizeof(CheckPoint)).
> Therefore, the function ensures that the difference between current insert position
> and confirmed_flush_lsn is less than (above + page header).

Logical replication slots can be created only if wal_level >= logical,
currently we do not have any check to see if wal_level >= logical if
"--include-logical-replication-slots" option is specified. If
include-logical-replication-slots is specified with pg_upgrade, we
will be creating replication slots after a lot of steps like
performing prechecks, analyzing, freezing, deleting, restoring,
copying, setting related objects and then create replication slot,
where we will be erroring out after a lot of time(Many cases
pg_upgrade takes a lot of hours to perform these operations). I feel
it would be better to add a check in the beginning itself somewhere in
check_new_cluster to see if wal_level is set appropriately in case of
include-logical-replication-slot option to detect and throw an error
early itself.

Regards,
Vignesh



В списке pgsql-hackers по дате отправления:

Предыдущее
От: vignesh C
Дата:
Сообщение: Re: [PoC] pg_upgrade: allow to upgrade publisher node
Следующее
От: Jim Jones
Дата:
Сообщение: Re: Tab completion for GRANT MAINTAIN