Re: [PoC] pg_upgrade: allow to upgrade publisher node

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: [PoC] pg_upgrade: allow to upgrade publisher node
Дата
Msg-id CAA4eK1JmnJkNjYy3qGc-mbwhjOwpHSJ_RCL8b71ABmVNXhzfPA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PoC] pg_upgrade: allow to upgrade publisher node  (Masahiko Sawada <sawada.mshk@gmail.com>)
Ответы Re: [PoC] pg_upgrade: allow to upgrade publisher node  (Julien Rouhaud <rjuju123@gmail.com>)
Re: [PoC] pg_upgrade: allow to upgrade publisher node  (Masahiko Sawada <sawada.mshk@gmail.com>)
Список pgsql-hackers
On Sun, Aug 6, 2023 at 6:02 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Aug 2, 2023 at 5:13 PM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > > 4.
> > > + /*
> > > + * Check that all logical replication slots have reached the current WAL
> > > + * position.
> > > + */
> > > + res = executeQueryOrDie(conn,
> > > + "SELECT slot_name FROM pg_catalog.pg_replication_slots "
> > > + "WHERE (SELECT count(record_type) "
> > > + " FROM pg_catalog.pg_get_wal_records_content(confirmed_flush_lsn,
> > > pg_catalog.pg_current_wal_insert_lsn()) "
> > > + " WHERE record_type != 'CHECKPOINT_SHUTDOWN') <> 0 "
> > > + "AND temporary = false AND wal_status IN ('reserved', 'extended');");
> > >
> > > I think this can unnecessarily lead to reading a lot of WAL data if
> > > the confirmed_flush_lsn for a slot is too much behind. Can we think of
> > > improving this by passing the number of records to read which in this
> > > case should be 1?
> >
> > I checked and pg_wal_record_info() seemed to be used for the purpose. I tried to
> > move the functionality to core.
>
> IIUC the above query checks if the WAL record written at the slot's
> confirmed_flush_lsn is a CHECKPOINT_SHUTDOWN, but there is no check if
> this WAL record is the latest record.
>

Yeah, I also think there should be some way to ensure this. How about
passing the number of records to read to this API? Actually, that will
address my other concern as well where the current API can lead to
reading an unbounded number of records if the confirmed_flush_lsn
location is far behind the CHECKPOINT_SHUTDOWN. Do you have any better
ideas to address it?

> Therefore, I think it's quite
> possible that slot's confirmed_flush_lsn points to previous
> CHECKPOINT_SHUTDOWN, for example, in cases where the subscription was
> disabled after the publisher shut down and then some changes are made
> on the publisher. We might want to add that check too but it would not
> work. Because some WAL records could be written (e.g., by autovacuums)
> during pg_upgrade before checking the slot's confirmed_flush_lsn.
>

I think autovacuum is not enabled during the upgrade. See comment "Use
-b to disable autovacuum." in start_postmaster(). However, I am not
sure if there can't be any additional WAL from checkpointer or
bgwriter. Checkpointer has a code that ensures that if there is no
important WAL activity then it would be skipped. Similarly, bgwriter
also doesn't LOG xl_running_xacts unless there is an important
activity. I feel if there is a chance of any WAL activity during the
upgrade, we need to either change the check to ensure such WAL records
are expected or document the same in some way.

--
With Regards,
Amit Kapila.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andy Fan
Дата:
Сообщение: Re: Extract numeric filed in JSONB more effectively
Следующее
От: Thomas Munro
Дата:
Сообщение: Re: Postmaster self-deadlock due to PLT linkage resolution