Re: [PoC] pg_upgrade: allow to upgrade publisher node

Поиск
Список
Период
Сортировка
От Masahiko Sawada
Тема Re: [PoC] pg_upgrade: allow to upgrade publisher node
Дата
Msg-id CAD21AoA=RQ3-o_u_xKXFAJa2GBTv1Caebi7oGXEZ2=LfwwmUNg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: [PoC] pg_upgrade: allow to upgrade publisher node  (Amit Kapila <amit.kapila16@gmail.com>)
Ответы Re: [PoC] pg_upgrade: allow to upgrade publisher node  (Amit Kapila <amit.kapila16@gmail.com>)
Список pgsql-hackers
On Thu, Aug 17, 2023 at 10:31 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Aug 17, 2023 at 6:07 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Aug 15, 2023 at 12:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Tue, Aug 15, 2023 at 7:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Mon, Aug 14, 2023 at 2:07 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Mon, Aug 14, 2023 at 7:57 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > > Another idea is (which might have already discussed thoguh) that we check if the latest shutdown checkpoint
LSNin the control file matches the confirmed_flush_lsn in pg_replication_slots view. That way, we can ensure that the
slothas consumed all WAL records before the last shutdown. We don't need to worry about WAL records generated after
startingthe old cluster during the upgrade, at least for logical replication slots. 
> > > > > >
> > > > >
> > > > > Right, this is somewhat closer to what Patch is already doing. But
> > > > > remember in this case we need to remember and use the latest
> > > > > checkpoint from the control file before the old cluster is started
> > > > > because otherwise the latest checkpoint location could be even updated
> > > > > during the upgrade. So, instead of reading from WAL, we need to change
> > > > > so that we rely on the control file's latest LSN.
> > > >
> > > > Yes, I was thinking the same idea.
> > > >
> > > > But it works for only replication slots for logical replication. Do we
> > > > want to check if no meaningful WAL records are generated after the
> > > > latest shutdown checkpoint, for manually created slots (or non-logical
> > > > replication slots)? If so, we would need to have something reading WAL
> > > > records in the end.
> > > >
> > >
> > > This feature only targets logical replication slots. I don't see a
> > > reason to be different for manually created logical replication slots.
> > > Is there something particular that you think we could be missing?
> >
> > Sorry I was not clear. I meant the logical replication slots that are
> > *not* used by logical replication, i.e., are created manually and used
> > by third party tools that periodically consume decoded changes. As we
> > discussed before, these slots will never be able to pass that
> > confirmed_flush_lsn check.
> >
>
> I think normally one would have a background process to periodically
> consume changes. Won't one can use the walsender infrastructure for
> their plugins to consume changes probably by using replication
> protocol?

Not sure.

> Also, I feel it is the plugin author's responsibility to
> consume changes or advance slot to the required position before
> shutdown.

How does the plugin author ensure that the slot consumes all WAL
records including shutdown_checkpoint before shutdown?

>
> > After some thoughts, one thing we might
> > need to consider is that in practice, the upgrade project is performed
> > during the maintenance window and has a backup plan that revert the
> > upgrade process, in case something bad happens. If we require the
> > users to drop such logical replication slots, they cannot resume to
> > use the old cluster in that case, since they would need to create new
> > slots, missing some changes.
> >
>
> Can't one keep the backup before removing slots?

Yes, but restoring the back could take time.

>
> > Other checks in pg_upgrade seem to be
> > compatibility checks that would eventually be required for the upgrade
> > anyway. Do we need to consider this case? For example, we do that
> > confirmed_flush_lsn check for only the slots with pgoutput plugin.
> >
>
> I think one is allowed to use pgoutput plugin even for manually
> created slots. So, such a check may not work.

Right, but I thought it's a very rare case.

Since the slot's flushed_confirmed_lsn check is not a compatibility
check unlike the existing check, I wonder if we can make it optional.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: WIP: new system catalog pg_wait_event
Следующее
От: Tom Lane
Дата:
Сообщение: Re: PostgreSQL 16 RC1 + GA release dates