Discussion: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Ashutosh Bapat
Date:
On Tue, Dec 31, 2024 at 10:15 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Hi all,
>
> Logical decoding (and logical replication) are available only when
> wal_level = logical. As the documentation says[1], using the 'logical'
> level increases the WAL volume which could negatively affect the
> performance. For that reason, users might want to start with using
> 'replica', but when they want to use logical decoding they need a
> server restart to increase wal_level to 'logical'. My goal is to allow
> users who are using 'replica' level to use logical decoding without a
> server restart. There are other GUC parameters related to logical
> decoding and logical replication such as max_wal_senders,
> max_logical_replication_workers, and max_replication_slots, but even
> if users set these parameters >0, there would not be a noticeable
> performance impact. And their default values are already >0. So I'd
> like to focus on making only wal_level a dynamic GUC parameter.
> There are several earlier discussions[2][3] but no one has submitted
> patches unless I'm missing something.
>
> The first idea I came up with is to make the wal_level a PGC_SIGHUP
> parameter. However, it affects not only setting 'replica' to 'logical'
> but also setting 'minimal' to 'replica' or higher. I'm not sure the
> latter case is common and it might require a checkpoint. I don't want
> to make the patch complex for uncommon cases.
>
> The second idea is to somehow allow both WAL-logging logical info and
> logical decoding even when wal_level is 'replica'. I've attached a PoC
> patch for that. The patch introduces new SQL functions such as
> pg_activate_logical_decoding() and pg_deactivate_logical_decoding().
> These functions are available only when wal_level is 'replica' (or
> higher). In pg_activate_logical_decoding(), we set the status of
> logical decoding stored on the shared memory from 'disabled' to
> 'xlog-logical-info', allowing all processes to write logical
> information to WAL records for logical decoding. But the logical
> decoding is still not allowed. Once we confirm all in-progress
> transactions completed, we switch the status to
> 'logical-decoding-ready', meaning that users can create logical
> replication slots and use logical decoding.
>
> Overall, with the patch, there are two ways to enable logical
> decoding: setting wal_level to 'logical' and calling
> pg_activate_logical_decoding() when wal_level is 'replica'. I left the
> 'logical' level for backward compatibility and for users who want to
> enable the logical decoding without calling that SQL function. If we
> can automatically enable the logical decoding when creating the first
> logical replication slot, probably we no longer need the 'logical'
> level. There is room to discuss the user interface. Feedback is very
> welcome.
>
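
The status transitions described in the quoted proposal can be sketched in C as follows. This is only an illustration: the enum and function names are hypothetical and are not taken from the patch; only the three status labels come from the message above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; only the three status labels come from the thread. */
typedef enum LogicalDecodingStatus
{
    LD_DISABLED,            /* 'disabled' */
    LD_XLOG_LOGICAL_INFO,   /* 'xlog-logical-info' */
    LD_READY                /* 'logical-decoding-ready' */
} LogicalDecodingStatus;

/* Step 1: start WAL-logging logical information; decoding is not allowed yet. */
LogicalDecodingStatus
begin_activation(LogicalDecodingStatus cur)
{
    return (cur == LD_DISABLED) ? LD_XLOG_LOGICAL_INFO : cur;
}

/*
 * Step 2: once every transaction that was in progress at step 1 has
 * completed, logical decoding may be used.
 */
LogicalDecodingStatus
finish_activation(LogicalDecodingStatus cur, int old_xacts_in_progress)
{
    assert(old_xacts_in_progress >= 0);
    if (cur == LD_XLOG_LOGICAL_INFO && old_xacts_in_progress == 0)
        return LD_READY;
    return cur;
}

/* Processes write logical information in both non-disabled states. */
bool
writes_logical_info(LogicalDecodingStatus s)
{
    return s != LD_DISABLED;
}

/* Logical slots can be created only in the final state. */
bool
can_create_logical_slot(LogicalDecodingStatus s)
{
    return s == LD_READY;
}
```

The point of the intermediate state is that WAL written while old transactions are still running may lack logical information, so slot creation stays blocked until they are gone.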

If a server is running with wal_level = minimal and the user wants to
enable logical replication, they would still need a server restart.
That case would be rare but not nonexistent.

Our documentation says "wal_level determines how much information is
written to the WAL.". Users may not expect the WAL volume to change
while wal_level = replica depending on whether logical decoding is
enabled. It may be possible to set expectations correctly by changing
the documentation. That change is not in the patch, so I am not sure
whether it has been considered.

Cloud providers do not like multiple ways of changing configuration,
especially when they cannot control them. See [1]. Changing wal_level
through a SQL function may fall into the same category.

I agree that it would be a lot of work to make all combinations of
wal_level changes work, but changing wal_level through SIGHUP looks
like a cleaner solution. Is there a way to make the GUC SIGHUP but
disallow certain combinations of old and new values?
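
As an illustration of what "disallow certain combinations" could mean, here is a minimal standalone sketch of a transition check. The policy encoded in the table (only replica <-> logical may change on reload) is an assumption made for the example, and the function name is hypothetical. The enum mirrors the WalLevel constant names used in PostgreSQL but is redefined here so that the sketch compiles on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Redefined locally so the sketch is self-contained. */
typedef enum WalLevel
{
    WAL_LEVEL_MINIMAL = 0,
    WAL_LEVEL_REPLICA,
    WAL_LEVEL_LOGICAL
} WalLevel;

/*
 * Hypothetical check: may wal_level change from old_level to new_level
 * on a configuration reload?  Rows are the old value, columns the new one.
 * Assumed policy: anything that touches 'minimal' still needs a restart.
 */
bool
wal_level_change_allowed_on_reload(WalLevel old_level, WalLevel new_level)
{
    static const bool allowed[3][3] = {
        /* to:        minimal  replica  logical */
        /* minimal */ {true,   false,   false},
        /* replica */ {false,  true,    true},
        /* logical */ {false,  true,    true},
    };
    int         o = (int) old_level;
    int         n = (int) new_level;

    assert(o >= 0 && o < 3 && n >= 0 && n < 3);
    return allowed[o][n];
}
```

A reload that requests a disallowed change would presumably keep the old value and report that a restart is required, which is how a restart-only parameter behaves on reload today.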

[1]
https://www.postgresql.org/message-id/flat/CA%2BVUV5rEKt2%2BCdC_KUaPoihMu%2Bi5ChT4WVNTr4CD5-xXZUfuQw%40mail.gmail.com

--
Best Wishes,
Ashutosh Bapat



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Thu, Jan 9, 2025 at 3:29 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> If a server is running with wal_level = minimal and the user wants to
> enable logical replication, they would still need a server restart.
> That case would be rare but not nonexistent.

Currently we don't allow the server to start with the 'minimal' level
and max_wal_senders > 0. Even if we support changing 'minimal' to
'logical' without a server restart, we still need a server restart to
increase max_wal_senders for users who want to use logical
replication. Or we need to eliminate this restriction too. I guess it
would be too complex for such uncommon use cases.
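
A small standalone sketch of the constraint described above. The function names are hypothetical; the rule itself (no 'minimal' together with max_wal_senders > 0, and max_wal_senders changeable only at server start) is the one stated in the preceding paragraph.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum WalLevel
{
    WAL_LEVEL_MINIMAL = 0,
    WAL_LEVEL_REPLICA,
    WAL_LEVEL_LOGICAL
} WalLevel;

/* The server refuses to start with 'minimal' and max_wal_senders > 0. */
bool
startup_settings_ok(WalLevel wal_level, int max_wal_senders)
{
    return !(wal_level == WAL_LEVEL_MINIMAL && max_wal_senders > 0);
}

/*
 * Even if wal_level itself could be raised online, logical replication
 * needs WAL senders, and max_wal_senders can only change at server start.
 */
bool
restart_needed_for_logical_replication(WalLevel cur_level, int max_wal_senders)
{
    assert(startup_settings_ok(cur_level, max_wal_senders));
    return max_wal_senders == 0;
}
```

A server running at 'minimal' necessarily has max_wal_senders = 0, so the second function returns true for it regardless of how wal_level is changed.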

>
> Our documentation says "wal_level determines how much information is
> written to the WAL.". Users may not expect the WAL volume to change
> while wal_level = replica depending on whether logical decoding is
> enabled. It may be possible to set expectations correctly by changing
> the documentation. That change is not in the patch, so I am not sure
> whether it has been considered.

We should mention that in the docs. The WAL volume already depends not
only on wal_level but also on other parameters such as wal_log_hints
and full_page_writes.

> Cloud providers do not like multiple ways of changing configuration,
> especially when they cannot control them. See [1]. Changing wal_level
> through a SQL function may fall into the same category.

Thank you for pointing it out. This would support the idea of
automatically enabling logical decoding.

> I agree that it would be a lot of work to make all combinations of
> wal_level changes work, but changing wal_level through SIGHUP looks
> like a cleaner solution. Is there a way to make the GUC SIGHUP but
> disallow certain combinations of old and new values?

While I agree that it's cleaner, I think there is no way to do that
today. We would need to invent something for it.

Another idea would be to add a separate SIGHUP GUC parameter that
controls WAL-logging of logical information, while deprecating the
'logical' level over several releases. It doesn't need a SQL function,
but it could confuse users, since it would require two GUC parameters
for something that used to need only one.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Fri, Jan 10, 2025 at 12:33 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> While I agree that it's cleaner, I think there is no way to do that
> today. We would need to invent something for it.

I would like to summarize the proposed approaches thus far:

Regarding the user interface, there are three approaches:

1. Implementing SQL function controls (e.g.,
pg_activate_logical_decoding() and pg_deactivate_logical_decoding()).
This would enable users to activate logical decoding even with
wal_level=replica by calling the SQL function. While cloud providers
seem not to like having multiple configuration methods, this could
potentially be managed through appropriate EXECUTE privileges. Another
drawback is the user confusion when 'SHOW wal_level' displays
'replica' despite processes writing WAL records with logical
information. This might be dealt with by implementing a show_hook
function for wal_level.

2. Implementing automatic logical decoding activation. This would
trigger upon creation of the first logical slot and deactivate upon
removal of the final slot. This approach shares the user confusion
concern of the first proposal. Moreover, it presents a significant
limitation: users would be unable to utilize logical decoding on
standby servers without maintaining at least one logical slot on the
primary -- a substantial disadvantage.

3. Converting wal_level to a SIGHUP parameter, thereby supporting all
possible wal_level transition combinations. While this represents the
most elegant solution among the proposals, it necessitates additional
development effort for less common scenarios, such as transitioning
between 'minimal' and 'replica' levels. Such transitions require
specific handling -- for instance, changing between 'minimal' and
'replica' requires a checkpoint, while decreasing from 'replica' to
'minimal' necessitates terminating certain processes like WAL senders
and archiver.
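
The special handling listed under approach 3 can be summarized as a mapping from a transition to the extra work it needs. The sketch below is illustrative only: the flag and function names are hypothetical, and treating 'logical' like 'replica' when crossing the 'minimal' boundary is an assumption, since the message only discusses 'minimal' and 'replica'.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum WalLevel
{
    WAL_LEVEL_MINIMAL = 0,
    WAL_LEVEL_REPLICA,
    WAL_LEVEL_LOGICAL
} WalLevel;

/* Hypothetical action flags for an online wal_level change. */
enum
{
    ACT_NONE = 0,
    ACT_CHECKPOINT = 1 << 0,                /* crossing minimal <-> replica */
    ACT_STOP_SENDERS_AND_ARCHIVER = 1 << 1  /* when going down to minimal */
};

unsigned
required_actions(WalLevel old_level, WalLevel new_level)
{
    unsigned    actions = ACT_NONE;
    bool        old_min = (old_level == WAL_LEVEL_MINIMAL);
    bool        new_min = (new_level == WAL_LEVEL_MINIMAL);

    /* Changing between 'minimal' and a higher level requires a checkpoint. */
    if (old_min != new_min)
        actions |= ACT_CHECKPOINT;

    /* Lowering to 'minimal' also means WAL senders and archiver must stop. */
    if (!old_min && new_min)
        actions |= ACT_STOP_SENDERS_AND_ARCHIVER;

    /* In this model, stopping processes never happens without a checkpoint. */
    assert(!(actions & ACT_STOP_SENDERS_AND_ARCHIVER) ||
           (actions & ACT_CHECKPOINT));
    return actions;
}
```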

We also had a discussion (and I did some research) on the implementation
of increasing/decreasing wal_level online. The basic idea is that we
first enable logical information WAL-logging to all processes while
maintaining the logical decoding in an inactive state. Once we can
guarantee that all processes are writing WAL records with logical
information, we enable the logical decoding. This guarantee can be
achieved by waiting for all concurrent transactions to finish, which
could make us wait for a long time if a transaction is long-running.
Another way is to send a global barrier signal and wait for all
processes to start writing WAL records with logical information. We
have a good facility for that: EmitProcSignalBarrier() and
WaitForProcSignalBarrier(). That way, we don't need to wait for
transactions to finish.
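
To make the barrier idea concrete, here is a toy single-threaded model in standalone C. It is not PostgreSQL's EmitProcSignalBarrier()/WaitForProcSignalBarrier() API; every name below is hypothetical. It only shows the bookkeeping: a generation counter is bumped, each process acknowledges it when it next absorbs the barrier, and completion does not depend on whether a process is inside a transaction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NPROCS 3

typedef struct ToyProc
{
    uint64_t    acked_generation;   /* last barrier this process absorbed */
    bool        logs_logical_info;  /* process-local copy of the setting */
    bool        in_transaction;     /* never consulted by the barrier logic */
} ToyProc;

static uint64_t shared_generation = 0;
static bool shared_logical_info = false;
static ToyProc procs[NPROCS];

/* Publish the new setting and return the generation to wait for. */
uint64_t
toy_emit_barrier(bool enable_logical_info)
{
    shared_logical_info = enable_logical_info;
    return ++shared_generation;
}

/* What a process does when it notices the barrier. */
void
toy_absorb_barrier(int procno)
{
    assert(procno >= 0 && procno < NPROCS);
    procs[procno].logs_logical_info = shared_logical_info;
    procs[procno].acked_generation = shared_generation;
}

/* True once every process has absorbed the given generation. */
bool
toy_barrier_done(uint64_t generation)
{
    for (int i = 0; i < NPROCS; i++)
    {
        if (procs[i].acked_generation < generation)
            return false;
    }
    return true;
}
```

In this model, a process that is in the middle of a long transaction can still absorb the barrier, which is why the wait is bounded by how quickly processes react rather than by transaction length.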

Based on the discussion so far, idea 3 appears most promising. I
welcome any additional suggestions or preferences.

BTW, I'm writing a PoC patch for idea 3 that uses global barriers, and
I will share it.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Bertrand Drouvot
Date:
Hi,

On Wed, Jan 22, 2025 at 04:46:00PM -0800, Masahiko Sawada wrote:
> I would like to summarize the proposed approaches thus far:

Thanks!

> Regarding the user interface, there are three approaches:
> 
> 1. Implementing SQL function controls (e.g.,
> pg_activate_logical_decoding() and pg_deactivate_logical_decoding()).
> This would enable users to activate logical decoding even with
> wal_level=replica by calling the SQL function. While cloud providers
> seem not to like having multiple configuration methods, this could
> potentially be managed through appropriate EXECUTE privileges. Another
> drawback is the user confusion when 'SHOW wal_level' displays
> 'replica' despite processes writing WAL records with logical
> information. This might be dealt with by implementing a show_hook
> function for wal_level.
> 
> 2. Implementing automatic logical decoding activation. This would
> trigger upon creation of the first logical slot and deactivate upon
> removal of the final slot. This approach shares the user confusion
> concern of the first proposal. Moreover, it presents a significant
> limitation: users would be unable to utilize logical decoding on
> standby servers without maintaining at least one logical slot on the
> primary -- a substantial disadvantage.

Yeah, unless we keep wal_level around, but I agree that the following (3.) looks
like the way to go (as it removes any confusion).

> 3. Converting wal_level to a SIGHUP parameter, thereby supporting all
> possible wal_level transition combinations. While this represents the
> most elegant solution among the proposals,

+1

> it necessitates additional
> development effort for less common scenarios, such as transitioning
> between 'minimal' and 'replica' levels. Such transitions require
> specific handling -- for instance, changing between 'minimal' and
> 'replica' requires a checkpoint, while decreasing from 'replica' to
> 'minimal' necessitates terminating certain processes like WAL senders
> and archiver.

Yeah. OTOH, switching from replica to minimal is "dangerous" because it makes
previous base backups unusable for point-in-time recovery. So I wonder if it
wouldn't be better to keep a restart mandatory for some transitions (that
would probably make users think twice before making a transition that
requires a restart). I don't think any GUC does that already, but it might be
something to explore. Thoughts?

> We also had a discussion (and I did some research) on the implementation
> of increasing/decreasing wal_level online. The basic idea is that we
> first enable logical information WAL-logging to all processes while
> maintaining the logical decoding in an inactive state. Once we can
> guarantee that all processes are writing WAL records with logical
> information, we enable the logical decoding. This guarantee can be
> achieved by waiting for all concurrent transactions to finish, which
> could make us wait for a long time if a transaction is long-running.
> Another way is to send a global barrier signal and wait for all
> processes to start writing WAL records with logical information. We
> have a good facility for that: EmitProcSignalBarrier() and
> WaitForProcSignalBarrier(). That way, we don't need to wait for
> transactions to finish.

That sounds like a plan.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Ashutosh Bapat
Date:
On Thu, Jan 23, 2025 at 6:16 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I would like to summarize the proposed approaches thus far:
>
> Regarding the user interface, there are three approaches:
>
> 1. Implementing SQL function controls (e.g.,
> pg_activate_logical_decoding() and pg_deactivate_logical_decoding()).
> This would enable users to activate logical decoding even with
> wal_level=replica by calling the SQL function. While cloud providers
> seem not to like having multiple configuration methods, this could
> potentially be managed through appropriate EXECUTE privileges. Another
> drawback is the user confusion when 'SHOW wal_level' displays
> 'replica' despite processes writing WAL records with logical
> information. This might be dealt with by implementing a show_hook
> function for wal_level.
>
> 2. Implementing automatic logical decoding activation. This would
> trigger upon creation of the first logical slot and deactivate upon
> removal of the final slot. This approach shares the user confusion
> concern of the first proposal. Moreover, it presents a significant
> limitation: users would be unable to utilize logical decoding on
> standby servers without maintaining at least one logical slot on the
> primary -- a substantial disadvantage.
>
> 3. Converting wal_level to a SIGHUP parameter, thereby supporting all
> possible wal_level transition combinations. While this represents the
> most elegant solution among the proposals, it necessitates additional
> development effort for less common scenarios, such as transitioning
> between 'minimal' and 'replica' levels. Such transitions require
> specific handling -- for instance, changing between 'minimal' and
> 'replica' requires a checkpoint, while decreasing from 'replica' to
> 'minimal' necessitates terminating certain processes like WAL senders
> and archiver.
>
> We also had a discussion (and I did some research) on the implementation
> of increasing/decreasing wal_level online. The basic idea is that we
> first enable logical information WAL-logging to all processes while
> maintaining the logical decoding in an inactive state. Once we can
> guarantee that all processes are writing WAL records with logical
> information, we enable the logical decoding. This guarantee can be
> achieved by waiting for all concurrent transactions to finish, which
> could make us wait for a long time if a transaction is long-running.
> Another way is to send a global barrier signal and wait for all
> processes to start writing WAL records with logical information. We
> have a good facility for that: EmitProcSignalBarrier() and
> WaitForProcSignalBarrier(). That way, we don't need to wait for
> transactions to finish.
>
> Based on the discussion so far, idea 3 appears most promising. I
> welcome any additional suggestions or preferences.

I think this is the cleanest solution, but it is harder to implement.
Doing heavy work inside pg_reload_conf(), such as waiting for other
transactions to finish or waiting on a barrier, increases the chance
of delaying the configuration reload. Further, if an error occurs, some
settings might not be loaded. So the actual processing needs to happen
after the configuration has been loaded.
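
One way to picture "processing after the configuration has been loaded" is to have the reload path only record the request and let a later step do the slow work. The sketch below is a toy model with hypothetical names. It does not describe how the patch does it, and who runs the second step (for example a background worker) is left open.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum WalLevel
{
    WAL_LEVEL_MINIMAL = 0,
    WAL_LEVEL_REPLICA,
    WAL_LEVEL_LOGICAL
} WalLevel;

typedef struct WalLevelState
{
    WalLevel    effective;      /* level the server currently operates at */
    WalLevel    requested;      /* level found in the reloaded config */
    bool        change_pending;
} WalLevelState;

/* Reload path: cheap, cannot fail, never waits. */
void
on_config_reload(WalLevelState *st, WalLevel new_value)
{
    assert(st != NULL);
    st->requested = new_value;
    st->change_pending = (new_value != st->effective);
}

/*
 * Later step, outside reload processing.  The slow part (waiting for
 * transactions or a barrier) is represented by slow_step_succeeded.
 * Returns true if the effective level changed.
 */
bool
apply_pending_change(WalLevelState *st, bool slow_step_succeeded)
{
    assert(st != NULL);
    if (!st->change_pending || !slow_step_succeeded)
        return false;           /* nothing to do, or retry later */
    st->effective = st->requested;
    st->change_pending = false;
    return true;
}
```

With this split, a failure in the slow step leaves the request pending without affecting the other settings that were loaded in the same reload.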

--
Best Wishes,
Ashutosh Bapat



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Wed, Jan 22, 2025 at 11:42 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> > 3. Converting wal_level to a SIGHUP parameter, thereby supporting all
> > possible wal_level transition combinations. While this represents the
> > most elegant solution among the proposals,
>
> +1
>
> > it necessitates additional
> > development effort for less common scenarios, such as transitioning
> > between 'minimal' and 'replica' levels. Such transitions require
> > specific handling -- for instance, changing between 'minimal' and
> > 'replica' requires a checkpoint, while decreasing from 'replica' to
> > 'minimal' necessitates terminating certain processes like WAL senders
> > and archiver.
>
> Yeah. OTOH switching from replica to minimal is "dangerous" as it makes
> previous base backups unusable for point-in-time recovery. So I wonder if it
> wouldn't be better to keep a restart mandatory depending on the transition
> state (that would probably make users think "twice" before doing the
> transition that requires a restart). I don't think any GUC does that already but
> that might be something to explore, thoughts?

I'm concerned that such inconsistency could confuse users. The
consequences of changing wal_level to 'minimal' are already
documented, and I guess users would check which parameters are going to
be changed before reloading the config file. I'm not sure that
requiring a server restart only for lowering wal_level to 'minimal'
would really protect against unexpectedly making previous base
backups unusable.

It might make sense to raise a WARNING when ALTER SYSTEM lowers the
wal_level to 'minimal'.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



RE: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Sawada-san,

I love the idea. I've roughly tested the patch and it worked in my environment.
Here are initial comments...

1. xloglevelworker.c
```
+#include "replication/logicalxlog.h"
```

xloglevelworker.c includes replication/logicalxlog.h, but it does not exist.
The line had to be removed to build and test it.

2.
```
+static void
+writeUpdateWalLevel(int new_wal_level)
+{
+       XLogBeginInsert();
+       XLogRegisterData((char *) (&new_wal_level), sizeof(bool));
+       XLogInsert(RM_XLOG_ID, XLOG_UPDATE_WAL_LEVEL);
+}
```

IIUC the data length should be sizeof(int) instead of sizeof(bool).

3.
Is there a reason why the process does not wait till the archiver exits?

4.
When I dumped wal files, I found that XLOG_UPDATE_WAL_LEVEL cannot be recognized:

```
rmgr: XLOG        len (rec/tot):     27/    27, tx:          0, lsn: 0/03050838, prev 0/03050800, desc: UNKNOWN (f0)
wal_levellogical
 
```

xlog_identify() must be updated as well.

5.
When I changed "logical" to "replica", postgres logged the following:

```
LOG:  received SIGHUP, reloading configuration files
LOG:  parameter "wal_level" changed to "replica"
LOG:  wal_level control worker started
LOG:  changing wal_level from "logical" to "replica"
LOG:  wal_level has been decreased to "replica"
LOG:  successfully changed wal_level from "logical" to "replica"
```

ISTM that both the postmaster and the wal_level control worker said something like
"wal_level changed", which is a bit strange to me. Since the GUC can't be renamed,
can we use another name for the wal_level control state?

6.
With the patch present, the wal_level can be changed to 'minimal' even while
streaming replication is running. If we do that, the walsender exits immediately and
the FATAL below appears periodically until the standby stops. The same can be
said for logical replication:

```
FATAL:  streaming replication receiver "walreceiver" could not connect to the primary server:
connection to server on socket "/tmp/.s.PGSQL.oooo" failed:
FATAL:  WAL senders require "wal_level" to be "replica" or "logical"
```

I know this is not perfect, but can we avoid the issue by rejecting the GUC update
if a walsender exists? Another approach is not to update the value when replication
slots would need to be invalidated.

----------
Best regards,
Hayato Kuroda


Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Tue, Jan 28, 2025 at 1:39 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Sawada-san,
>
> I love the idea. I've roughly tested the patch and it worked in my environment.
> Here are initial comments...

Thank you for looking at the patch!

>
> 1. xloglevelworker.c
> ```
> +#include "replication/logicalxlog.h"
> ```
>
> xloglevelworker.c includes replication/logicalxlog.h, but it does not exist.
> The line had to be removed to build and test it.
>
> 2.
> ```
> +static void
> +writeUpdateWalLevel(int new_wal_level)
> +{
> +       XLogBeginInsert();
> +       XLogRegisterData((char *) (&new_wal_level), sizeof(bool));
> +       XLogInsert(RM_XLOG_ID, XLOG_UPDATE_WAL_LEVEL);
> +}
> ```
>
> IIUC the data length should be sizeof(int) instead of sizeof(bool).

Agreed to fix them.
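
To illustrate comment 2, here is a small standalone sketch (not code from the patch) of why registering only sizeof(bool) bytes of an int is a problem. The helper roundtrip_level() and the 0xAA poison value are made up for this demo; it assumes the reader of the record expects a full int and that sizeof(bool) < sizeof(int), which is the common case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy "len" bytes of "level" into a poisoned payload buffer, then read a
 * full int back, mimicking a redo routine that expects sizeof(int) bytes.
 * Hypothetical helper for illustration only.
 */
int
roundtrip_level(int level, size_t len)
{
	unsigned char payload[sizeof(int)];
	int			decoded;

	assert(len <= sizeof(payload));

	/* Bytes that were never registered keep this poison value. */
	memset(payload, 0xAA, sizeof(payload));
	memcpy(payload, &level, len);

	memcpy(&decoded, payload, sizeof(decoded));
	return decoded;
}
```

With len = sizeof(int) the value survives the round trip; with len = sizeof(bool) the bytes that were not copied stay poisoned, so the decoded value no longer matches.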

>
> 3.
> Is there a reason why the process does not wait till the archiver exits?

No. I didn't implement this part as the patch was just for
proof-of-concept. I think it would be better to wait for it to exit.

>
> 4.
> When I dumped wal files, I found that XLOG_UPDATE_WAL_LEVEL cannot be recognized:
>
> ```
> rmgr: XLOG        len (rec/tot):     27/    27, tx:          0, lsn: 0/03050838, prev 0/03050800, desc: UNKNOWN (f0)
> wal_levellogical
> ```
>
> xlog_identify() must be updated as well.

Will fix.
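
For reference, the kind of change comment 4 asks for can be sketched as below. This is a standalone illustration, not the real xlog_identify(): the function name, the SKETCH_ constants, and the choice of which other record type to show are assumptions, and the only value taken from this thread is 0xF0 from the "UNKNOWN (f0)" line in the dump.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative constants; only 0xF0 comes from the dump quoted above. */
#define SKETCH_INFO_MASK				0x0F
#define SKETCH_XLOG_CHECKPOINT_SHUTDOWN	0x00
#define SKETCH_XLOG_UPDATE_WAL_LEVEL	0xF0

/*
 * Map a record's info byte to a printable name, or NULL if unknown.
 * A dump tool would print "UNKNOWN (xx)" for the NULL case.
 */
const char *
sketch_xlog_identify(uint8_t info)
{
	const char *id = NULL;

	switch (info & ~SKETCH_INFO_MASK)
	{
		case SKETCH_XLOG_CHECKPOINT_SHUTDOWN:
			id = "CHECKPOINT_SHUTDOWN";
			break;
		case SKETCH_XLOG_UPDATE_WAL_LEVEL:
			/* The case that was missing for the new record type. */
			id = "UPDATE_WAL_LEVEL";
			break;
	}

	return id;
}
```

Without the second case, the lookup for 0xF0 falls through and returns NULL, which matches the UNKNOWN output reported above.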

>
> 5.
> When I changed "logical" to "replica", postgres outputs like below:
>
> ```
> LOG:  received SIGHUP, reloading configuration files
> LOG:  parameter "wal_level" changed to "replica"
> LOG:  wal_level control worker started
> LOG:  changing wal_level from "logical" to "replica"
> LOG:  wal_level has been decreased to "replica"
> LOG:  successfully changed wal_level from "logical" to "replica"
> ```
>
> ISTM that both the postmaster and the wal_level control worker said something like
> "wal_level changed", which is a bit strange to me. Since the GUC can't be renamed,
> can we use another name for the wal_level control state?

I'm concerned that users could be confused if two different names
refer to substantially the same thing.

Having said that, I guess that we need to drastically change the
messages. For example, I think that the wal_level worker should say
something like "successfully made 'logical' wal_level effective"
instead of saying something like "changed wal_level value". Also,
users might not need gradual messages when increasing 'minimal' to
'logical' or decreasing 'logical' to 'minimal'.

>
> 6.
> With the patch present, the wal_level can be changed to 'minimal' even while
> streaming replication is running. If we do that, the walsender exits immediately and
> the FATAL below appears periodically until the standby stops. The same can be
> said for logical replication:
>
> ```
> FATAL:  streaming replication receiver "walreceiver" could not connect to the primary server:
> connection to server on socket "/tmp/.s.PGSQL.oooo" failed:
> FATAL:  WAL senders require "wal_level" to be "replica" or "logical"
> ```
>
> I know this is not perfect, but can we avoid the issue by rejecting the GUC update
> if a walsender exists? Another approach is not to update the value when replication
> slots would need to be invalidated.

Does it mean that we reject the config file from being reloaded in
that case? I have no idea how to reject it in a case where the
wal_level in postgresql.conf changed and the user did 'pg_ctl reload'.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Mon, Feb 3, 2025 at 3:40 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Sawada-san,
>
> > I'm concerned that users could be confused if two different names
> > refer to substantially the same thing.
> >
> > Having said that, I guess that we need to drastically change the
> > messages. For example, I think that the wal_level worker should say
> > something like "successfully made 'logical' wal_level effective"
> > instead of saying something like "changed wal_level value". Also,
> > users might not need gradual messages when increasing 'minimal' to
> > 'logical' or decreasing 'logical' to 'minimal'.
>
> +1 for something like "successfully made 'logical' wal_level effective", and
> removing gradual messages.
>
> > > 6.
> > > With the patch present, the wal_level can be changed to 'minimal' even while
> > > streaming replication is running. If we do that, the walsender exits immediately
> > > and the FATAL below appears periodically until the standby stops. The same can
> > > be said for logical replication:
> > >
> > > ```
> > > FATAL:  streaming replication receiver "walreceiver" could not connect to the primary server:
> > > connection to server on socket "/tmp/.s.PGSQL.oooo" failed:
> > > FATAL:  WAL senders require "wal_level" to be "replica" or "logical"
> > > ```
> > >
> > > I know this is not perfect, but can we avoid the issue by rejecting the GUC update
> > > if a walsender exists? Another approach is not to update the value when
> > > replication slots would need to be invalidated.
> >
> > Does it mean that we reject the config file from being reloaded in
> > that case? I have no idea how to reject it in a case where the
> > wal_level in postgresql.conf changed and the user did 'pg_ctl reload'.
>
> I imagined something like the attached. When I changed wal_level to minimal and sent SIGHUP,
> the postmaster reported the lines below and failed to update wal_level.
>
> ```
> LOG:  received SIGHUP, reloading configuration files
> LOG:  wal_level cannot be set to "minimal" while walsender exists
> LOG:  configuration file "...postgresql.conf" contains errors; unaffected changes were applied
> ```

Interesting, and thanks for sharing the patch. But I think that when
we change the wal_level to 'minimal', there is a window where a new
walsender can launch after passing the check_wal_level() check.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Bertrand Drouvot
Date:
Hi,

On Tue, Feb 11, 2025 at 02:11:10PM -0800, Masahiko Sawada wrote:
> I've updated the patch that includes comment updates and bug fixes.

Thanks!

> The main idea of changing WAL level online is to decouple two aspects:
> (1) the information included in WAL records and (2) the
> functionalities available at each WAL level. With that, we can change
> the WAL level gradually. For example, when increasing the WAL level
> from 'replica' to 'logical', we first switch the WAL level on the
> shared memory to a new higher level where we allow processes to write
> WAL records with additional information required by the logical
> decoding, while keeping the logical decoding unavailable. The new
> level is something between 'replica' and 'logical'. Once we confirm
> all processes have synchronized to the new level, we increase the WAL
> level further to 'logical', allowing us to start logical decoding. The
> patch supports all combinations of WAL level transitions. It makes
> sense to me to use a background worker to proceed with this transition
> work since we need to wait at some points, rather than delegating it
> to the checkpointer process.
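
The two-step increase described in the quoted paragraph can be pictured as a small state machine. The sketch below is only an illustration under assumptions: the enum, its state names, and the all_synced flag are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical states for a 'replica' -> 'logical' increase. */
typedef enum SketchWalLevelState
{
	SKETCH_REPLICA,				/* no logical info in WAL, no decoding */
	SKETCH_LOGICAL_PENDING,		/* logical info written, decoding still off */
	SKETCH_LOGICAL				/* logical info written, decoding allowed */
} SketchWalLevelState;

/* Aspect (1): what information goes into WAL records. */
bool
sketch_writes_logical_info(SketchWalLevelState s)
{
	return s >= SKETCH_LOGICAL_PENDING;
}

/* Aspect (2): which functionality is available. */
bool
sketch_decoding_allowed(SketchWalLevelState s)
{
	return s == SKETCH_LOGICAL;
}

/*
 * Advance one step toward 'logical'. "all_synced" stands for "every
 * process has acknowledged the current state"; the final step is taken
 * only once that holds.
 */
SketchWalLevelState
sketch_step_up(SketchWalLevelState s, bool all_synced)
{
	switch (s)
	{
		case SKETCH_REPLICA:
			return SKETCH_LOGICAL_PENDING;
		case SKETCH_LOGICAL_PENDING:
			return all_synced ? SKETCH_LOGICAL : SKETCH_LOGICAL_PENDING;
		case SKETCH_LOGICAL:
			break;
	}
	return s;
}
```

The point of the intermediate state is that no process can start decoding while some other process might still be writing WAL without the logical information.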

The background worker being added is "wal_level control worker". I wonder if
it would make sense to create a more "generic" one instead (to whom we could 
assign more "tasks" later on, as suggested in the past in [1]).

+   /*
+    * XXX: Perhaps it's not okay that we failed to launch a bgworker and give
+    * up wal_level change because we already reported that the change has
+    * been accepted. Do we need to use aux process instead for that purpose?
+    */
+   if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
+       ereport(WARNING,
+               (errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
+                errmsg("out of background worker slots"),
+                errhint("You might need to increase \"%s\".", "max_worker_processes")));

Not sure it has to be an aux process instead, as it should be busy only on rare occasions.

Maybe we could add some mechanism for ensuring that a bgworker slot is available
when needed (as suggested in [2])?

Not saying it has to be done that way. I just thought that the "wal_level control worker"
could be a perfect use case/starting point for a more generic one, but I don't want
to overcomplicate this thread.

So maybe just rename "wal_level control worker" to, say, "custodian worker" and
we could also think about [2]? Feel free to consider all of this as nits if you
feel it deviates too much from the initial intent of this thread.

[1]: https://www.postgresql.org/message-id/flat/C1EE64B0-D4DB-40F3-98C8-0CED324D34CB%40amazon.com
[2]: https://www.postgresql.org/message-id/1058306.1680467858%40sss.pgh.pa.us

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Tue, Feb 11, 2025 at 11:44 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Tue, Feb 11, 2025 at 02:11:10PM -0800, Masahiko Sawada wrote:
> > I've updated the patch that includes comment updates and bug fixes.
>
> Thanks!
>
> > The main idea of changing WAL level online is to decouple two aspects:
> > (1) the information included in WAL records and (2) the
> > functionalities available at each WAL level. With that, we can change
> > the WAL level gradually. For example, when increasing the WAL level
> > from 'replica' to 'logical', we first switch the WAL level on the
> > shared memory to a new higher level where we allow processes to write
> > WAL records with additional information required by the logical
> > decoding, while keeping the logical decoding unavailable. The new
> > level is something between 'replica' and 'logical'. Once we confirm
> > all processes have synchronized to the new level, we increase the WAL
> > level further to 'logical', allowing us to start logical decoding. The
> > patch supports all combinations of WAL level transitions. It makes
> > sense to me to use a background worker to proceed with this transition
> > work since we need to wait at some points, rather than delegating it
> > to the checkpointer process.
>
> The background worker being added is "wal_level control worker". I wonder if
> it would make sense to create a more "generic" one instead (to whom we could
> assign more "tasks" later on, as suggested in the past in [1]).
>
> +   /*
> +    * XXX: Perhaps it's not okay that we failed to launch a bgworker and give
> +    * up wal_level change because we already reported that the change has
> +    * been accepted. Do we need to use aux process instead for that purpose?
> +    */
> +   if (!RegisterDynamicBackgroundWorker(&bgw, &bgw_handle))
> +       ereport(WARNING,
> +               (errcode(ERRCODE_CONFIGURATION_LIMIT_EXCEEDED),
> +                errmsg("out of background worker slots"),
> +                errhint("You might need to increase \"%s\".", "max_worker_processes")));
>
> Not sure it has to be an aux process instead, as it should be busy only on rare occasions.

Thank you for referring to the custodian worker thread. I'm not sure
that online wal_level change work would fit the concept of custodian
worker, which offloads some work from time-critical tasks such as
checkpointing, but this idea made me think of other possible
directions of this work.

Looking at the latest custodian worker patch, the basic architecture
is to have a single custodian worker and processes can ask it for some
work such as removing logical decoding related files. The online
wal_level change would be one of the tasks that processes (esp. the
checkpointer) can ask it to do. On the other hand, one point that I
think might not fit this wal_level work well is that while the
custodian worker is a long-lived worker process, it's sufficient for
the online wal_level change work to have a bgworker that does its work
and then exits. IOW, from the perspective of this work, I prefer the
idea of having one short-lived worker for one task over having one
long-lived worker for multiple tasks. Reading that thread, while we
need to resolve the XID wraparound issue for the work of removing
logical decoding related files, the work of removing temporary files
seems to fit a short-lived worker style. So I thought as one of the
directions, it might be worth considering to have an infrastructure
where we can launch a bgworker just for one task, and we implement the
online wal_level change and temporary files removal on top of it.

> Maybe we could add some mechanism for ensuring that a bgworker slot is available
> when needed (as suggested in [2])?

Yeah, we need this mechanism if we use a bgworker for these works.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Bertrand Drouvot
Date:
Hi,

On Fri, Feb 14, 2025 at 12:17:48AM -0800, Masahiko Sawada wrote:
> On Tue, Feb 11, 2025 at 11:44 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:

> Looking at the latest custodian worker patch, the basic architecture
> is to have a single custodian worker and processes can ask it for some
> work such as removing logical decoding related files. The online
> wal_level change would be one of the tasks that processes (esp. the
> checkpointer) can ask it to do. On the other hand, one point that I
> think might not fit this wal_level work well is that while the
> custodian worker is a long-lived worker process,

That was the case initially, but it looks like it would not have been the case
in the end. See Tom's comment in [1]:

"
I wonder if a single long-lived custodian task is the right model at all.
At least for RemovePgTempFiles, it'd make more sense to write it as a
background worker that spawns, does its work, and then exits,
independently of anything else
"

> it's sufficient for
> the online wal_level change work to have a bgworker that does its work
> and then exits.

Fully agree and I did not think about changing this behavior.

> IOW, from the perspective of this work, I prefer the
> idea of having one short-lived worker for one task over having one
> long-lived worker for multiple tasks.

Yeah, or one short-lived worker for multiple tasks could work too. It just
starts when it has something to do and then exits.

> Reading that thread, while we
> need to resolve the XID wraparound issue for the work of removing
> logical decoding related files, the work of removing temporary files
> seems to fit a short-lived worker style. So I thought as one of the
> directions, it might be worth considering to have an infrastructure
> where we can launch a bgworker just for one task, and we implement the
> online wal_level change and temporary files removal on top of it.

Yeap, that was exactly my point when I mentioned the custodian thread (taking
into account Tom's comment quoted above).

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Bertrand Drouvot
Date:
Hi,

On Mon, Feb 17, 2025 at 12:07:56PM -0800, Masahiko Sawada wrote:
> On Fri, Feb 14, 2025 at 2:35 AM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> > Yeap, that was exactly my point when I mentioned the custodian thread (taking
> > into account Tom's comment quoted above).
> >
> 
> I've written PoC patches to have the online wal_level change work use
> a more generic infrastructure. These patches are still in PoC state
> but seem like a good direction to me. Here is a brief explanation for
> each patch.

Thanks for the patches!

> * The 0001 patch introduces "reserved background worker slots". We
> allocate max_worker_processes + BGWORKER_CLASS_RESERVED at startup, and
> if the number of running bgworker exceeds max_worker_processes, only
> workers using the reserved slots can be launched. We can request to
> use the reserved slots by adding BGWORKER_CLASS_RESERVED flag at
> bgworker registration.

I had a quick look at 0001 and I think the way it's implemented is reasonable.
I thought this could be defined through a GUC so that extensions can benefit
from it. But OTOH the core code should ensure the value is at least the number of
reserved slots needed by core, so not using a GUC looks ok to me.
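
As a rough picture of the reserved-slot idea in the 0001 summary, here is a standalone sketch. It is not the patch's code: the struct, its field names, and the policy that a worker allowed to use reserved slots tries a regular slot first are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for regular and reserved worker slots. */
typedef struct SketchSlotPool
{
	int			max_regular;	/* stand-in for max_worker_processes */
	int			num_reserved;	/* slots kept back for internal tasks */
	int			regular_in_use;
	int			reserved_in_use;
} SketchSlotPool;

/*
 * Try to take a slot. Regular slots are used first; once they are
 * exhausted, only callers flagged as allowed may take a reserved one.
 * Returns true if a slot was acquired.
 */
bool
sketch_acquire_slot(SketchSlotPool *pool, bool may_use_reserved)
{
	assert(pool != NULL);

	if (pool->regular_in_use < pool->max_regular)
	{
		pool->regular_in_use++;
		return true;
	}

	if (may_use_reserved && pool->reserved_in_use < pool->num_reserved)
	{
		pool->reserved_in_use++;
		return true;
	}

	return false;
}
```

If the reserved count were exposed as a GUC, core would still need to guarantee that num_reserved covers the internal tasks that rely on it, which is the concern raised above.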

> * The 0002 patch introduces "bgtask worker". The bgtask infrastructure
> is designed to execute internal tasks in background in
> one-worker-per-one-task style. Internally, bgtask workers use the
> reserved bgworker so it's guaranteed that they can launch.

Yeah.

> The
> internal tasks that we can request are predefined and this patch has a
> dummy task as a placeholder. This patch implements only the minimal
> functionality for the online wal_level change work. I've not tested if
> this bgtask infrastructure can be used for tasks that we wanted to
> offload to the custodian worker.

Again, I had a quick look and it looks simple enough for our needs here. It "just"
executes "(void) InternalBgTasks[type].func()" and then exits. That's, I think,
a good starting point to add more tasks in the future (if we want to).
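
The one-worker-per-task dispatch mentioned above boils down to a table of function pointers indexed by task type. The following is a self-contained sketch of that pattern only; every name in it (the Sketch types, the task functions, the run counters) is hypothetical and chosen for the demo, not copied from 0002.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task identifiers. */
typedef enum SketchBgTaskType
{
	SKETCH_BGTASK_DUMMY = 0,
	SKETCH_BGTASK_WAL_LEVEL_CHANGE,
	SKETCH_NUM_BGTASKS
} SketchBgTaskType;

typedef struct SketchBgTask
{
	const char *name;
	int			(*func) (void);
} SketchBgTask;

/* Counts how often each task ran, so the demo has an observable effect. */
int			sketch_task_runs[SKETCH_NUM_BGTASKS];

static int
sketch_dummy_task(void)
{
	sketch_task_runs[SKETCH_BGTASK_DUMMY]++;
	return 0;
}

static int
sketch_wal_level_change_task(void)
{
	sketch_task_runs[SKETCH_BGTASK_WAL_LEVEL_CHANGE]++;
	return 0;
}

/* Predefined task table, indexed by task type. */
static const SketchBgTask SketchInternalBgTasks[] = {
	[SKETCH_BGTASK_DUMMY] = {"dummy", sketch_dummy_task},
	[SKETCH_BGTASK_WAL_LEVEL_CHANGE] = {"wal_level change", sketch_wal_level_change_task},
};

/* Worker entry point: run exactly one task, then return (i.e. exit). */
void
sketch_bgtask_main(SketchBgTaskType type)
{
	assert((int) type >= 0 && (int) type < (int) SKETCH_NUM_BGTASKS);
	(void) SketchInternalBgTasks[type].func();
}
```

Adding a task in this style means adding one enum value and one table entry; passing arguments to tasks would require extending the function signature.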

> * The 0003 patch makes wal_level a SIGHUP parameter. We do the online
> wal_level change work using the bgtask infrastructure. There are no
> major changes from the previous version other than that.

It replaces the dummy task introduced in 0002 by the one that suits our needs
here (through the new BgTaskWalLevelChange() function).

The design looks reasonable to me. Waiting to see if others disagree before
looking more closely at the code.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Wed, Feb 19, 2025 at 1:56 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,

Thank you for looking at the patches.

>
> On Mon, Feb 17, 2025 at 12:07:56PM -0800, Masahiko Sawada wrote:
> > On Fri, Feb 14, 2025 at 2:35 AM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > > Yeap, that was exactly my point when I mentioned the custodian thread (taking
> > > into account Tom's comment quoted above).
> > >
> >
> > I've written PoC patches to have the online wal_level change work use
> > a more generic infrastructure. These patches are still in PoC state
> > but seem like a good direction to me. Here is a brief explanation for
> > each patch.
>
> Thanks for the patches!
>
> > * The 0001 patch introduces "reserved background worker slots". We
> > allocate max_worker_processes + BGWORKER_CLASS_RESERVED at startup, and
> > if the number of running bgworker exceeds max_worker_processes, only
> > workers using the reserved slots can be launched. We can request to
> > use the reserved slots by adding BGWORKER_CLASS_RESERVED flag at
> > bgworker registration.
>
> I had a quick look at 0001 and I think the way it's implemented is reasonable.
> I thought this could be defined through a GUC so that extensions can benefit
> from it. But OTOH the core code should ensure the value is at least the number of
> reserved slots needed by core, so not using a GUC looks ok to me.

Interesting idea. I kept the reserved slots only for internal use but
it would be worth considering using a GUC instead.

> > * The 0002 patch introduces "bgtask worker". The bgtask infrastructure
> > is designed to execute internal tasks in background in
> > one-worker-per-one-task style. Internally, bgtask workers use the
> > reserved bgworker so it's guaranteed that they can launch.
>
> Yeah.
>
> > The
> > internal tasks that we can request are predefined and this patch has a
> > dummy task as a placeholder. This patch implements only the minimal
> > functionality for the online wal_level change work. I've not tested if
> > this bgtask infrastructure can be used for tasks that we wanted to
> > offload to the custodian worker.
>
> Again, I had a quick look and it looks simple enough for our needs here. It "just"
> executes "(void) InternalBgTasks[type].func()" and then exits. That's, I think,
> a good starting point to add more tasks in the future (if we want to).

Yeah, we might want to extend it further, for example to pass an
argument to the background task or to ask a single bgtask worker to
run multiple tasks. As far as I can read the custodian patch set,
the work of removing temp files does not seem to require any argument
though.

>
> > * The 0003 patch makes wal_level a SIGHUP parameter. We do the online
> > wal_level change work using the bgtask infrastructure. There are no
> > major changes from the previous version other than that.
>
> It replaces the dummy task introduced in 0002 by the one that suits our needs
> here (through the new BgTaskWalLevelChange() function).
>
> The design looks reasonable to me. Waiting to see if others disagree before
> looking more closely at the code.

Thanks.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Thu, Feb 20, 2025 at 10:05 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Feb 19, 2025 at 1:56 AM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > Hi,
>
> Thank you for looking at the patches.
>
> >
> > On Mon, Feb 17, 2025 at 12:07:56PM -0800, Masahiko Sawada wrote:
> > > On Fri, Feb 14, 2025 at 2:35 AM Bertrand Drouvot
> > > <bertranddrouvot.pg@gmail.com> wrote:
> > > > Yeap, that was exactly my point when I mentioned the custodian thread (taking
> > > > into account Tom's comment quoted above).
> > > >
> > >
> > > I've written PoC patches to have the online wal_level change work use
> > > a more generic infrastructure. These patches are still in PoC state
> > > but seem like a good direction to me. Here is a brief explanation for
> > > each patch.
> >
> > Thanks for the patches!
> >
> > > * The 0001 patch introduces "reserved background worker slots". We
> > > > allocate max_worker_processes + BGWORKER_CLASS_RESERVED at startup, and
> > > if the number of running bgworker exceeds max_worker_processes, only
> > > workers using the reserved slots can be launched. We can request to
> > > use the reserved slots by adding BGWORKER_CLASS_RESERVED flag at
> > > bgworker registration.
> >
> > I had a quick look at 0001 and I think the way it's implemented is reasonable.
> > I thought this could be defined through a GUC so that extensions can benefit
> > from it. But OTOH the core code should ensure the value is at least the number of
> > reserved slots needed by core, so not using a GUC looks ok to me.
>
> Interesting idea. I kept the reserved slots only for internal use but
> it would be worth considering to use GUC instead.
>
> > > * The 0002 patch introduces "bgtask worker". The bgtask infrastructure
> > > is designed to execute internal tasks in background in
> > > one-worker-per-one-task style. Internally, bgtask workers use the
> > > reserved bgworker so it's guaranteed that they can launch.
> >
> > Yeah.
> >
> > > The
> > > internal tasks that we can request are predefined and this patch has a
> > > dummy task as a placeholder. This patch implements only the minimal
> > > functionality for the online wal_level change work. I've not tested if
> > > this bgtask infrastructure can be used for tasks that we wanted to
> > > offload to the custodian worker.
> >
> > Again, I had a quick look and it looks simple enough for our needs here. It "just"
> > executes "(void) InternalBgTasks[type].func()" and then exits. That's, I think,
> > a good starting point to add more tasks in the future (if we want to).
>
> Yeah, we might want to extend it further, for example to pass an
> argument to the background task or to ask multiple tasks for the
> single bgtask worker. As far as I can read the custodian patch set,
> the work of removing temp files seems not to require any argument
> though.
>
> >
> > > * The 0003 patch makes wal_level a SIGHUP parameter. We do the online
> > > wal_level change work using the bgtask infrastructure. There are no
> > > major changes from the previous version other than that.
> >
> > It replaces the dummy task introduced in 0002 by the one that suits our needs
> > here (through the new BgTaskWalLevelChange() function).
> >
> > The design looks reasonable to me. Waiting to see if others disagree before
> > looking more closely at the code.
>
> Thanks.

I would like to discuss behavioral and user interface considerations.

Upon further analysis of this patch regarding the conversion of
wal_level to a SIGHUP parameter, I find that supporting all
combinations of wal_level value changes might make less sense.
Specifically, changing to or from 'minimal' would necessitate a
checkpoint, and reducing wal_level to 'minimal' would require
terminating physical replication, WAL archiving, and online backups.
While these operations demand careful consideration, there seems to be
no compelling use case for decreasing to 'minimal'. Furthermore,
increasing wal_level from 'minimal' is typically a one-time operation
during a database's lifetime. Therefore, we should weigh the benefits
against the implementation complexity.

One solution is to manage the effective WAL level using two distinct
GUC parameters: max_wal_level and wal_level. max_wal_level would be a
POSTMASTER parameter controlling the system's maximum allowable WAL
level, with values 'minimal', 'replica', and 'logical'. wal_level
would function as a SIGHUP parameter managing the runtime WAL level,
accepting values 'replica', 'logical', and 'auto'. The selected value
must be either 'auto' or not exceed max_wal_level. When set to 'auto',
wal_level automatically synchronizes with max_wal_level's value. This
approach would enable online WAL level transitions between 'replica'
and 'logical'.
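
To make the proposed rule concrete, here is a small standalone sketch of how the two settings could combine into an effective level. It only encodes the rules stated in the paragraph above; the enum and function names are hypothetical, and nothing here is taken from an actual patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical encoding; AUTO is only meaningful for wal_level. */
typedef enum SketchWalLevel
{
	SKETCH_WAL_MINIMAL = 0,
	SKETCH_WAL_REPLICA,
	SKETCH_WAL_LOGICAL,
	SKETCH_WAL_AUTO
} SketchWalLevel;

/*
 * max_wal_level accepts minimal/replica/logical. wal_level accepts
 * replica/logical/auto, and a non-auto value must not exceed max_wal_level.
 */
bool
sketch_wal_level_is_valid(SketchWalLevel max_wal_level, SketchWalLevel wal_level)
{
	if (max_wal_level == SKETCH_WAL_AUTO)
		return false;
	if (wal_level == SKETCH_WAL_AUTO)
		return true;
	if (wal_level == SKETCH_WAL_MINIMAL)
		return false;
	return wal_level <= max_wal_level;
}

/* 'auto' follows max_wal_level; anything else is used as given. */
SketchWalLevel
sketch_effective_wal_level(SketchWalLevel max_wal_level, SketchWalLevel wal_level)
{
	assert(sketch_wal_level_is_valid(max_wal_level, wal_level));
	return (wal_level == SKETCH_WAL_AUTO) ? max_wal_level : wal_level;
}
```

One consequence of these rules is that with max_wal_level = 'minimal' the only acceptable wal_level is 'auto', so the online transitions are effectively limited to 'replica' and 'logical'.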


Regarding logical decoding on standbys, currently both primary and
standby servers must have wal_level set to 'logical'. We need to
determine the appropriate behavior when users decrease the WAL level
from 'logical' to 'replica' through configuration file reload.

One approach would be to invalidate all logical replication slots on
the standby when transitioning to 'replica' WAL level. Although
incoming WAL records from the primary would still be written at
'logical' level, making logical decoding technically feasible, this
behavior seems logical as it reflects the user's intent to discontinue
logical decoding on the standby. For consistency, we might need to
invalidate logical slots during server startup if the WAL level is
insufficient.

Alternatively, we could permit logical decoding on the standby even
with wal_level set to 'replica'. However, this would necessitate
invalidating all logical replication slots during promotion,
potentially extending downtime during failover.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



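Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Amit Kapila
Date:
On Mon, Apr 21, 2025 at 11:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: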
>
> I would like to discuss behavioral and user interface considerations.
>
> Upon further analysis of this patch regarding the conversion of
> wal_level to a SIGHUP parameter, I find that supporting all
> combinations of wal_level value changes might make less sense.
> Specifically, changing to or from 'minimal' would necessitate a
> checkpoint, and reducing wal_level to 'minimal' would require
> terminating physical replication, WAL archiving, and online backups.
> While these operations demand careful consideration, there seems to be
> no compelling use case for decreasing to 'minimal'. Furthermore,
> increasing wal_level from 'minimal' is typically a one-time operation
> during a database's lifetime. Therefore, we should weigh the benefits
> against the implementation complexity.
>
> One solution is to manage the effective WAL level using two distinct
> GUC parameters: max_wal_level and wal_level. max_wal_level would be a
> POSTMASTER parameter controlling the system's maximum allowable WAL
> level, with values 'minimal', 'replica', and 'logical'. wal_level
> would function as a SIGHUP parameter managing the runtime WAL level,
> accepting values 'replica', 'logical', and 'auto'. The selected value
> must be either 'auto' or not exceed max_wal_level. When set to 'auto',
> wal_level automatically synchronizes with max_wal_level's value. This
> approach would enable online WAL level transitions between 'replica'
> and 'logical'.
>
>
> Regarding logical decoding on standbys, currently both primary and
> standby servers must have wal_level set to 'logical'. We need to
> determine the appropriate behavior when users decrease the WAL level
> from 'logical' to 'replica' through configuration file reload.
>
> One approach would be to invalidate all logical replication slots on
> the standby when transitioning to 'replica' WAL level. Although
> incoming WAL records from the primary would still be written at
> 'logical' level, making logical decoding technically feasible, this
> behavior seems logical as it reflects the user's intent to discontinue
> logical decoding on the standby. For consistency, we might need to
> invalidate logical slots during server startup if the WAL level is
> insufficient.
>
> Alternatively, we could permit logical decoding on the standby even
> with wal_level set to 'replica'. However, this would necessitate
> invalidating all logical replication slots during promotion,
> potentially extending downtime during failover.
>

BTW, did we consider the idea of automatically transitioning to
'logical' when the first logical slot is created and transitioning back
to 'replica' when the last logical slot gets dropped? I see some ideas
around this from the last time we discussed this topic[1].

[1] - https://www.postgresql.org/message-id/CAA4eK1J0we5qsZ-ZOwXPbZyvwdWbnT43knO2Cxidia2aHxZSJw%40mail.gmail.com

--
With Regards,
Amit Kapila.



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Wed, Apr 23, 2025 at 5:46 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Apr 21, 2025 at 11:01 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I would like to discuss behavioral and user interface considerations.
> >
> > Upon further analysis of this patch regarding the conversion of
> > wal_level to a SIGHUP parameter, I find that supporting all
> > combinations of wal_level value changes might make less sense.
> > Specifically, changing to or from 'minimal' would necessitate a
> > checkpoint, and reducing wal_level to 'minimal' would require
> > terminating physical replication, WAL archiving, and online backups.
> > While these operations demand careful consideration, there seems to be
> > no compelling use case for decreasing to 'minimal'. Furthermore,
> > increasing wal_level from 'minimal' is typically a one-time operation
> > during a database's lifetime. Therefore, we should weigh the benefits
> > against the implementation complexity.
> >
> > One solution is to manage the effective WAL level using two distinct
> > GUC parameters: max_wal_level and wal_level. max_wal_level would be a
> > POSTMASTER parameter controlling the system's maximum allowable WAL
> > level, with values 'minimal', 'replica', and 'logical'. wal_level
> > would function as a SIGHUP parameter managing the runtime WAL level,
> > accepting values 'replica', 'logical', and 'auto'. The selected value
> > must be either 'auto' or not exceed max_wal_level. When set to 'auto',
> > wal_level automatically synchronizes with max_wal_level's value. This
> > approach would enable online WAL level transitions between 'replica'
> > and 'logical'.
> >
> >
> > Regarding logical decoding on standbys, currently both primary and
> > standby servers must have wal_level set to 'logical'. We need to
> > determine the appropriate behavior when users decrease the WAL level
> > from 'logical' to 'replica' through configuration file reload.
> >
> > One approach would be to invalidate all logical replication slots on
> > the standby when transitioning to 'replica' WAL level. Although
> > incoming WAL records from the primary would still be written at
> > 'logical' level, making logical decoding technically feasible, this
> > behavior seems logical as it reflects the user's intent to discontinue
> > logical decoding on the standby. For consistency, we might need to
> > invalidate logical slots during server startup if the WAL level is
> > insufficient.
> >
> > Alternatively, we could permit logical decoding on the standby even
> > with wal_level set to 'replica'. However, this would necessitate
> > invalidating all logical replication slots during promotion,
> > potentially extending downtime during failover.
> >
>
> BTW, did we consider the idea to automatically transition to 'logical'
> when the first logical slot is created and transition back to
> 'replica' when last logical slot gets dropped? I see some ideas around
> this last time we discussed this topic.

Yes. Bertrand pointed out that a drawback is that the primary server
needs to create a logical slot in order to execute logical decoding on
the standbys[1].

Regards,

[1] https://www.postgresql.org/message-id/Z5DCm6xiBfbUdvX7%40ip-10-97-1-34.eu-west-3.compute.internal

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Wed, Apr 23, 2025 at 9:35 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Apr 23, 2025 at 5:46 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > BTW, did we consider the idea to automatically transition to 'logical'
> > when the first logical slot is created and transition back to
> > 'replica' when last logical slot gets dropped? I see some ideas around
> > this last time we discussed this topic.
>
> Yes. Bertrand pointed out that a drawback is that the primary server
> needs to create a logical slot in order to execute logical decoding on
> the standbys[1].
>

True, but if we want to avoid that, we can still keep 'logical' as
wal_level for the ease of users. We can also have another API like the
one you originally proposed (pg_activate_logical_decoding) for the
ease of users. But the minimum requirement would be that one creates a
logical slot to enable logical decoding/replication.

Additionally, shall we do some benchmarking, if not done already, to
show the cases where the performance and WAL volume can hurt users if
we make wal_level 'logical'?

--
With Regards,
Amit Kapila.



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Thu, Apr 24, 2025 at 5:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Apr 23, 2025 at 9:35 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Apr 23, 2025 at 5:46 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > BTW, did we consider the idea to automatically transition to 'logical'
> > > when the first logical slot is created and transition back to
> > > 'replica' when last logical slot gets dropped? I see some ideas around
> > > this last time we discussed this topic.
> >
> > Yes. Bertrand pointed out that a drawback is that the primary server
> > needs to create a logical slot in order to execute logical decoding on
> > the standbys[1].
> >
>
> True, but if we want to avoid that, we can still keep 'logical' as
> wal_level for the ease of users.

I think we'd like to cover the use case where users start with
'replica' on the primary and execute logical decoding on the standby
without either creating a logical slot on the primary or restarting
the primary.

> We can also have another API like the
> one you originally proposed (pg_activate_logical_decoding) for the
> ease of users. But the minimum requirement would be that one creates a
> logical slot to enable logical decoding/replication.

I think we want to avoid the runtime WAL level being automatically
decreased to 'replica' once all logical slots are removed, if users
still want to execute logical decoding on only the standby. One idea is
that if users enable logical decoding using
pg_activate_logical_decoding(), the runtime WAL level doesn't decrease
to 'replica' even if all logical slots are removed. But that would
require us to remember, in a persistent way, how logical decoding was
enabled. Also, I'm concerned that having three ways to enable logical
decoding could confuse users: the wal_level GUC parameter, creating at
least one logical slot, and pg_activate_logical_decoding().
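
To illustrate why the activation method would have to be remembered, here
is a rough sketch in Python. It is only a model of the idea, not
PostgreSQL code: the class DecodingState and its fields are hypothetical
names, and the rule "any one of the three ways yields 'logical'" is my
reading of the discussion rather than an agreed design.

```python
from dataclasses import dataclass, field


@dataclass
class DecodingState:
    """Hypothetical model of the three ways to enable logical decoding."""
    configured_wal_level: str = "replica"   # the wal_level GUC
    activated_by_api: bool = False          # pg_activate_logical_decoding() was called
    logical_slots: set = field(default_factory=set)

    def runtime_wal_level(self) -> str:
        # Any one of the three ways is enough to run at 'logical'.
        if (self.configured_wal_level == "logical"
                or self.activated_by_api
                or self.logical_slots):
            return "logical"
        return self.configured_wal_level


state = DecodingState()
state.logical_slots.add("slot_a")       # first logical slot raises the level
level_with_slot = state.runtime_wal_level()
state.logical_slots.discard("slot_a")   # last slot dropped: level falls back
level_without_slot = state.runtime_wal_level()
state.activated_by_api = True           # sticky: survives having no slots
level_after_api = state.runtime_wal_level()
```

In this model, activated_by_api is the piece of state that would need to
survive a restart, which is the persistence concern mentioned above.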

> Additionally, shall we do some benchmarking, if not done already, to
> show the cases where the performance and WAL volume can hurt users if
> we make wal_level as 'logical'?

I believe the impact would be significant, especially for tables with
REPLICA IDENTITY FULL. I agree it's worth benchmarking, but I guess the
results would not convince us to make 'logical' the default.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Hi,

On Mon, Apr 21, 2025 at 10:31:03AM -0700, Masahiko Sawada wrote:
> I would like to discuss behavioral and user interface considerations.
> 
> Upon further analysis of this patch regarding the conversion of
> wal_level to a SIGHUP parameter, I find that supporting all
> combinations of wal_level value changes might make less sense.
> Specifically, changing to or from 'minimal' would necessitate a
> checkpoint, and reducing wal_level to 'minimal' would require
> terminating physical replication, WAL archiving, and online backups.
> While these operations demand careful consideration, there seems to be
> no compelling use case for decreasing to 'minimal'. Furthermore,
> increasing wal_level from 'minimal' is typically a one-time operation
> during a database's lifetime. Therefore, we should weigh the benefits
> against the implementation complexity.

Agree.

> One solution is to manage the effective WAL level using two distinct
> GUC parameters: max_wal_level and wal_level. max_wal_level would be a
> POSTMASTER parameter controlling the system's maximum allowable WAL
> level, with values 'minimal', 'replica', and 'logical'. wal_level
> would function as a SIGHUP parameter managing the runtime WAL level,
> accepting values 'replica', 'logical', and 'auto'. The selected value
> must be either 'auto' or not exceed max_wal_level. When set to 'auto',
> wal_level automatically synchronizes with max_wal_level's value. This
> approach would enable online WAL level transitions between 'replica'
> and 'logical'.

That makes sense to me. I think that 'logical' could be the default value
for max_wal_level and 'replica' the default for wal_level.
I think that would provide almost the same user experience as today and would
allow a replica->logical change without a restart. Thoughts?
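
To make the rule concrete, here is a minimal sketch in Python of how the
two settings could combine. It is a model of the proposal as quoted
above, not PostgreSQL code; the function name effective_wal_level and
the error messages are made up for illustration.

```python
# Hypothetical model of the two-GUC proposal; not PostgreSQL source code.
LEVELS = {"minimal": 0, "replica": 1, "logical": 2}


def effective_wal_level(max_wal_level: str, wal_level: str) -> str:
    """Resolve the runtime WAL level from the two proposed settings."""
    if max_wal_level not in LEVELS:
        raise ValueError(f"invalid max_wal_level: {max_wal_level!r}")
    if wal_level == "auto":
        # 'auto' simply follows max_wal_level.
        return max_wal_level
    if wal_level not in ("replica", "logical"):
        # The proposal lists only 'replica', 'logical' and 'auto' for wal_level.
        raise ValueError(f"invalid wal_level: {wal_level!r}")
    if LEVELS[wal_level] > LEVELS[max_wal_level]:
        raise ValueError(
            f"wal_level {wal_level!r} exceeds max_wal_level {max_wal_level!r}")
    return wal_level
```

With max_wal_level = 'logical' and wal_level = 'replica', the model
resolves to 'replica', and reloading with wal_level = 'logical' resolves
to 'logical' with no change to max_wal_level. In the same model,
max_wal_level = 'minimal' with wal_level = 'replica' is rejected.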

> Regarding logical decoding on standbys, currently both primary and
> standby servers must have wal_level set to 'logical'. We need to
> determine the appropriate behavior when users decrease the WAL level
> from 'logical' to 'replica' through configuration file reload.
> 
> One approach would be to invalidate all logical replication slots on
> the standby when transitioning to 'replica' WAL level. Although
> incoming WAL records from the primary would still be written at
> 'logical' level, making logical decoding technically feasible, this
> behavior seems logical as it reflects the user's intent to discontinue
> logical decoding on the standby.

+1

> For consistency, we might need to
> invalidate logical slots during server startup if the WAL level is
> insufficient.

Not sure. Currently we'd not allow the standby to start:

"
LOG:  entering standby mode
FATAL:  logical replication slot "logical_slot" exists, but "wal_level" < "logical"
HINT:  Change "wal_level" to be "logical" or higher.
LOG:  startup process (PID 1790508) exited with exit code 1
"

I think that's a good guard against configuration mistakes. If the change was a
mistake, change wal_level back to logical and start. If it was not a mistake, change
back to logical, start, and then lower it with SIGHUP. OTOH, I also see the benefit
of being consistent between SIGHUP and startup.
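
The two behaviors being compared can be sketched as follows. This is a
Python model only: check_on_startup and on_reload_to_replica are
hypothetical names, the first mimics the FATAL error in the log above,
and the second mimics the proposed invalidation on reload.

```python
def check_on_startup(wal_level, logical_slots):
    """Model of today's guard: refuse to start if a logical slot exists
    while wal_level is below 'logical'."""
    if logical_slots and wal_level != "logical":
        raise RuntimeError(
            f'logical replication slot "{logical_slots[0]}" exists, '
            'but "wal_level" < "logical"')


def on_reload_to_replica(logical_slots):
    """Model of the proposed SIGHUP behavior: the server keeps running
    and every logical slot is marked invalidated."""
    return {name: "invalidated" for name in logical_slots}
```

Being consistent between the two paths would mean replacing the error in
the first function with the invalidation done in the second, which also
removes the guard against an accidental configuration change.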

> Alternatively, we could permit logical decoding on the standby even
> with wal_level set to 'replica'.

Yeah, technically speaking we could, as the WAL is coming from the primary (which
has wal_level set to logical).

> However, this would necessitate
> invalidating all logical replication slots during promotion,
> potentially extending downtime during failover.

Yeah, I'm tempted to vote to not allow logical decoding on the standby if the
wal_level is not logical.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



On Thu, Apr 24, 2025 at 11:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Apr 24, 2025 at 5:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Apr 23, 2025 at 9:35 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Apr 23, 2025 at 5:46 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > BTW, did we consider the idea to automatically transition to 'logical'
> > > > when the first logical slot is created and transition back to
> > > > 'replica' when last logical slot gets dropped? I see some ideas around
> > > > this last time we discussed this topic.
> > >
> > > Yes. Bertrand pointed out that a drawback is that the primary server
> > > needs to create a logical slot in order to execute logical decoding on
> > > the standbys[1].
> > >
> >
> > True, but if we want to avoid that, we can still keep 'logical' as
> > wal_level for the ease of users.
>
> I think we'd like to cover the use case like where users start with
> 'replica' on the primary and execute logical decoding on the standby
> without neither creating a logical slot on the primary nor restarting
> the primary.
>

Okay, if we introduce a SIGHUP GUC like max_wal_level as you are
proposing, the above requirement will be fulfilled, right? The other
way is via the pg_activate_logical_decoding() API.

> > We can also have another API like the
> > one you originally proposed (pg_activate_logical_decoding) for the
> > ease of users. But the minimum requirement would be that one creates a
> > logical slot to enable logical decoding/replication.
>
> I think we want to avoid the runtime WAL level automatically decreased
> to 'replica' once all logical slots are removed, if users still want
> to execute logical decoding on only the standby. One idea is that if
> users enable logical decoding using pg_activate_logical_decoding(),
> the runtime WAL level doesn't decrease to 'replica' even if all
> logical slots are removed.
>

That makes sense. If we are using an API like
pg_activate_*/pg_deactivate_*, then why add an additional dependency
on the slots?

--
With Regards,
Amit Kapila.



On Tue, May 6, 2025 at 12:19 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Mon, Apr 21, 2025 at 10:31:03AM -0700, Masahiko Sawada wrote:
> > I would like to discuss behavioral and user interface considerations.
> >
> > Upon further analysis of this patch regarding the conversion of
> > wal_level to a SIGHUP parameter, I find that supporting all
> > combinations of wal_level value changes might make less sense.
> > Specifically, changing to or from 'minimal' would necessitate a
> > checkpoint, and reducing wal_level to 'minimal' would require
> > terminating physical replication, WAL archiving, and online backups.
> > While these operations demand careful consideration, there seems to be
> > no compelling use case for decreasing to 'minimal'. Furthermore,
> > increasing wal_level from 'minimal' is typically a one-time operation
> > during a database's lifetime. Therefore, we should weigh the benefits
> > against the implementation complexity.
>
> Agree.
>
> > One solution is to manage the effective WAL level using two distinct
> > GUC parameters: max_wal_level and wal_level. max_wal_level would be a
> > POSTMASTER parameter controlling the system's maximum allowable WAL
> > level, with values 'minimal', 'replica', and 'logical'. wal_level
> > would function as a SIGHUP parameter managing the runtime WAL level,
> > accepting values 'replica', 'logical', and 'auto'. The selected value
> > must be either 'auto' or not exceed max_wal_level. When set to 'auto',
> > wal_level automatically synchronizes with max_wal_level's value. This
> > approach would enable online WAL level transitions between 'replica'
> > and 'logical'.
>
> That makes sense to me. I think that 'logical' could be the default value
> for max_wal_level and 'replica' the default for wal_level.
> I think that would provide almost the same user experience as currently and would
> allow replica->logical change without restart. Thoughts?

Those sound like reasonable default values. One thing we might want to
note is that when users want to set 'minimal', they would need to change
both parameters. With a default of wal_level='auto', users would simply
need to set max_wal_level='minimal'. However, given that it's more
common for users to start with 'replica' than 'minimal', it probably
makes sense to use the default values you suggested. This combination
of defaults maximizes the benefit of this idea while maintaining the
current default behavior.

One downside of this idea is that we introduce another GUC parameter,
so users would need to control the WAL level using two parameters,
which might not be very clear to them.

>
> > Regarding logical decoding on standbys, currently both primary and
> > standby servers must have wal_level set to 'logical'. We need to
> > determine the appropriate behavior when users decrease the WAL level
> > from 'logical' to 'replica' through configuration file reload.
> >
> > One approach would be to invalidate all logical replication slots on
> > the standby when transitioning to 'replica' WAL level. Although
> > incoming WAL records from the primary would still be written at
> > 'logical' level, making logical decoding technically feasible, this
> > behavior seems logical as it reflects the user's intent to discontinue
> > logical decoding on the standby.
>
> +1
>
> > For consistency, we might need to
> > invalidate logical slots during server startup if the WAL level is
> > insufficient.
>
> Not sure. Currently we'd not allow the standby to start:
>
> "
> LOG:  entering standby mode
> FATAL:  logical replication slot "logical_slot" exists, but "wal_level" < "logical"
> HINT:  Change "wal_level" to be "logical" or higher.
> LOG:  startup process (PID 1790508) exited with exit code 1
> "
>
> I think that's a good guard for configuration change mistakes. If that's a mistake
> change back to logical and start. If that's not a mistake then change back to
> logical, start, change with SIGHUP. OTOH I also see the benefits of being consistent
> between SIGHUP and start.

I see your point.

An alternative idea would be to somehow prevent wal_level from being
changed to 'replica' while there is at least one active logical slot
on the standby. I'll consider this idea too.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Tue, May 6, 2025 at 11:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Apr 24, 2025 at 11:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Apr 24, 2025 at 5:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Apr 23, 2025 at 9:35 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Wed, Apr 23, 2025 at 5:46 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > BTW, did we consider the idea to automatically transition to 'logical'
> > > > > when the first logical slot is created and transition back to
> > > > > 'replica' when last logical slot gets dropped? I see some ideas around
> > > > > this last time we discussed this topic.
> > > >
> > > > Yes. Bertrand pointed out that a drawback is that the primary server
> > > > needs to create a logical slot in order to execute logical decoding on
> > > > the standbys[1].
> > > >
> > >
> > > True, but if we want to avoid that, we can still keep 'logical' as
> > > wal_level for the ease of users.
> >
> > I think we'd like to cover the use case like where users start with
> > 'replica' on the primary and execute logical decoding on the standby
> > without neither creating a logical slot on the primary nor restarting
> > the primary.
> >
>
> Okay, if we introduce a SIGHUP GUC like max_wal_level as you are
> proposing, the above requirement will be fulfilled, right?

Right. Both the primary and the standby can increase the WAL level to
'logical' without a server restart or creating a logical slot.

> The other
> way is by API pg_activate_logical_decoding().

Yes. This approach would be simpler than the current proposal as we
don't need other new infrastructure such as executing a task in the
background. However, we might want to note that the wal_level value
would no longer show the actual runtime WAL level if logical decoding
is activated via this API. It's probably better to introduce a
read-only GUC, say runtime_wal_level, showing the actual WAL level.
Also, Ashutosh pointed out[1] before that cloud providers do not like
multiple ways of changing configuration, esp. when they cannot control
it. But I'm not sure this applies to the API, as it's a SQL function
whose access privilege can be controlled.

>
> > > We can also have another API like the
> > > one you originally proposed (pg_activate_logical_decoding) for the
> > > ease of users. But the minimum requirement would be that one creates a
> > > logical slot to enable logical decoding/replication.
> >
> > I think we want to avoid the runtime WAL level automatically decreased
> > to 'replica' once all logical slots are removed, if users still want
> > to execute logical decoding on only the standby. One idea is that if
> > users enable logical decoding using pg_activate_logical_decoding(),
> > the runtime WAL level doesn't decrease to 'replica' even if all
> > logical slots are removed.
> >
>
> That makes sense. If we are using an API like
> pg_activate_*/pg_deactivate_*, then why add an additional dependency
> on the slots?

I thought that we need to remember how logical decoding got enabled
because otherwise, even if we enable logical decoding using the API,
the WAL level drops back to 'replica' once all logical slots get
removed. So the idea I mentioned above is that we somehow prevent
logical decoding from being disabled even if all logical slots are
removed. If we're using only these APIs to enable/disable logical
decoding, we don't need to add a dependency on the slots, although we
probably want to disallow disabling logical decoding if there is at
least one active logical slot.
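
The API-only variant could look roughly like this. It is a Python model
with hypothetical names (LogicalDecodingSwitch and its methods stand in
for pg_activate_logical_decoding() / pg_deactivate_logical_decoding()),
and for simplicity it treats every existing logical slot as "active".

```python
class LogicalDecodingSwitch:
    """Hypothetical model: only the API toggles logical decoding."""

    def __init__(self):
        self.active = False
        self.logical_slots = set()

    def activate(self):
        self.active = True

    def create_slot(self, name):
        if not self.active:
            raise RuntimeError("logical decoding is not activated")
        self.logical_slots.add(name)

    def drop_slot(self, name):
        # Dropping a slot never changes self.active: no dependency on slots.
        self.logical_slots.discard(name)

    def deactivate(self):
        # Guard: refuse to disable while a logical slot still exists.
        if self.logical_slots:
            raise RuntimeError("cannot deactivate: logical slots exist")
        self.active = False
```

Here dropping the last slot leaves logical decoding enabled, and the
only coupling to slots is the check in deactivate().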

Regards,

[1] https://www.postgresql.org/message-id/CAExHW5tyJrdjqKFQ%2BqDs8Yq3E_P1Fj_T4pwVW9WACmMznRtDuw%40mail.gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Thu, May 8, 2025 at 1:06 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, May 6, 2025 at 11:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Apr 24, 2025 at 11:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Apr 24, 2025 at 5:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Wed, Apr 23, 2025 at 9:35 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > On Wed, Apr 23, 2025 at 5:46 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > >
> > > > > > BTW, did we consider the idea to automatically transition to 'logical'
> > > > > > when the first logical slot is created and transition back to
> > > > > > 'replica' when last logical slot gets dropped? I see some ideas around
> > > > > > this last time we discussed this topic.
> > > > >
> > > > > Yes. Bertrand pointed out that a drawback is that the primary server
> > > > > needs to create a logical slot in order to execute logical decoding on
> > > > > the standbys[1].
> > > > >
> > > >
> > > > True, but if we want to avoid that, we can still keep 'logical' as
> > > > wal_level for the ease of users.
> > >
> > > I think we'd like to cover the use case like where users start with
> > > 'replica' on the primary and execute logical decoding on the standby
> > > without neither creating a logical slot on the primary nor restarting
> > > the primary.
> > >
> >
> > Okay, if we introduce a SIGHUP GUC like max_wal_level as you are
> > proposing, the above requirement will be fulfilled, right?
>
> Right. Both the primary and the standby can increase WAL level to
> 'logical' without server restart nor creating a logical slot.
>
> > The other
> > way is by API pg_activate_logical_decoding().
>
> Yes. This approach would be simpler than the current proposal as we
> don't need other new infrastructure such as executing a task in the
> background.
>

Right, but to an extent, this is also similar to having a requirement
of a logical slot on the primary. Now, it seems to me that the point
you are trying to make is that to allow logical decoding on standby,
it is okay to ask users to use pg_activate_logical_decoding() on
primary, but it would be inconvenient to ask them to have a logical
slot on primary instead. If my understanding is correct, then why do
you think so? We recommend that users have a physical slot on primary
and use it via primary_slot_name on standby to control resource
removal, so why can't we ask them to have a logical slot on primary to
allow logical decoding on standby?


> However, we might want to note that wal_level value would
> no longer show the actual runtime WAL level if the logical decoding is
> activated via this API. Probably it's better to introduce a read-only
> GUC, say runtime_wal_level, showing the actual WAL level.
>

Yeah, we need some way to show the correct value. In one of the
previous emails on this thread, you mentioned that we can use
show_hook to show the correct value. I see that show_in_hot_standby()
uses the in-memory value to show the correct value. Do you have
something like that in mind?

BTW, what is your idea for preserving the state that allows logical
decoding across a server restart when the user uses the API? Do we want
to persist the state in some way, and if so, how? OTOH, if we use the
idea of having a logical slot to allow decoding, then the presence of a
logical slot can tell us whether we need to enable the new state to
allow logical decoding after restart.

> Also,
> Ashutosh pointed out[1] before that cloud providers do not like
> multiple ways of changing configuration esp. when they can not control
> it. But I'm not sure this applies to the API as it's a SQL function
> whose access privilege can be controlled.
>

By multiple ways, do we mean to say that one way for users would be to
use the existing way (change wal_level to logical and restart the
server), and the other way would be to use the new API (or have a
logical slot)? But don't users similarly have multiple ways to retain
WAL for standby servers (either by using wal_keep_size or by having a
primary_slot_name)? The other example is that one can either manually
change the postgresql.conf file or use ALTER SYSTEM to change it, and
then reload the config or restart the server for the change to take
effect. There could be other similar examples as well if one tries to
list all such possibilities. I feel one should be concerned if we are
trying to both make the wal_level GUC SIGHUP and also provide an API to
enable logical decoding.

> > >
> >
> > That makes sense. If we are using an API like
> > pg_activate_*/pg_deactivate_*, then why add an additional dependency
> > on the slots?
>
> I thought that we need to remember how logical decoding got enabled
> because otherwise even if we enable logical decoding using the API,
> it's disabled to 'replica' if all logical slots get removed. So the
> idea I mentioned above is that we somehow prevent logical decoding
> from being disabled even if all logical slots are removed. If we're
> using only these APIs to enable/disable logical decoding, we don't
> need to add a dependency on the slots, although we probably want to
> disallow disabling logical decoding if there is at least one active
> logical slot.
>

Yeah, this is a detail that should be discussed once we finalize the
API to enable logical decoding on both primary and standby without
restarting the primary server.

--
With Regards,
Amit Kapila.



On Sat, May 10, 2025 at 12:00 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, May 8, 2025 at 1:06 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, May 6, 2025 at 11:59 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Apr 24, 2025 at 11:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Thu, Apr 24, 2025 at 5:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > On Wed, Apr 23, 2025 at 9:35 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > On Wed, Apr 23, 2025 at 5:46 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > > > >
> > > > > > > BTW, did we consider the idea to automatically transition to 'logical'
> > > > > > > when the first logical slot is created and transition back to
> > > > > > > 'replica' when last logical slot gets dropped? I see some ideas around
> > > > > > > this last time we discussed this topic.
> > > > > >
> > > > > > Yes. Bertrand pointed out that a drawback is that the primary server
> > > > > > needs to create a logical slot in order to execute logical decoding on
> > > > > > the standbys[1].
> > > > > >
> > > > >
> > > > > True, but if we want to avoid that, we can still keep 'logical' as
> > > > > wal_level for the ease of users.
> > > >
> > > > I think we'd like to cover the use case like where users start with
> > > > 'replica' on the primary and execute logical decoding on the standby
> > > > without neither creating a logical slot on the primary nor restarting
> > > > the primary.
> > > >
> > >
> > > Okay, if we introduce a SIGHUP GUC like max_wal_level as you are
> > > proposing, the above requirement will be fulfilled, right?
> >
> > Right. Both the primary and the standby can increase WAL level to
> > 'logical' without server restart nor creating a logical slot.
> >
> > > The other
> > > way is by API pg_activate_logical_decoding().
> >
> > Yes. This approach would be simpler than the current proposal as we
> > don't need other new infrastructure such as executing a task in the
> > background.
> >
>
> Right, but to an extent, this is also similar to having a requirement
> of a logical slot on the primary. Now, it seems to me that the point
> you are trying to make is that to allow logical decoding on standby,
> it is okay to ask users to use pg_activate_logical_decoding() on
> primary, but it would be inconvenient to ask them to have a logical
> slot on primary instead. If my understanding is correct, then why do
> you think so? We recommend that users have a physical slot on primary
> and use it via primary_slot_name on standby to control resource
> removal, so why can't we ask them to have a logical slot on primary to
> allow logical decoding on standby?

I was thinking of a simple use case where users do logical decoding
from the physical standby. That is, the primary has a physical slot
and the standby uses it via primary_slot_name, and the subscriber
connects to the standby server for logical replication with a logical
slot on the standby. In this case, IIUC, we would need to require users
to create a logical slot on the primary just to increase the WAL level
to 'logical', which doesn't make sense to me. No one is going to use
this logical slot, and the primary ends up accumulating WAL.

> > However, we might want to note that wal_level value would
> > no longer show the actual runtime WAL level if the logical decoding is
> > activated via this API. Probably it's better to introduce a read-only
> > GUC, say runtime_wal_level, showing the actual WAL level.
> >
>
> Yeah, we need some way to show the correct value. In one of the
> previous emails on this thread, you mentioned that we can use
> show_hook to show the correct value. I see that show_in_hot_standby()
> uses in_memory value to show the correct value. Do you have something
> like that in your mind?

Yes.

> BTW, what is your idea to preserve the state to allow logical decoding
> across server restart when the user uses the API, do we want to
> persist the state in some way, if so, how? OTOH, if we use the idea to
> have a logical slot to allow decoding, then the presence of a logical
> slot can tell us whether we need to enable the new state to allow
> logical decoding after restart.

I vaguely thought of storing such information in the control file or a
checkpoint record along with the wal_level value. But I've not seriously
considered how to implement this idea as I don't think it's a good
user interface.

>
> > Also,
> > Ashutosh pointed out[1] before that cloud providers do not like
> > multiple ways of changing configuration esp. when they can not control
> > it. But I'm not sure this applies to the API as it's a SQL function
> > whose access privilege can be controlled.
> >
>
> By multiple ways, do we mean to say that one way for users would be to
> use the existing way (change wal_level to logical and restart server),
> and the other way would be to use the new API (or have a logical
> slot)?

Yes, that's my understanding of his comment.

> But won't similarly users have multiple ways to retain WAL for
> standby servers (either by using wal_keep_size or by having a
> primary_slot_name). The other example is that one can either manually
> change postgresql.conf file or use ALTER SYSTEM to change it, and then
> reload the config or restart the server for the change to take effect.
> There could be other similar examples as well if one tries to list all
> such possibilities.

True. I think allow_alter_system was introduced for users who don't
want to let their end users change the configuration via ALTER SYSTEM
command. Since the new API we're considering is an SQL function we
already have a way to control its access for such users.

> I feel one should be concerned if we are trying to
> make both wal_level GUC as SIGHUP, and also try to provide an API to
> enable logical decoding.

Agreed.

>
> > > >
> > >
> > > That makes sense. If we are using an API like
> > > pg_activate_*/pg_deactivate_*, then why add an additional dependency
> > > on the slots?
> >
> > I thought that we need to remember how logical decoding got enabled
> > because otherwise even if we enable logical decoding using the API,
> > it's disabled to 'replica' if all logical slots get removed. So the
> > idea I mentioned above is that we somehow prevent logical decoding
> > from being disabled even if all logical slots are removed. If we're
> > using only these APIs to enable/disable logical decoding, we don't
> > need to add a dependency on the slots, although we probably want to
> > disallow disabling logical decoding if there is at least one active
> > logical slot.
> >
>
> Yeah, this is a detail that should be discussed once we finalize the
> API to enable logical decoding on both primary and standby without
> restarting the primary server.

Agreed.

Another approach we might need to consider is to convert wal_level to
a SIGHUP parameter. While I mentioned that supporting all combinations
of wal_level value changes might not be worth the complexity, I think
this is the most straightforward approach and interface. So it might
be worth trying to implement this approach to figure out the actual
complexity.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Sat, May 10, 2025 at 1:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, May 10, 2025 at 12:00 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Right, but to an extent, this is also similar to having a requirement
> > of a logical slot on the primary. Now, it seems to me that the point
> > you are trying to make is that to allow logical decoding on standby,
> > it is okay to ask users to use pg_activate_logical_decoding() on
> > primary, but it would be inconvenient to ask them to have a logical
> > slot on primary instead. If my understanding is correct, then why do
> > you think so? We recommend that users have a physical slot on primary
> > and use it via primary_slot_name on standby to control resource
> > removal, so why can't we ask them to have a logical slot on primary to
> > allow logical decoding on standby?
>
> I was thinking of a simple use case where users do logical decoding
> from the physical standby. That is, the primary has a physical slot
> and the standby uses it via primary_slot_name, and the subscriber
> connects the standby server for logical replication with a logical
> slot on the standby. In this case, IIUC we need to require users to
> create a logical slot on the primary in order just to increase WAL
> level to 'logical', but it doesn't make sense to me. No one is going
> to use this logical slot and the primary ends up accumulating WALs.
>

Can we have a parameter like immediately_reserve in
create_logical_slot API, similar to what we have for physical slots?
We need to work out the details, but that should address the kind of
use case you are worried about, unless I am missing something.

--
With Regards,
Amit Kapila.



On Sat, May 10, 2025 at 7:08 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, May 10, 2025 at 1:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Sat, May 10, 2025 at 12:00 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Right, but to an extent, this is also similar to having a requirement
> > > of a logical slot on the primary. Now, it seems to me that the point
> > > you are trying to make is that to allow logical decoding on standby,
> > > it is okay to ask users to use pg_activate_logical_decoding() on
> > > primary, but it would be inconvenient to ask them to have a logical
> > > slot on primary instead. If my understanding is correct, then why do
> > > you think so? We recommend that users have a physical slot on primary
> > > and use it via primary_slot_name on standby to control resource
> > > removal, so why can't we ask them to have a logical slot on primary to
> > > allow logical decoding on standby?
> >
> > I was thinking of a simple use case where users do logical decoding
> > from the physical standby. That is, the primary has a physical slot
> > and the standby uses it via primary_slot_name, and the subscriber
> > connects the standby server for logical replication with a logical
> > slot on the standby. In this case, IIUC we need to require users to
> > create a logical slot on the primary in order just to increase WAL
> > level to 'logical', but it doesn't make sense to me. No one is going
> > to use this logical slot and the primary ends up accumulating WALs.
> >
>
> Can we have a parameter like immediately_reserve in
> create_logical_slot API, similar to what we have for physical slots?
> We need to work out the details, but that should address the kind of
> use case you are worried about, unless I am missing something.

Interesting idea. One concern in my mind is that in the use case I
mentioned above, users would need to carefully manage the extra
logical slot to keep the logical decoding active. The logical decoding
is deactivated on the standby as soon as users drop all logical slots
on the primary.

Also, with this idea of automatically increasing WAL level, do we want
to keep the 'logical' WAL level? If so, it requires an extra step of
creating a non-reserved logical slot on the primary in order for the
standby to activate the logical decoding. On the other hand, we can
also keep the 'logical' WAL level for the compatibility and for making
the logical decoding enabled without the coordination of WAL level
transition. But wal_level GUC parameter would no longer tell the
actual WAL level to users when 'replica' + logical slots. Is it
sufficient to provide a read-only GUC parameter, say
effective_wal_level showing the actual WAL level being used?
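
As a rough illustration of what such a read-only parameter could report, here is a minimal Python sketch. It is only a toy model of the rule discussed above ('replica' plus at least one logical slot behaves as 'logical'). The helper name effective_wal_level() and its slot-count argument are assumptions made for illustration and are not taken from the patch.

```python
from enum import IntEnum


class WalLevel(IntEnum):
    """The three wal_level settings, ordered by how much they log."""
    MINIMAL = 0
    REPLICA = 1
    LOGICAL = 2


def effective_wal_level(configured: WalLevel, logical_slots: int) -> WalLevel:
    """Toy model of the WAL level actually in use (hypothetical helper)."""
    if logical_slots < 0:
        raise ValueError("logical_slots must be non-negative")
    # Assumption from the discussion: with wal_level = 'replica', the
    # presence of at least one logical slot raises the level to 'logical'.
    if configured is WalLevel.REPLICA and logical_slots > 0:
        return WalLevel.LOGICAL
    # 'minimal' and 'logical' are reported unchanged in this model.
    return configured


if __name__ == "__main__":
    for slots in (0, 1):
        level = effective_wal_level(WalLevel.REPLICA, slots)
        print(f"wal_level=replica, logical slots={slots} -> {level.name.lower()}")
```

In this model the configured value and the reported value differ only in the 'replica' + logical slots case, which is exactly the case where wal_level alone would mislead users.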

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Sun, May 18, 2025 at 1:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, May 10, 2025 at 7:08 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, May 10, 2025 at 1:05 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Sat, May 10, 2025 at 12:00 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > Right, but to an extent, this is also similar to having a requirement
> > > > of a logical slot on the primary. Now, it seems to me that the point
> > > > you are trying to make is that to allow logical decoding on standby,
> > > > it is okay to ask users to use pg_activate_logical_decoding() on
> > > > primary, but it would be inconvenient to ask them to have a logical
> > > > slot on primary instead. If my understanding is correct, then why do
> > > > you think so? We recommend that users have a physical slot on primary
> > > > and use it via primary_slot_name on standby to control resource
> > > > removal, so why can't we ask them to have a logical slot on primary to
> > > > allow logical decoding on standby?
> > >
> > > I was thinking of a simple use case where users do logical decoding
> > > from the physical standby. That is, the primary has a physical slot
> > > and the standby uses it via primary_slot_name, and the subscriber
> > > connects the standby server for logical replication with a logical
> > > slot on the standby. In this case, IIUC we need to require users to
> > > create a logical slot on the primary in order just to increase WAL
> > > level to 'logical', but it doesn't make sense to me. No one is going
> > > to use this logical slot and the primary ends up accumulating WALs.
> > >
> >
> > Can we have a parameter like immediately_reserve in
> > create_logical_slot API, similar to what we have for physical slots?
> > We need to work out the details, but that should address the kind of
> > use case you are worried about, unless I am missing something.
>
> Interesting idea. One concern in my mind is that in the use case I
> mentioned above, users would need to carefully manage the extra
> logical slot to keep the logical decoding active. The logical decoding
> is deactivated on the standby as soon as users drop all logical slots
> on the primary.
>
> Also, with this idea of automatically increasing WAL level, do we want
> to keep the 'logical' WAL level? If so, it requires an extra step of
> creating a non-reserved logical slot on the primary in order for the
> standby to activate the logical decoding. On the other hand, we can
> also keep the 'logical' WAL level for the compatibility and for making
> the logical decoding enabled without the coordination of WAL level
> transition. But wal_level GUC parameter would no longer tell the
> actual WAL level to users when 'replica' + logical slots. Is it
> sufficient to provide a read-only GUC parameter, say
> effective_wal_level showing the actual WAL level being used?
>

Thanks for proposing the idea of making wal_level configurable at
runtime. But why isn't making the relevant GUCs SIGHUP-reloadable
sufficient?

For enabling logical replication, users are already familiar with the
wal_level and max_wal_senders settings. The main issue is that
changing them currently requires a server restart. If we can address
that by making the GUCs reloadable via SIGHUP, that might be enough.

On the other hand, if the goal is to make the behavior fully dynamic,
then we should go all the way and decouple it from wal_level. For
example, we could start logging the extra WAL needed for logical
decoding as soon as a logical slot is created, and stop once all
logical slots are dropped, even if wal_level is still set to logical.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



On Sun, May 18, 2025 at 1:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sat, May 10, 2025 at 7:08 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > Can we have a parameter like immediately_reserve in
> > create_logical_slot API, similar to what we have for physical slots?
> > We need to work out the details, but that should address the kind of
> > use case you are worried about, unless I am missing something.
>
> Interesting idea. One concern in my mind is that in the use case I
> mentioned above, users would need to carefully manage the extra
> logical slot to keep the logical decoding active. The logical decoding
> is deactivated on the standby as soon as users drop all logical slots
> on the primary.
>

Yes, but the same is true for a physical slot in the case of physical
replication used via primary_slot_name parameter.

> Also, with this idea of automatically increasing WAL level, do we want
> to keep the 'logical' WAL level? If so, it requires an extra step of
> creating a non-reserved logical slot on the primary in order for the
> standby to activate the logical decoding. On the other hand, we can
> also keep the 'logical' WAL level for the compatibility and for making
> the logical decoding enabled without the coordination of WAL level
> transition.

Right, I also feel we should retain both ways to enable logical
replication at least initially. Once we get some feedback, we may
think of removing 'logical' as wal_level.

>  But wal_level GUC parameter would no longer tell the
> actual WAL level to users when 'replica' + logical slots.
>

Right.

> Is it
> sufficient to provide a read-only GUC parameter, say
> effective_wal_level showing the actual WAL level being used?
>

I am not so sure about how we want to communicate this to the user,
but I guess to start with, this is a good idea.

--
With Regards,
Amit Kapila.



On Sun, May 18, 2025 at 4:06 PM Dilip Kumar <dilipbalaut@gmail.com> wrote:
>
> Thanks for proposing the idea of making wal_level configurable at
> runtime. But why isn't making the relevant GUCs SIGHUP-reloadable
> sufficient?
>
> For enabling logical replication, users are already familiar with the
> wal_level and max_wal_senders settings. The main issue is that
> changing them currently requires a server restart. If we can address
> that by making the GUCs reloadable via SIGHUP, that might be enough.
>

Sure, but the challenges are huge. Consider cases where one wants to
change wal_level from 'logical' to 'minimal'. See some analysis of
this in email [1]. I think it would be a much bigger and more
challenging project with diminishing returns compared to the
alternative we are discussing.

> On the other hand, if the goal is to make the behavior fully dynamic,
> then we should go all the way, decouple it from wal_level. For
> example, we could start logging the extra WAL needed for logical
> decoding as soon as a logical slot is created, and stop once all
> logical slots are dropped, even if wal_level is still set to logical.
>

Yeah, this is one thing that is still under consideration. It is
almost equivalent to removing wal_level as 'logical', which sounds
like a compatibility break, and even if we want to do that, it is
better to attempt that after the base version is committed and we get
a broader consensus on the same.

[1]- https://www.postgresql.org/message-id/CAD21AoAA%3DzuiajwXgXSCYQWo%3D6oY-%3DCGLaEqvpfNUTVsLen%2BCA%40mail.gmail.com

--
With Regards,
Amit Kapila.



On Mon, May 19, 2025 at 2:05 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sun, May 18, 2025 at 1:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Sat, May 10, 2025 at 7:08 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > Can we have a parameter like immediately_reserve in
> > > create_logical_slot API, similar to what we have for physical slots?
> > > We need to work out the details, but that should address the kind of
> > > use case you are worried about, unless I am missing something.
> >
> > Interesting idea. One concern in my mind is that in the use case I
> > mentioned above, users would need to carefully manage the extra
> > logical slot to keep the logical decoding active. The logical decoding
> > is deactivated on the standby as soon as users drop all logical slots
> > on the primary.
> >
>
> Yes, but the same is true for a physical slot in the case of physical
> replication used via primary_slot_name parameter.

Could you elaborate on this? IIUC the purpose of using a physical slot
in a physical replication case is obvious; users don't want to lose
WAL files necessary for replication. On the other hand, this empty
logical slot needs to be maintained just for keeping the logical
decoding active.

>
> > Also, with this idea of automatically increasing WAL level, do we want
> > to keep the 'logical' WAL level? If so, it requires an extra step of
> > creating a non-reserved logical slot on the primary in order for the
> > standby to activate the logical decoding. On the other hand, we can
> > also keep the 'logical' WAL level for the compatibility and for making
> > the logical decoding enabled without the coordination of WAL level
> > transition.
>
> Right, I also feel we should retain both ways to enable logical
> replication at least initially. Once we get some feedback, we may
> think of removing 'logical' as wal_level.
>
> >  But wal_level GUC parameter would no longer tell the
> > actual WAL level to users when 'replica' + logical slots.
> >
>
> Right.
>
> > Is it
> > sufficient to provide a read-only GUC parameter, say
> > effective_wal_level showing the actual WAL level being used?
> >
>
> I am not so sure about how we want to communicate this to the user,
> but I guess to start with, this is a good idea.

I recently had a discussion with Ashutosh at PGConf.dev regarding an
alternative approach: introducing a new command syntax such as "ALTER
SYSTEM UPDATE wal_level TO 'logical'". In his presentation[1], he
outlined this proposed command as a means to modify specific GUC
parameters synchronously. The backend executing this command would
manage the transition, allowing users to interrupt the process via
Ctrl-C if necessary. In the specific context of wal_level change, this
command could be designed to reject operations like "ALTER SYSTEM
UPDATE wal_level TO 'minimal'" with an error, effectively preventing
undesirable wal_level transitions to or from 'minimal'. While this
approach shares similarities with our previous proposal of
implementing a dedicated SQL function for WAL level modifications, it
offers a more standardized interface for users.
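
To make the rejection rule concrete, here is a small Python sketch of the check such a command could perform before starting the transition. It is a toy model under stated assumptions: the function and exception names are hypothetical, and the only rule encoded is the one described above (no online transition to or from 'minimal').

```python
VALID_LEVELS = ("minimal", "replica", "logical")


class WalLevelChangeRejected(Exception):
    """Raised when the requested online transition is not allowed."""


def check_online_wal_level_change(current: str, target: str) -> str:
    """Return the target level if the online change is allowed (toy model)."""
    for level in (current, target):
        if level not in VALID_LEVELS:
            raise ValueError(f"invalid wal_level: {level!r}")
    # Rule taken from the discussion: transitions to or from 'minimal'
    # are rejected with an error instead of being attempted online.
    if "minimal" in (current, target) and current != target:
        raise WalLevelChangeRejected(
            f"cannot change wal_level from '{current}' to '{target}' online"
        )
    return target


if __name__ == "__main__":
    print(check_online_wal_level_change("replica", "logical"))
    try:
        check_online_wal_level_change("replica", "minimal")
    except WalLevelChangeRejected as exc:
        print(f"ERROR: {exc}")
```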

Though I find merit in this proposal, I remain uncertain about its
implementation details and whether it represents the optimal solution
for online wal_level changes, particularly given that our current
approach of automatic WAL level adjustment appears viable. Ashutosh
plans to initiate a separate discussion thread where we can explore
these considerations in greater detail.

Regards,

[1] https://www.pgevents.ca/events/pgconfdev2025/schedule/session/286-changing-shared_buffers-on-the-fly/

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Wed, May 21, 2025 at 12:45 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, May 19, 2025 at 2:05 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sun, May 18, 2025 at 1:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Sat, May 10, 2025 at 7:08 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > Can we have a parameter like immediately_reserve in
> > > > create_logical_slot API, similar to what we have for physical slots?
> > > > We need to work out the details, but that should address the kind of
> > > > use case you are worried about, unless I am missing something.
> > >
> > > Interesting idea. One concern in my mind is that in the use case I
> > > mentioned above, users would need to carefully manage the extra
> > > logical slot to keep the logical decoding active. The logical decoding
> > > is deactivated on the standby as soon as users drop all logical slots
> > > on the primary.
> > >
> >
> > Yes, but the same is true for a physical slot in the case of physical
> > replication used via primary_slot_name parameter.
>
> Could you elaborate on this?
>

I am trying to correlate this with the case where the standby no longer
needs the physical slot for some reason, such as a standby machine
failure, or someone using pg_createsubscriber on the standby to make it
a subscriber. In such a case, the user needs to manually remove the
physical slot on the primary. The two cases differ, but the point is
that one may need to manage a physical slot as well.

>
> I recently had a discussion with Ashutosh at PGConf.dev regarding an
> alternative approach: introducing a new command syntax such as "ALTER
> SYSTEM UPDATE wal_level TO 'logical'". In his presentation[1], he
> outlined this proposed command as a means to modify specific GUC
> parameters synchronously. The backend executing this command would
> manage the transition, allowing users to interrupt the process via
> Ctrl-C if necessary. In the specific context of wal_level change, this
> command could be designed to reject operations like "ALTER SYSTEM
> UPDATE wal_level TO 'minimal'" with an error, effectively preventing
> undesirable wal_level transitions to or from 'minimal'. While this
> approach shares similarities with our previous proposal of
> implementing a dedicated SQL function for WAL level modifications, it
> offers a more standardized interface for users.
>
> Though I find merit in this proposal, I remain uncertain about its
> implementation details and whether it represents the optimal solution
> for online wal_level changes, particularly given that our current
> approach of automatic WAL level adjustment appears viable.
>

Yeah, I find the idea that the presence of a logical slot will allow
the user to enable logical decoding/replication more appealing than
this new alternative, leaving aside the challenges of realizing it.

--
With Regards,
Amit Kapila.



On Tue, May 20, 2025 at 9:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, May 21, 2025 at 12:45 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, May 19, 2025 at 2:05 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Sun, May 18, 2025 at 1:09 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Sat, May 10, 2025 at 7:08 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > >
> > > > > Can we have a parameter like immediately_reserve in
> > > > > create_logical_slot API, similar to what we have for physical slots?
> > > > > We need to work out the details, but that should address the kind of
> > > > > use case you are worried about, unless I am missing something.
> > > >
> > > > Interesting idea. One concern in my mind is that in the use case I
> > > > mentioned above, users would need to carefully manage the extra
> > > > logical slot to keep the logical decoding active. The logical decoding
> > > > is deactivated on the standby as soon as users drop all logical slots
> > > > on the primary.
> > > >
> > >
> > > Yes, but the same is true for a physical slot in the case of physical
> > > replication used via primary_slot_name parameter.
> >
> > Could you elaborate on this?
> >
>
> I am trying to correlate with the case where standby no longer needs
> physical slot due to some reason like the standby machine failure, or
> say someone uses pg_createsubscriber on standby to make it subscriber,
> etc. In such a case, user needs to manually remove the physical slot
> on primary. There is difference in both cases but the point is one may
> need to manage physical slot as well.

Thank you for clarifying this. I see your point.

> >
> > I recently had a discussion with Ashutosh at PGConf.dev regarding an
> > alternative approach: introducing a new command syntax such as "ALTER
> > SYSTEM UPDATE wal_level TO 'logical'". In his presentation[1], he
> > outlined this proposed command as a means to modify specific GUC
> > parameters synchronously. The backend executing this command would
> > manage the transition, allowing users to interrupt the process via
> > Ctrl-C if necessary. In the specific context of wal_level change, this
> > command could be designed to reject operations like "ALTER SYSTEM
> > UPDATE wal_level TO 'minimal'" with an error, effectively preventing
> > undesirable wal_level transitions to or from 'minimal'. While this
> > approach shares similarities with our previous proposal of
> > implementing a dedicated SQL function for WAL level modifications, it
> > offers a more standardized interface for users.
> >
> > Though I find merit in this proposal, I remain uncertain about its
> > implementation details and whether it represents the optimal solution
> > for online wal_level changes, particularly given that our current
> > approach of automatic WAL level adjustment appears viable.
> >
>
> Yeah, I find the idea that the presence of a logical slot will allow
> the user to enable logical decoding/replication more appealing than
> this new alternative, leaving aside the challenges of realizing it.

I've drafted this idea. Here is a summary of the two attached patches:

0001 patch allows us to create a logical slot without WAL reservation.

0002 patch is the main patch for dynamically enabling/disabling
logical decoding when wal_level is 'replica'. It's in PoC state and
has a lot of XXX comments. One thing I think we need to consider is
that disabling logical decoding needs to write a WAL record for
standbys, and it happens when the last logical slot is dropped. It's
therefore possible that we write a WAL record during process exit
(e.g., ReplicationSlotRelease() and ReplicationSlotCleanup() are called
by ReplicationSlotShmemExit()). It might be safe as long as we do that
in a before_shmem_exit callback, but I'm not sure whether there is a
chance of doing that in an on_shmem_exit callback. It would be better
to somehow disable logical decoding lazily.
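
One way to picture the "lazy disable" idea is the following Python sketch. It is not how the PoC patch works: every name here, including the WAL record labels, is hypothetical. The only point it illustrates is that dropping the last slot merely sets a flag, and the WAL record is written later from a context where that is assumed to be safe.

```python
from dataclasses import dataclass, field


@dataclass
class DecodingState:
    """Toy model of a server with wal_level = 'replica' (all names hypothetical)."""
    logical_slots: set = field(default_factory=set)
    decoding_enabled: bool = False
    disable_pending: bool = False
    wal: list = field(default_factory=list)

    def create_slot(self, name: str) -> None:
        self.logical_slots.add(name)
        self.disable_pending = False  # a new slot cancels a pending disable
        if not self.decoding_enabled:
            self.decoding_enabled = True
            self.wal.append("LOGICAL_DECODING_ENABLED")

    def drop_slot(self, name: str) -> None:
        """May run during process exit, so it only sets a flag and writes no WAL."""
        self.logical_slots.discard(name)
        if not self.logical_slots and self.decoding_enabled:
            self.disable_pending = True

    def process_pending_disable(self) -> None:
        """Runs later, in a context where writing WAL is assumed to be safe."""
        if self.disable_pending and not self.logical_slots:
            self.decoding_enabled = False
            self.wal.append("LOGICAL_DECODING_DISABLED")
        self.disable_pending = False


if __name__ == "__main__":
    state = DecodingState()
    state.create_slot("s1")
    state.drop_slot("s1")            # no WAL written here
    state.process_pending_disable()  # the disable record is written here
    print(state.wal)
```

The re-check of logical_slots in process_pending_disable() matters: a slot created between the drop and the deferred step must keep logical decoding enabled.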

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments



On Wed, Jun 4, 2025 at 6:41 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, May 20, 2025 at 9:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > Yeah, I find the idea that the presence of a logical slot will allow
> > the user to enable logical decoding/replication more appealing than
> > this new alternative, leaving aside the challenges of realizing it.

+1. This idea appears more user-friendly and easier to understand
compared to other approaches, such as having multiple GUCs or using
ALTER SYSTEM.

> I've drafted this idea. Here is a summary of the two attached patches:
>
> 0001 patch allows us to create a logical slot without WAL reservation.
>
> 0002 patch is the main patch for dynamically enabling/disabling
> logical decoding when wal_level is 'replica'.

Thank you for the patches. I have done some initial testing, and it
seems to be working well. I will do more testing and review, and will
share further feedback.

thanks
Shveta



On Wed, Jun 4, 2025 at 3:40 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, Jun 4, 2025 at 6:41 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, May 20, 2025 at 9:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > Yeah, I find the idea that the presence of a logical slot will allow
> > > the user to enable logical decoding/replication more appealing than
> > > this new alternative, leaving aside the challenges of realizing it.
>
> +1. This idea appears more user-friendly and easier to understand
> compared to other approaches, such as having multiple GUCs or using
> ALTER SYSTEM.
>
> > I've drafted this idea. Here is a summary of the two attached patches:
> >
> > 0001 patch allows us to create a logical slot without WAL reservation.
> >
> > 0002 patch is the main patch for dynamically enabling/disabling
> > logical decoding when wal_level is 'replica'.
>
> Thank You for the patches. I have done some initial testing, it seems
> to be working well. I will do more testing and review and will share
> further feedback.

I reviewed further and had a few concerns:

1)
We now invalidate slots on standby if the primary (with
wal_level=replica) has dropped the last logical slot and internally
reverted its runtime (effective) wal_level back to replica. Consider
the following scenario involving a cascaded logical replication setup:

a) The publisher is configured with wal_level = replica and has
created a publication (pub1).
b) A subscriber server creates a subscription (sub1) to pub1. As part
of the slot creation for sub1, the publisher's effective wal_level is
switched to logical.
c) The publisher also has a physical standby, which in turn has its
own logical subscriber, named standby_sub1.

At this point, everything works as expected i.e. changes from the
publisher flow through the physical standby and are replicated to
standby_sub1. Now if the user drops sub1, the replication slot on the
primary is also dropped. Since this was the last logical slot, the
primary automatically switches its effective wal_level back to
replica. This change propagates to the standby, causing it to
invalidate the slot for standby_sub1. As a result, the standby logs
the following error:

STATEMENT:  START_REPLICATION SLOT "standby_sub1" LOGICAL 0/0 (...)
ERROR:  logical decoding needs to be enabled on the primary

Even if we manually recreate a logical slot on the primary afterward,
the standby_sub1 subscriber is not able to proceed:
ERROR:  can no longer access replication slot "standby_sub1"
DETAIL:  This replication slot has been invalidated due to
"wal_level_insufficient".

So the removal of the publisher's logical subscriber has somehow
prevented the standby's logical subscriber from working. Is this
behaviour acceptable?

Without this feature, if I manually switch wal_level back to replica
on the primary, it will fail to start. This makes the issue obvious
and prevents misconfiguration.
FATAL:  logical replication slot "sub2" exists, but "wal_level" < "logical"
HINT:  Change "wal_level" to be "logical" or higher.

But the current behaviour is harder to diagnose, as the problem is
effectively hidden behind subscription/slot creation/deletion.

2)
'show effective_wal_level' shows output as 'logical' if a slot exists
on primary. But on physical standby, it still shows it as 'replica'
even in the presence of slots. Is this intentional?

3)
I haven’t tested this yet, but I’d like to discuss what the expected
behavior should be if a slot exists on the primary but is marked as
invalidated. Will an invalidated slot still cause the effective
wal_level to remain at logical, or will invalidating the only logical
slot trigger a switch back to replica?
There is a chance that a slot with unreserved WAL may be invalidated
due to a timeout.
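
For reference, the scenario in point 1 can be reproduced with a small Python simulation. It is only a toy model of the behaviour I observed, not the patch's code: the class and method names are made up, and the only rules encoded are that dropping the last logical slot on the primary invalidates the standby's logical slots, and that recreating a slot on the primary does not revive them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


class SlotInvalidated(Exception):
    """Raised when replication is started on an invalidated slot."""


@dataclass
class Standby:
    # slot name -> invalidation reason (None means the slot is usable)
    slots: Dict[str, Optional[str]] = field(default_factory=dict)

    def create_logical_slot(self, name: str) -> None:
        self.slots[name] = None

    def on_primary_decoding_disabled(self) -> None:
        # The standby replays the primary's switch back to 'replica'.
        for name in self.slots:
            self.slots[name] = "wal_level_insufficient"

    def start_replication(self, name: str) -> str:
        reason = self.slots[name]
        if reason is not None:
            raise SlotInvalidated(f'slot "{name}" invalidated due to "{reason}"')
        return "streaming"


@dataclass
class Primary:
    standbys: List[Standby] = field(default_factory=list)
    logical_slots: set = field(default_factory=set)

    def effective_wal_level(self) -> str:
        return "logical" if self.logical_slots else "replica"

    def create_logical_slot(self, name: str) -> None:
        self.logical_slots.add(name)

    def drop_logical_slot(self, name: str) -> None:
        self.logical_slots.discard(name)
        if not self.logical_slots:
            # Last logical slot is gone: the change propagates to standbys.
            for standby in self.standbys:
                standby.on_primary_decoding_disabled()


if __name__ == "__main__":
    standby = Standby()
    primary = Primary(standbys=[standby])
    primary.create_logical_slot("sub1")
    standby.create_logical_slot("standby_sub1")
    primary.drop_logical_slot("sub1")    # dropping sub1 removes the last slot
    primary.create_logical_slot("sub2")  # recreating a slot does not help
    try:
        standby.start_replication("standby_sub1")
    except SlotInvalidated as exc:
        print(f"ERROR: {exc}")
```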

thanks
Shveta



On Fri, Jun 6, 2025 at 3:02 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, Jun 4, 2025 at 3:40 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Wed, Jun 4, 2025 at 6:41 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Tue, May 20, 2025 at 9:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > Yeah, I find the idea that the presence of a logical slot will allow
> > > > the user to enable logical decoding/replication more appealing than
> > > > this new alternative, leaving aside the challenges of realizing it.
> >
> > +1. This idea appears more user-friendly and easier to understand
> > compared to other approaches, such as having multiple GUCs or using
> > ALTER SYSTEM.
> >
> > > I've drafted this idea. Here is a summary of the two attached patches:
> > >
> > > 0001 patch allows us to create a logical slot without WAL reservation.
> > >
> > > 0002 patch is the main patch for dynamically enabling/disabling
> > > logical decoding when wal_level is 'replica'.
> >
> > Thank You for the patches. I have done some initial testing, it seems
> > to be working well. I will do more testing and review and will share
> > further feedback.
>
> I reviewed further and had few concerns:

Thank you for reviewing this feature!

>
> 1)
> We now invalidate slots on standby if the primary (with
> wal_level=replica) has dropped the last logical slot and internally
> reverted its runtime (effective) wal_level back to replica. Consider
> the following scenario involving a cascaded logical replication setup:
>
> a) The publisher is configured with wal_level = replica and has
> created a publication (pub1).
> b) A subscriber server creates a subscription (sub1) to pub1. As part
> of the slot creation for sub1, the publisher's effective wal_level is
> switched to logical.
> c) The publisher also has a physical standby, which in turn has its
> own logical subscriber, named standby_sub1.
>
> At this point, everything works as expected i.e. changes from the
> publisher flow through the physical standby and are replicated to
> standby_sub1. Now if the user drops sub1, the replication slot on the
> primary is also dropped. Since this was the last logical slot, the
> primary automatically switches its effective wal_level back to
> replica. This change propagates to the standby, causing it to
> invalidate the slot for standby_sub1. As a result, the standby logs
> the following error:
>
> STATEMENT:  START_REPLICATION SLOT "standby_sub1" LOGICAL 0/0 (...)
> ERROR:  logical decoding needs to be enabled on the primary
>
> Even if we manually recreate a logical slot on the primary afterward,
> the standby_sub1 subscriber is not able to proceed:
> ERROR:  can no longer access replication slot "standby_sub1"
> DETAIL:  This replication slot has been invalidated due to
> "wal_level_insufficient".
>
> So the removal of the logical subscriber for the publisher has somehow
> restricted the logical subscriber of standby to work. Is this
> behaviour acceptable?
>
> Without this feature, if I manually switch back wal_level to replica
> on primary, then it will fail to start. This makes the issue obvious
> and prevents misconfiguration.
> FATAL:  logical replication slot "sub2" exists, but "wal_level" < "logical"
> HINT:  Change "wal_level" to be "logical" or higher.
>
> But the current behaviour is harder to diagnose, as the problem is
> effectively hidden behind subscription/slot creation/deletion.

The most upstream server in the replication configuration would need
to take care to always keep at least one logical slot. One way to keep
effective_wal_level 'logical' on a publisher where wal_level =
'replica' is to have a logical slot without WAL reservation that is
not associated with any subscription. It requires an extra logical
slot but seems workable. Does it resolve this concern?

> 2)
> 'show effective_wal_level' shows output as 'logical' if a slot exists
> on primary. But on physical standby, it still shows it as 'replica'
> even in the presence of slots. Is this intentional?

Yes. I think we should not allow standbys to create a logical slot
as long as they use wal_level = 'replica', because otherwise the
standby would need to invalidate the logical slot at promotion,
which could cause a long downtime in a failover case.

> 3)
> I haven’t tested this yet, but I’d like to discuss what the expected
> behavior should be if a slot exists on the primary but is marked as
> invalidated. Will an invalidated slot still cause the effective
> wal_level to remain at logical, or will invalidating the only logical
> slot trigger a switch back to replica?
> There is a chance that a slot with un-reserved wal may be invalidated
> due to time-out.

Good point. I think we don't need to decrease the effective_wal_level
to 'replica' even if we invalidate all logical slots. We need neither
WAL reservation nor dead tuple retention in order to set
effective_wal_level to 'logical', so I think it's straightforward for
the effective_wal_level value to depend only on the presence of logical
slots. If idle_replication_slot_timeout also affects logical slots
created with immediately_reserve=false, we might want to exclude them
to avoid confusion.
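
The rule described above can be sketched in a few lines of C. This is only an illustrative model, not code from the patch: the type and function names (WalLevel, SlotInfo, effective_wal_level) are hypothetical, and leaving 'minimal' untouched is an assumption based on the thread's focus on wal_level = 'replica'.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; none of these come from the patch. */
typedef enum { LEVEL_MINIMAL, LEVEL_REPLICA, LEVEL_LOGICAL } WalLevel;

typedef struct
{
    bool in_use;       /* slot entry exists */
    bool is_logical;   /* logical (as opposed to physical) slot */
    bool invalidated;  /* slot was invalidated, e.g. by a timeout */
} SlotInfo;

/*
 * Effective level on a primary: the configured level, raised to logical
 * while at least one logical slot exists. The invalidated flag is
 * deliberately ignored, because only the presence of the slot matters.
 */
WalLevel
effective_wal_level(WalLevel configured, const SlotInfo *slots, int nslots)
{
    assert(nslots >= 0);

    /* Assumption: only 'replica' is ever raised; other levels are kept. */
    if (configured != LEVEL_REPLICA)
        return configured;

    for (int i = 0; i < nslots; i++)
    {
        if (slots[i].in_use && slots[i].is_logical)
            return LEVEL_LOGICAL;
    }
    return LEVEL_REPLICA;
}
```

With this model, a primary at 'replica' whose only logical slot has been invalidated still reports 'logical', and it drops back to 'replica' only once the slot itself is gone.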

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Sat, Jun 7, 2025 at 2:44 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Jun 6, 2025 at 3:02 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Wed, Jun 4, 2025 at 3:40 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Wed, Jun 4, 2025 at 6:41 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Tue, May 20, 2025 at 9:54 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > > Yeah, I find the idea that the presence of a logical slot will allow
> > > > > the user to enable logical decoding/replication more appealing than
> > > > > this new alternative, leaving aside the challenges of realizing it.
> > >
> > > +1. This idea appears more user-friendly and easier to understand
> > > compared to other approaches, such as having multiple GUCs or using
> > > ALTER SYSTEM.
> > >
> > > > I've drafted this idea. Here are summary for attached two patches:
> > > >
> > > > 0001 patch allows us to create a logical slot without WAL reservation.
> > > >
> > > > 0002 patch is the main patch for dynamically enabling/disabling
> > > > logical decoding when wal_level is 'replica'.
> > >
> > > Thank You for the patches. I have done some initial testing, it seems
> > > to be working well. I will do more testing and review and will share
> > > further feedback.
> >
> > I reviewed further and had few concerns:
>
> Thank you for reviewing this feature!
>
> >
> > 1)
> > We now invalidate slots on standby if the primary (with
> > wal_level=replica) has dropped the last logical slot and internally
> > reverted its runtime (effective) wal_level back to replica. Consider
> > the following scenario involving a cascaded logical replication setup:
> >
> > a) The publisher is configured with wal_level = replica and has
> > created a publication (pub1).
> > b) A subscriber server creates a subscription (sub1) to pub1. As part
> > of the slot creation for sub1, the publisher's effective wal_level is
> > switched to logical.
> > c) The publisher also has a physical standby, which in turn has its
> > own logical subscriber, named standby_sub1.
> >
> > At this point, everything works as expected i.e. changes from the
> > publisher flow through the physical standby and are replicated to
> > standby_sub1. Now if the user drops sub1, the replication slot on the
> > primary is also dropped. Since this was the last logical slot, the
> > primary automatically switches its effective wal_level back to
> > replica. This change propagates to the standby, causing it to
> > invalidate the slot for standby_sub1. As a result, the standby logs
> > the following error:
> >
> > STATEMENT:  START_REPLICATION SLOT "standby_sub1" LOGICAL 0/0 (...)
> > ERROR:  logical decoding needs to be enabled on the primary
> >
> > Even if we manually recreate a logical slot on the primary afterward,
> > the standby_sub1 subscriber is not able to proceed:
> > ERROR:  can no longer access replication slot "standby_sub1"
> > DETAIL:  This replication slot has been invalidated due to
> > "wal_level_insufficient".
> >
> > So the removal of the logical subscriber for the publisher has somehow
> > restricted the logical subscriber of standby to work. Is this
> > behaviour acceptable?
> >
> > Without this feature, if I manually switch back wal_level to replica
> > on primary, then it will fail to start. This makes the issue obvious
> > and prevents misconfiguration.
> > FATAL:  logical replication slot "sub2" exists, but "wal_level" < "logical"
> > HINT:  Change "wal_level" to be "logical" or higher.
> >
> > But the current behaviour is harder to diagnose, as the problem is
> > effectively hidden behind subscription/slot creation/deletion.
>
> The most upstream server in replication configuration would carefully
> need to keep having at least one logical slot. One way to keep
> effective_wal_level 'logical' on the publisher where wal_level =
> 'replica' is to have a logical slot without WAL reservation that is
> not relevant with any subscriptions. It could require an extra logical
> slot but seems workable. Does it resolve this concern?
>

Yes, I agree that publishers should have a separate slot (not
associated with any subscription) without WAL reservation to keep
effective_wal_level at logical when wal_level is replica. But the
question is how that can be ensured. Will it be the user's responsibility
to always create that slot? If the user already has some subscriptions
subscribing to the most upstream server, then while setting up logical
replication on a physical standby at a later stage, the user will not
even encounter the error:
ERROR: logical decoding needs to be enabled on the primary,
HINT: Set wal_level >= logical or create at least one logical slot on
the primary.

And in the absence of such an error, users may easily end up in the
situation explained above.


> > 2)
> > 'show effective_wal_level' shows output as 'logical' if a slot exists
> > on primary. But on physical standby, it still shows it as 'replica'
> > even in the presence of slots. Is this intentional?
>
> Yes. I think we should disallow the standbys to create a logical slot
> as long as they use wal_level = 'replica', because otherwise the
> standby would need to invalidate the logical slot at a promotion.
> Which could cause a large down time in a failover case.

Do you mean that even if the primary is running with
effective_wal_level=logical, we should disallow slot creation on the
standby if the standby has wal_level=replica? Does it mean the
$subject's enhancement is only valid on the primary?

Or the other way could be that we have 2 trigger points for
switching effective_wal_level to logical on the primary:
1) One is when a logical slot is created on the primary.
2) Another is when a logical slot is created on any of its physical standbys.

We need to track these 2 separately, as dropping the primary's last
slot should not toggle it back to replica while any of its physical
standbys still needs it. But if a publisher has multiple physical
standbys, then it will need extra handling, i.e. the last logical-slot
drop on standby1 should not end up toggling effective_wal_level to
replica when standby2 still has some logical slots. I am trying to
think of a way where we have that extra slot without the user's
intervention.
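
To make the bookkeeping concern concrete, here is a rough C model of the two trigger points. It is only a sketch under assumptions: the struct, the fixed MAX_STANDBYS bound, and the function name are hypothetical, and how standbys would report their slot counts upstream is not addressed at all.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_STANDBYS 4          /* arbitrary bound for this sketch */

/* Hypothetical bookkeeping; not taken from the patch. */
typedef struct
{
    int primary_logical_slots;                  /* trigger point 1 */
    int standby_logical_slots[MAX_STANDBYS];    /* trigger point 2, per standby */
} LogicalSlotDemand;

/*
 * The primary must keep writing logical WAL while either its own slots
 * or the slots of any physical standby still need it.
 */
bool
logical_wal_needed(const LogicalSlotDemand *d)
{
    assert(d->primary_logical_slots >= 0);

    if (d->primary_logical_slots > 0)
        return true;

    for (int i = 0; i < MAX_STANDBYS; i++)
    {
        assert(d->standby_logical_slots[i] >= 0);
        if (d->standby_logical_slots[i] > 0)
            return true;
    }
    return false;
}
```

In this model, dropping the primary's last slot, or the last slot on standby1, leaves the level at logical as long as standby2 still holds a logical slot.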

>
> > 3)
> > I haven’t tested this yet, but I’d like to discuss what the expected
> > behavior should be if a slot exists on the primary but is marked as
> > invalidated. Will an invalidated slot still cause the effective
> > wal_level to remain at logical, or will invalidating the only logical
> > slot trigger a switch back to replica?
> > There is a chance that a slot with un-reserved wal may be invalidated
> > due to time-out.
>
> Good point. I think we don't need to decrease the effective_wal_level
> to 'replica' even if we invalidate all logical slots. We need  neither
> WAL reservation nor dead tuple retention in order to set
> effective_wal_level to 'logical' so I think it's straightforward that
> effective_wal_level value depends on only the presence of logical
> slots. If idle_replication_slot_timeout affects also logical slots
> created with immediately_reserve=false, we might want to exclude them
> to avoid confusion.
>

Yes, we should exclude such slots from timeout-based invalidation,
because once a slot is invalidated, the user may drop it at any
time.

thanks
Shveta



On Wed, Jun 11, 2025 at 2:31 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I think it's the user's responsibility to keep at least one logical
> slot. It seems that setting wal_level to 'logical' would be the most
> reliable solution for this case. We might want to provide a way to
> keep 'logical' WAL level somehow but I don't have a good idea for now.
>

Okay, let me think more on this.

>
> Considering cascading replication cases too, 2) could be tricky as
> cascaded standbys need to propagate the information of logical slot
> creation up to the most upstream server.
>

Yes, I understand the challenges here.

Thanks for the v2 patches, few concerns:


1)
When the slot on the standby is invalidated because effective_wal_level
switched back to replica and we then restart the standby, it fails to
restart even if wal_level is explicitly changed to logical in the conf
file.

FATAL:  logical replication slot "slot_st" exists, but logical
decoding is not enabled
HINT:  Change "wal_level" to be "replica" or higher.


2)
I see that when primary switches back its effective wal_level to
replica while standby has wal_level=logical in conf file, then standby
has this status:

postgres=# show wal_level;
 wal_level
-----------
 logical

postgres=# show effective_wal_level;
 effective_wal_level
---------------------
 replica

Is this correct? Can effective_wal_level be < wal_level at any time? I
feel it can be greater but never lower.

3)
When the standby invalidates obsolete slots because effective_wal_level
on the primary changed to replica, it logs the following:
LOG:  invalidating obsolete replication slot "slot_st2"
DETAIL:  Logical decoding on standby requires "wal_level" >= "logical"
on the primary server

Shall we update this message as well to mention slot presence on the primary?
DETAIL:  Logical decoding on standby requires "wal_level" >= "logical"
or presence of logical slot on the primary server.

4)
I see that the slotsync worker is running all the time now, as against
the previous behaviour where it does not start if wal_level is less
than logical or has been switched to '< logical' at any time. Even with
wal_level and effective_wal_level set to replica, slot-sync keeps on
attempting synchronization. This does not look correct. I think we need
to find a way to stop the slot-sync worker when effective_wal_level is
switched from logical to replica.

5)
Can you please help me understand the changes at [1]?

a) Why is it needed when we have the code logic at [2]?
b) In [1], why do we check n_inuse_logical_slots on the standby and then
make decisions? Why not disable logical decoding directly, just like
[2]?

[1]:
+ if (xlrec.wal_level == WAL_LEVEL_LOGICAL)
+ {
+     /*
+      * If the primary increase WAL level to 'logical', we can
+      * unconditionally enable the logical decoding on the standby.
+      */
+     UpdateLogicalDecodingStatus(true);
+ }
+ else if (xlrec.wal_level == WAL_LEVEL_REPLICA &&
+          pg_atomic_read_u32(&ReplicationSlotCtl->n_inuse_logical_slots) == 0)
+ {
+     /*
+      * Disable the logical decoding if there is no in-use logical slot
+      * on the standby.
+      */
+     UpdateLogicalDecodingStatus(false);
+ }


[2]:
+ else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
+ {
+     bool logical_decoding;
+
+     memcpy(&logical_decoding, XLogRecGetData(record), sizeof(bool));
+     UpdateLogicalDecodingStatus(logical_decoding);
+
+     /*
+      * Invalidate logical slots if we are in hot standby and the primary
+      * disabled the logical decoding.
+      */
+     if (!logical_decoding && InRecovery && InHotStandby)
+         InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL,
+                                            0, InvalidOid,
+                                            InvalidTransactionId);
+
+     LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
+     ControlFile->logicalDecodingEnabled = logical_decoding;
+     UpdateControlFile();
+     LWLockRelease(ControlFileLock);
+ }


thanks
Shveta



On Mon, Jun 16, 2025 at 11:48 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, Jun 11, 2025 at 2:31 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I think it's the user's responsibility to keep at least one logical
> > slot. It seems that setting wal_level to 'logical' would be the most
> > reliable solution for this case. We might want to provide a way to
> > keep 'logical' WAL level somehow but I don't have a good idea for now.
> >
>
> Okay,  Let me think  more on this.
>
> >
> > Considering cascading replication cases too, 2) could be tricky as
> > cascaded standbys need to propagate the information of logical slot
> > creation up to the most upstream server.
> >
>
> Yes, I understand the challenges here.
>
> Thanks for the v2 patches, few concerns:

Thank you for the comments!

> 1)
> Now when the slot on standby is invalidated due to effective_wal_level
> switched back to replica and if we restart standby, it fails to
> restart even if wal_level is explicitly changed to logical in conf
> file.
>
> FATAL:  logical replication slot "slot_st" exists, but logical
> decoding is not enabled
> HINT:  Change "wal_level" to be "replica" or higher.

Good catch, we should fix it.
>
> 2)
> I see that when primary switches back its effective wal_level to
> replica while standby has wal_level=logical in conf file, then standby
> has this status:
>
> postgres=# show wal_level;
>  wal_level
> -----------
>  logical
>
> postgres=# show effective_wal_level;
>  effective_wal_level
> ---------------------
>  replica
>
> Is this correct? Can effective_wal_level be < wal_level anytime? I
> feel it can be greater but never lesser.

Hmm, I think we need to define what value we should show in
effective_wal_level on standbys because the standbys actually are not
writing any WALs and whether or not the logical decoding is enabled on
the standbys depends on the primary.

In the previous version patch, the standby's effective_wal_level value
depended solely on the standby's wal_level value. However, it was
confusing in a sense because it's possible that the logical decoding
could be available even though effective_wal_level is 'replica' if the
primary already enables it. One idea is that given that the logical
decoding availability and effective_wal_level value are independent in
principle, it's better to provide a SQL function to get the logical
decoding status so that users can check the logical decoding
availability without checking effective_wal_level. With that function,
it might make sense to revert the behavior to the previous one.
That is, on the primary the effective_wal_level value is always
greater than or equal to wal_level whereas on the standbys it's always
the same as wal_level, and users would be able to check the logical
decoding availability using the SQL function. Or it might also be
worth considering showing effective_wal_level as NULL on standbys.

>
> 3)
> When standby invalidate obsolete slots due to effective_wal_level on
> primary changed to replica, it dumps below:
> LOG:  invalidating obsolete replication slot "slot_st2"
> DETAIL:  Logical decoding on standby requires "wal_level" >= "logical"
> on the primary server
>
> Shall we update this message as well to convey about slot-presence on primary.
> DETAIL:  Logical decoding on standby requires "wal_level" >= "logical"
> or presence of logical slot on the primary server.

Will fix.

> 4)
> I see that the slotsync worker is running all the time now as against
> the previous behaviour where it will not start if wal_level is less
> than logical or switched to '< logical' anytime. Even with wal_level
> and effective_wal_level set to replica, slot-sync keeps on attempting
> synchronization. This does not look correct. I think we need to find a
> way to stop slot-sync worker when effective_wal_level is switched to
> replica from logical.

Right, will fix.

> 5)
> Can you please help me understand the changes at [1].
>
> a) Why is it needed when we have code logic at [2]

This is because we use the XLOG_LOGICAL_DECODING_STATUS_CHANGE record
only for changing the logical decoding status online (i.e., without
restarting the server). So I think we still need this part of the code
for cases where we enable/disable logical decoding by changing the
wal_level value and restarting the server.

Suppose that both the primary and the standby set wal_level='replica',
so logical decoding is not available on either side. If the primary
restarts with wal_level='logical', it doesn't write an
XLOG_LOGICAL_DECODING_STATUS_CHANGE record.

Another case: suppose that the primary sets wal_level='logical'
and the standby sets wal_level='replica', so logical decoding is
available on both sides. If the primary restarts with
wal_level='replica', we need to somehow tell the standby that
logical decoding has been disabled. (BTW I realized we need to
invalidate the logical slots in this case too).

> b) in [1], why do we check n_inuse_logical_slots on standby and then
> make decisions? Why not to disable logical-decoding directly just like
> [2]

It seems the code is incorrect. We should disable the logical decoding
anyway if the primary disables it. Will fix.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Wed, Jun 18, 2025 at 6:06 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Thank you for the comments!
>
> >
> > 2)
> > I see that when primary switches back its effective wal_level to
> > replica while standby has wal_level=logical in conf file, then standby
> > has this status:
> >
> > postgres=# show wal_level;
> >  wal_level
> > -----------
> >  logical
> >
> > postgres=# show effective_wal_level;
> >  effective_wal_level
> > ---------------------
> >  replica
> >
> > Is this correct? Can effective_wal_level be < wal_level anytime? I
> > feel it can be greater but never lesser.
>
> Hmm, I think we need to define what value we should show in
> effective_wal_level on standbys because the standbys actually are not
> writing any WALs and whether or not the logical decoding is enabled on
> the standbys depends on the primary.
>
> In the previous version patch, the standby's effective_wal_level value
> depended solely on the standby's wal_level value. However, it was
> confusing in a sense because it's possible that the logical decoding
> could be available even though effective_wal_level is 'replica' if the
> primary already enables it. One idea is that given that the logical
> decoding availability and effective_wal_level value are independent in
> principle, it's better to provide a SQL function to get the logical
> decoding status so that users can check the logical decoding
> availability without checking effective_wal_level. With that function,
> it might make sense to revert back the behavior to the previous one.
> That is, on the primary the effective_wal_level value is always
> greater than or equal to wal_level whereas on the standbys it's always
> the same as wal_level, and users would be able to check the logical
> decoding availability using the SQL function. Or it might also be
> worth considering to show effective_wal_level as NULL on standbys.

Yes, that is one idea. It will resolve the confusion.
But I was thinking, instead of having one new GUC + a SQL function,
can we have a GUC alone, which shows logical_decoding status plus the
cause of that. The new GUC will be applicable on both primary and
standby. As an example, let's say we name it as
logical_decoding_status, then it can have these values (
<status>_<cause>):

enabled_wal_level_logical:            valid for both primary and standby
enabled_effective_wal_level_logical:  valid only for primary
enabled_cascaded_logical_decoding:    valid only for standby
disabled:                             valid for both primary and standby

'enabled_cascaded_logical_decoding'  will indicate that logical
decoding is enabled on standby (even when its own wal_level=replica)
as a cascaded effect from primary. It can be possible either due to
primary's wal_level=logical or logical slot being present on primary.
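
As a rough illustration of how such a status could be derived, here is a small C sketch. Everything in it is hypothetical: the enum and struct names are invented for the example, and treating a standby as 'disabled' whenever the primary has logical decoding disabled is an assumption drawn from the earlier remark that availability on standbys depends on the primary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mapping of the proposed logical_decoding_status values. */
typedef enum
{
    STATUS_DISABLED,
    STATUS_ENABLED_WAL_LEVEL_LOGICAL,
    STATUS_ENABLED_EFFECTIVE_WAL_LEVEL_LOGICAL,
    STATUS_ENABLED_CASCADED_LOGICAL_DECODING
} LogicalDecodingStatus;

typedef struct
{
    bool is_standby;
    bool wal_level_logical;         /* this server's own wal_level = logical */
    bool has_logical_slot;          /* primary only: a logical slot exists */
    bool upstream_decoding_enabled; /* standby only: enabled on the primary */
} ServerState;

LogicalDecodingStatus
logical_decoding_status(const ServerState *s)
{
    assert(s != NULL);

    if (s->is_standby)
    {
        /* Assumption: a standby cannot decode if the primary disabled it. */
        if (!s->upstream_decoding_enabled)
            return STATUS_DISABLED;
        return s->wal_level_logical
            ? STATUS_ENABLED_WAL_LEVEL_LOGICAL
            : STATUS_ENABLED_CASCADED_LOGICAL_DECODING;
    }

    if (s->wal_level_logical)
        return STATUS_ENABLED_WAL_LEVEL_LOGICAL;
    if (s->has_logical_slot)
        return STATUS_ENABLED_EFFECTIVE_WAL_LEVEL_LOGICAL;
    return STATUS_DISABLED;
}
```

A standby with wal_level=replica under a primary that has logical decoding enabled would report the cascaded value here, which is the case the single GUC is meant to make visible.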

> >
> > 3)
> > When standby invalidate obsolete slots due to effective_wal_level on
> > primary changed to replica, it dumps below:
> > LOG:  invalidating obsolete replication slot "slot_st2"
> > DETAIL:  Logical decoding on standby requires "wal_level" >= "logical"
> > on the primary server
> >
> > Shall we update this message as well to convey about slot-presence on primary.
> > DETAIL:  Logical decoding on standby requires "wal_level" >= "logical"
> > or presence of logical slot on the primary server.
>
> Will fix.
>
> > 4)
> > I see that the slotsync worker is running all the time now as against
> > the previous behaviour where it will not start if wal_level is less
> > than logical or switched to '< logical' anytime. Even with wal_level
> > and effective_wal_level set to replica, slot-sync keeps on attempting
> > synchronization. This does not look correct. I think we need to find a
> > way to stop slot-sync worker when effective_wal_level is switched to
> > replica from logical.
>
> Right, will fix.
>
> > 5)
> > Can you please help me understand the changes at [1].
> >
> > a) Why is it needed when we have code logic at [2]
>
> This is because we use XLOG_LOGICAL_DECODING_STATUS_CHANGE record only
> for changing the logical decoding status online (i.e., without
> restarting the server). So I think we still need this part of the code in
> cases where we enable/disable the logical decoding by changing the
> wal_level value with restarting the server
>
> Suppose that both the primary and the standby set wal_level='replica',
> the logical decoding is not available on both sides. If the primary
> restarts with wal_level='logical', it doesn't write an
> XLOG_LOGICAL_DECODING_STATUS_CHANGE record.
>
> Another case is that suppose that the primary sets wal_level='logical'
> and the standby sets wal_level='replica', the logical decoding is
> available on both sides. If the primary restarts with
> wal_level='replica' we need to somehow tell the standby the fact that
> the logical decoding gets disabled.

Okay, I understand it now.

> (BTW I realized we need to
> invalidate the logical slots in this case too).
>

Yes, the behaviour should be the same. The difference in behaviour
between the 2 cases I pointed out is what confused me in the first place.

> > b) in [1], why do we check n_inuse_logical_slots on standby and then
> > make decisions? Why not to disable logical-decoding directly just like
> > [2]
>
> It seems the code is incorrect. We should disable the logical decoding
> anyway if the primary disables it. Will fix.
>

I agree. So now case [1] behaviour will be exactly the same as case
[2] i.e. invalidate the slot and don't check slots-usage on standby
before invalidating.
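
The agreed behaviour can be modelled in a few lines of C. This is a simplified sketch, not the patch code: StandbyModel and replay_primary_decoding_status are hypothetical names, and the slot counters stand in for the real slot invalidation machinery.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a standby's state. */
typedef struct
{
    bool in_hot_standby;
    bool logical_decoding_enabled;
    int  valid_logical_slots;
    int  invalidated_logical_slots;
} StandbyModel;

/*
 * Apply the primary's logical decoding status on the standby. Both the
 * online status-change record and a wal_level change seen after a primary
 * restart are assumed to funnel into this one function, so the two cases
 * behave identically.
 */
void
replay_primary_decoding_status(StandbyModel *st, bool primary_enabled)
{
    assert(st->valid_logical_slots >= 0);

    st->logical_decoding_enabled = primary_enabled;

    /* No check of slot usage: disabling always invalidates in hot standby. */
    if (!primary_enabled && st->in_hot_standby)
    {
        st->invalidated_logical_slots += st->valid_logical_slots;
        st->valid_logical_slots = 0;
    }
}
```

Note that re-enabling logical decoding in this model does not revive slots that were already invalidated, which matches the "wal_level_insufficient" behaviour reported earlier in the thread.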

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Bertrand Drouvot
Date:
Hi,

On Tue, Jun 10, 2025 at 02:00:55PM -0700, Masahiko Sawada wrote:
> > > > > > 0001 patch allows us to create a logical slot without WAL reservation.

Thanks for the patch and sorry to be late in this conversation.

The thing that worries me a bit with this is that it could be easy to attempt
to use the slot "by mistake" and then (as a consequence) trigger WAL reservation
by mistake on the primary. I think that this mistake is more likely to happen
with a logical slot than with a physical slot.

IIUC the idea is to "just" increase WAL level to 'logical' so that one could then
be allowed to make use of logical decoding from the standby. The primary goal
of logical decoding from standby is to move some load from the primary to
the standby, i.e. we don't expect/want the logical slot to be used on the primary.

So what about making sure that if a logical slot is created with immediately_reserve
set to false then no one can use it? (That would ensure that WAL reservation
will not happen).

That said, we might also want to use a different parameter name (other than
immediately_reserve) to better reflect this behavior (if we move that way).

Thoughts?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



On Wed, Jun 18, 2025 at 2:39 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Tue, Jun 10, 2025 at 02:00:55PM -0700, Masahiko Sawada wrote:
> > > > > > > 0001 patch allows us to create a logical slot without WAL reservation.
>
> Thanks for the patch and sorry to be late in this conversation.
>
> The thing that worry me a bit with this is that that could be easy to attempt
> to use the slot "by mistake" and then (as a consequence) trigger WAL reservation
> by mistake on the primary. I think that this mistake is more likely to happen
> with a logical slot as compared to a physical slot.
>

Yes, agreed. Another concern is the possibility of someone
intentionally using it and associating it with a subscription. If the
subscription is later dropped, it could also cause the slot to be
removed. In cases where this is the only slot on the primary, it may
render the standby slots unusable as well. See the problem described
in [1].

> IIUC the idea is to "just" increase WAL level to 'logical' so that one could then
> be allowed to make use of logical decoding from the standby. The primary goal
> of logical decoding from standby is to move some load from the primary to
> the standby i.e we don't expect/want the logical slot to be used on the primary.
>
> So what about making sure that if a logical slot is created with immediately_reserve
> set to false then no one can use it? (That would ensure that WAL reservation
> will not happen).
>

+1.
One approach is to reserve a specific slot name for this purpose. If
such a slot is created, we can internally set immediately_reserve to
false and also prevent it from being used. The only permitted user
action on this slot would be to drop it using
pg_drop_replication_slot() instead of being dropped as part of
subscription-drop.
Another concern is ensuring that users actually create this slot. If
there is already an active subscription subscribed to the primary, the
effective_wal_level will already be set to logical, allowing logical
decoding on the standby to proceed without issue. In such a case, the
user might not bother to create additional slots (same as the problem
described in [1]) and later may unintentionally end up making standby
slots unusable. Any ideas on how to ensure it?

> That said, we might also want to create another parameter name (than
> immediately_reserve) to better reflect this behavior (if we move that way).
>
> Thoughts?

Or we could avoid exposing control of immediately_reserve to the user
altogether? Instead, we reserve a specific slot name and ensure that
it never reserves WAL in the future by preventing it from being
consumed under any circumstances (as you suggested).

[1]: https://www.postgresql.org/message-id/CAJpy0uDW6BpNXLZ0AaP%3D_GU6pCsZf_7Sk2R0Ti%2Bov%2BEO6ruMkg%40mail.gmail.com

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Bertrand Drouvot
Date:
Hi,

On Wed, Jun 18, 2025 at 03:22:59PM +0530, shveta malik wrote:
> On Wed, Jun 18, 2025 at 2:39 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > IIUC the idea is to "just" increase WAL level to 'logical' so that one could then
> > be allowed to make use of logical decoding from the standby. The primary goal
> > of logical decoding from standby is to move some load from the primary to
> > the standby i.e we don't expect/want the logical slot to be used on the primary.
> >
> > So what about making sure that if a logical slot is created with immediately_reserve
> > set to false then no one can use it? (That would ensure that WAL reservation
> > will not happen).
> >
> 
> Another concern is ensuring that users actually create this slot. If
> there is already an active subscription subscribed to the primary, the
> effective_wal_level will be set to logical already, allowing logical
> decoding on the standby to proceed without issue. In such a case, the
> user might not bother to create additional slots (same as problem
> described in [1]) and later may unintentionally end up making standby
> slots unusable. Any ideas on how to ensure it?

> > That said, we might also want to create another parameter name (than
> > immediately_reserve) to better reflect this behavior (if we move that way).
> >
> > Thoughts?
> 
> Or we could avoid exposing control of immediately_reserve to the user
> altogether? Instead, we reserve a specific slot name and ensure that
> it never reserves WAL in the future by preventing it from being
> consumed under any circumstances (as you suggested).

I wonder if a way to address the concerns that we shared above is to use a
mixed approach like:

- Forget the immediately_reserve idea
- If a user creates a logical slot then we automatically switch to wal_level =
logical (if not already done): I think that's a nice user experience
- *and* provide a new API pg_activate_logical_decoding(), if the user has no
need to create a logical slot on the primary (wants to use the standby to offload
all the logical decoding)

So if the user also uses a logical slot on the primary (for real..) then there
is no need to launch pg_activate_logical_decoding(), until....:

The user decides to drop the logical slot on the primary, and then:

- If the slot is not the last logical slot, that's fine, drop it
- If the slot is the last logical one AND the user did not set a new flag
"wal_level_action" to, say, "preserve" or "force downgrade" (in the drop command)
then the drop fails with an informative error message.

That way:

- pg_activate_logical_decoding() is needed only if there is not already a logical
slot on the primary
- the drop requires the user to think twice if this is the last logical slot
- we don't ask the user to create a logical slot if he does not want to use it
on the primary
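
The rules above can be sketched as a toy Python model (not PostgreSQL code).
Only pg_activate_logical_decoding() and the "wal_level_action" values come from
the proposal; the class, its methods, the exception, and the exact meaning of
"preserve" versus "force downgrade" are assumptions made for illustration.

```python
class LastLogicalSlotError(Exception):
    """Dropping the last logical slot needs an explicit wal_level_action."""


class MixedApproachPrimary:
    """Toy model of the mixed approach; hypothetical, not PostgreSQL code."""

    # Values named in the proposal; their exact semantics are assumed here.
    ACTIONS = ("preserve", "force downgrade")

    def __init__(self):
        self.effective_wal_level = "replica"
        self.logical_slots = set()

    def create_logical_slot(self, name):
        # Creating a logical slot switches to 'logical' if not already done.
        self.logical_slots.add(name)
        self.effective_wal_level = "logical"

    def activate_logical_decoding(self):
        # Stand-in for pg_activate_logical_decoding(): no slot is required.
        self.effective_wal_level = "logical"

    def drop_logical_slot(self, name, wal_level_action=None):
        if name not in self.logical_slots:
            raise KeyError(name)
        is_last = len(self.logical_slots) == 1
        if is_last and wal_level_action not in self.ACTIONS:
            # The drop fails so the user has to think twice.
            raise LastLogicalSlotError(
                f"{name!r} is the last logical slot; set wal_level_action"
            )
        self.logical_slots.remove(name)
        if is_last and wal_level_action == "force downgrade":
            self.effective_wal_level = "replica"
```

In this model, dropping a non-last slot never needs the flag, while dropping the
last one without it raises an error and leaves the slot in place.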

Thoughts?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



On Thu, Jun 19, 2025 at 2:30 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> I wonder if a way to address the concerns that we shared above is to use a
> mixed approach like:
>
> - Forget the immediately_reserve idea
> - If a user creates a logical slot then we automatically switch to wal_level =
> logical (if not already done): I think that's a nice user experience
> - *and* provide a new API pg_activate_logical_decoding(), if the user has no
> need to create a logical slot on the primary (wants to use the standby to offload
> all the logical decoding)
>
> So if the user also uses a logical slot on the primary (for real..) then there
> is no need to launch pg_activate_logical_decoding(), until....:
>
> The user decides to drop the logical slot on the primary, and then:
>
> - If the slot is not the last logical slot, that's fine, drop it
> - If the slot is the last logical one AND the user did not set a new flag
> "wal_level_action" to, say, "preserve" or "force downgrade" (in the drop command)
> then the drop fails with an informative error message.

Overall, the plan sounds reasonable. But we need to think about the case
where the slot is dropped on the primary as part of DROP SUBSCRIPTION on the
subscriber: how will the user convey the wal_level preserve option then?
Passing it as part of the subscription command to preserve wal_level on the
primary might not be a good idea.

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Bertrand Drouvot
Date:
Hi,

On Fri, Jun 20, 2025 at 09:48:47AM +0530, shveta malik wrote:
> On Thu, Jun 19, 2025 at 2:30 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > I wonder if a way to address the concerns that we shared above is to use a
> > mixed approach like:
> >
> > - Forget the immediately_reserve idea
> > - If a user creates a logical slot then we automatically switch to wal_level =
> > logical (if not already done): I think that's a nice user experience
> > - *and* provide a new API pg_activate_logical_decoding(), if the user has no
> > need to create a logical slot on the primary (wants to use the standby to offload
> > all the logical decoding)
> >
> > So if the user also uses a logical slot on the primary (for real..) then there
> > is no need to launch pg_activate_logical_decoding(), until....:
> >
> > The user decides to drop the logical slot on the primary, and then:
> >
> > - If the slot is not the last logical slot, that's fine, drop it
> > > - If the slot is the last logical one AND the user did not set a new flag
> > > "wal_level_action" to, say, "preserve" or "force downgrade" (in the drop command)
> > then the drop fails with an informative error message.
> 
> Overall the plan sounds reasonable one.

Thanks for sharing your thoughts!

> But we need to think if the
> slot is dropped on primary as part of Drop Subscription on subscriber,
> then how will the user convey the wal-level preserve option?

If the drop subscription attempts to drop the last logical replication slot
on the primary then it will fail. The "DROP SUBSCRIPTION" doc states:

"
To proceed in this situation, first disable the subscription by executing
ALTER SUBSCRIPTION ... DISABLE, and then disassociate it from the replication slot
by executing ALTER SUBSCRIPTION ... SET (slot_name = NONE).

After that, DROP SUBSCRIPTION will no longer attempt any actions on a remote host.
Note that if the remote replication slot still exists, it (and any related table
synchronization slots) should then be dropped manually; otherwise it/they will
continue to reserve WAL and might eventually cause the disk to fill up.
"

So one option is to drop the logical replication slot manually (providing a valid
wal_level_action value).

That's not the most elegant solution but if the error message is clear enough
(that this is the last logical replication slot and that it has to be
removed manually) that's "doable" to reach a clean state on the publisher
and subscriber sides.

> Giving it
> as part of subscription-cmd to preserve wal-level on primary

Yeah, another option is to make "wal_level_action" part of the "DROP SUBSCRIPTION"
command. In that case a common scenario would be:

- first drop fails because the wal_level_action value has not been specified
- then try to drop again but this time specifying a wal_level_action value

> might not be a good idea.

Agree that it sounds kind of weird, and I'm not sure that I like the idea of
giving the subscriber side control over the "wal level" on the primary.

Without it a typical scenario would be:

- drop fails
- ALTER SUBSCRIPTION <> disable
- drop the slot on the primary
- ALTER SUBSCRIPTION <> SET (slot_name = NONE)
- drop succeeds

and that might not be user friendly, but it keeps the wal level control on the
publisher side (and I think that's better).
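
The scenario without a subscriber-side "wal_level_action" can be walked through
with a toy Python model (not PostgreSQL code). The Publisher and Subscription
classes and their methods are hypothetical stand-ins for the real commands, and
the rule that the last slot cannot be dropped without "wal_level_action" is the
proposed behavior under discussion, not existing behavior.

```python
class Publisher:
    """Toy publisher that guards its last logical slot (proposed behavior)."""

    def __init__(self, slots):
        self.slots = set(slots)
        self.effective_wal_level = "logical" if self.slots else "replica"

    def drop_slot(self, name, wal_level_action=None):
        if name not in self.slots:
            raise KeyError(name)
        if len(self.slots) == 1 and wal_level_action is None:
            raise RuntimeError("last logical slot: wal_level_action required")
        self.slots.remove(name)
        if not self.slots and wal_level_action == "force downgrade":
            self.effective_wal_level = "replica"


class Subscription:
    """Toy subscription; drop() mimics DROP SUBSCRIPTION."""

    def __init__(self, publisher, slot_name):
        self.publisher = publisher
        self.slot_name = slot_name
        self.enabled = True
        self.dropped = False

    def disable(self):
        # Stand-in for ALTER SUBSCRIPTION ... DISABLE.
        self.enabled = False

    def set_slot_name_none(self):
        # Stand-in for ALTER SUBSCRIPTION ... SET (slot_name = NONE).
        if self.enabled:
            raise RuntimeError("disable the subscription first")
        self.slot_name = None

    def drop(self):
        # The subscriber cannot pass wal_level_action in this variant.
        if self.slot_name is not None:
            self.publisher.drop_slot(self.slot_name)
        self.dropped = True


pub = Publisher(["sub_slot"])
sub = Subscription(pub, "sub_slot")
try:
    sub.drop()                       # drop fails: last logical slot
except RuntimeError:
    pass
sub.disable()                        # ALTER SUBSCRIPTION <> disable
pub.drop_slot("sub_slot", wal_level_action="preserve")  # on the primary
sub.set_slot_name_none()             # ALTER SUBSCRIPTION <> SET (slot_name = NONE)
sub.drop()                           # drop succeeds
```

The wal level decision is made only in the publisher-side drop_slot() call,
which is the point of this variant.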

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



On Thu, Jun 19, 2025 at 6:00 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Wed, Jun 18, 2025 at 03:22:59PM +0530, shveta malik wrote:
> > On Wed, Jun 18, 2025 at 2:39 PM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > IIUC the idea is to "just" increase WAL level to 'logical' so that one could then
> > > be allowed to make use of logical decoding from the standby. The primary goal
> > > > of logical decoding from standby is to move some load from the primary to
> > > the standby i.e we don't expect/want the logical slot to be used on the primary.
> > >
> > > So what about making sure that if a logical slot is created with immediately_reserve
> > > set to false then no one can use it? (That would ensure that WAL reservation
> > > will not happen).
> > >
> >
> > Another concern is ensuring that users actually create this slot. If
> > there is already an active subscription subscribed to the primary, the
> > effective_wal_level will be set to logical already, allowing logical
> > decoding on the standby to proceed without issue. In such a case, the
> > user might not bother to create additional slots (same as problem
> > > described in [1]) and later may unintentionally end up making standby
> > slots unusable. Any ideas on how to ensure it?
>
> > > That said, we might also want to create another parameter name (than
> > > immediately_reserve) to better reflect this behavior (if we move that way).
> > >
> > > Thoughts?
> >
> > Or we could avoid exposing control of immediately_reserve to the user
> > altogether? Instead, we reserve a specific slot name and ensure that
> > it never reserves WAL in the future by preventing it from being
> > consumed under any circumstances (as you suggested).
>
> I wonder if a way to address the concerns that we shared above is to use a
> mixed approach like:
>
> - Forget the immediately_reserve idea
> - If a user creates a logical slot then we automatically switch to wal_level =
> logical (if not already done): I think that's a nice user experience
> - *and* provide a new API pg_activate_logical_decoding(), if the user has no
> need to create a logical slot on the primary (wants to use the standby to offload
> all the logical decoding)
>
> So if the user also uses a logical slot on the primary (for real..) then there
> is no need to launch pg_activate_logical_decoding(), until....:
>
> The user decides to drop the logical slot on the primary, and then:
>
> - If the slot is not the last logical slot, that's fine, drop it
> - If the slot is the last logical one AND the user did not set a new flag
> "wal_level_action" to, say, "preserve" or "force downgrade" (in the drop command)
> then the drop fails with an informative error message.
>
> That way:
>
> - pg_activate_logical_decoding() is needed only if there is not already a logical
> slot on the primary
> - the drop requires the user to think twice if this is the last logical slot
> - we don't ask the user to create a logical slot if he does not want to use it
> on the primary
>
> Thoughts?

If there is no logical slot on the primary, how can the user disable
logical decoding that has been enabled via
pg_activate_logical_decoding()?

Given the discussion so far, it seems we might want to have a
safeguard to prevent the effective_wal_level from being dropped to
'replica' if the last logical slot is accidentally dropped. Another
idea we can consider is that we automatically increase
effective_wal_level to 'logical' upon the logical slot creation but
don't automatically decrease it when dropping the last slot. To
decrease the effective_wal_level to 'replica', users would need to do
that explicitly for example using a SQL function,
pg_disable_logical_decoding(). We might want to have a GUC parameter
for users to turn on/off this automatic behavior.
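
A toy Python model of this alternative (not PostgreSQL code): the level rises
automatically on slot creation and only falls on an explicit call. The class and
method names are hypothetical, and refusing to disable while logical slots still
exist is an assumption that the proposal does not spell out.

```python
class StickyLogicalPrimary:
    """Toy model: automatic increase, explicit decrease only."""

    def __init__(self):
        self.effective_wal_level = "replica"
        self.logical_slots = set()

    def create_logical_slot(self, name):
        # Automatic increase upon logical slot creation.
        self.logical_slots.add(name)
        self.effective_wal_level = "logical"

    def drop_logical_slot(self, name):
        # Deliberately no decrease here, even for the last slot.
        self.logical_slots.remove(name)

    def disable_logical_decoding(self):
        # Stand-in for pg_disable_logical_decoding().
        # Assumption: refuse while logical slots still exist.
        if self.logical_slots:
            raise RuntimeError("logical slots still exist")
        self.effective_wal_level = "replica"
```

With this shape, accidentally dropping the last slot cannot break logical
decoding on a standby; only disable_logical_decoding() can lower the level.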

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Fri, Jun 20, 2025 at 12:15 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Fri, Jun 20, 2025 at 09:48:47AM +0530, shveta malik wrote:
> > On Thu, Jun 19, 2025 at 2:30 PM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > I wonder if a way to address the concerns that we shared above is to use a
> > > mixed approach like:
> > >
> > > - Forget the immediately_reserve idea
> > > - If a user creates a logical slot then we automatically switch to wal_level =
> > > logical (if not already done): I think that's a nice user experience
> > > - *and* provide a new API pg_activate_logical_decoding(), if the user has no
> > > need to create a logical slot on the primary (wants to use the standby to offload
> > > all the logical decoding)
> > >
> > > So if the user also uses a logical slot on the primary (for real..) then there
> > > is no need to launch pg_activate_logical_decoding(), until....:
> > >
> > > The user decides to drop the logical slot on the primary, and then:
> > >
> > > - If the slot is not the last logical slot, that's fine, drop it
> > > > - If the slot is the last logical one AND the user did not set a new flag
> > > > "wal_level_action" to, say, "preserve" or "force downgrade" (in the drop command)
> > > then the drop fails with an informative error message.
> >
> > Overall the plan sounds reasonable one.
>
> Thanks for sharing your thoughts!
>
> > But we need to think if the
> > slot is dropped on primary as part of Drop Subscription on subscriber,
> > then how will the user convey the wal-level preserve option?
>
> If the drop subscription attempts to drop the last logical replication slot
> on the primary then it will fail. The "DROP SUBSCRIPTION" doc states:
>
> "
> To proceed in this situation, first disable the subscription by executing
> ALTER SUBSCRIPTION ... DISABLE, and then disassociate it from the replication slot
> by executing ALTER SUBSCRIPTION ... SET (slot_name = NONE).
>
> After that, DROP SUBSCRIPTION will no longer attempt any actions on a remote host.
> Note that if the remote replication slot still exists, it (and any related table
> synchronization slots) should then be dropped manually; otherwise it/they will
> continue to reserve WAL and might eventually cause the disk to fill up.
> "
>
> So one option is to drop the logical replication slot manually (providing a valid
> wal_level_action value).
>
> That's not the most elegant solution but if the error message is clear enough
> (that this is the last logical replication slot and that it has to be
> removed manually) that's "doable" to reach a clean state on the publisher
> and subscriber sides.
>
> > Giving it
> > as part of subscription-cmd to preserve wal-level on primary
>
> Yeah, another option is to make "wal_level_action" part of the "DROP SUBSCRIPTION"
> command. In that case a common scenario would be:
>
> - first drop fails because the wal_level_action value has not been specified
> - then try to drop again but this time specifying a wal_level_action value
>
> > might not be a good idea.
>
> Agree that it sounds kind of weird and I'm not sure that I like the idea of
> giving the "wal level" on the primary control on the subscriber side.
>
> Without it a typical scenario would be:
>
> - drop fails
> - ALTER SUBSCRIPTION <> disable
> - drop the slot on the primary
> - ALTER SUBSCRIPTION <> SET (slot_name = NONE)
> - drop succeeds
>
> and that might not be user friendly but it gives the wal level control on the
> publisher side (and I think that's better).
>

I still feel that requiring changes to subscription commands/steps just to
switch wal_level automatically on the primary might not be a good idea. This
idea may find less acceptance.

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Bertrand Drouvot
Date:
Hi,

On Mon, Jun 23, 2025 at 05:10:37PM +0900, Masahiko Sawada wrote:
> On Thu, Jun 19, 2025 at 6:00 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > - pg_activate_logical_decoding() is needed only if there is not already a logical
> > slot on the primary
> > - the drop requires the user to think twice if this is the last logical slot
> > - we don't ask the user to create a logical slot if he does not want to use it
> > on the primary
> >
> > Thoughts?
> 
> If there is no logical slot on the primary, how can the user disable
> logical decoding that has been enabled via
> pg_activate_logical_decoding()?

I was thinking of keeping the pg_deactivate_logical_decoding() API proposed
in this thread.

> Given the discussion so far, it seems we might want to have a
> safeguard to prevent the effective_wal_level from being dropped to
> 'replica' if the last logical slot is accidentally dropped. Another
> idea we can consider is that we automatically increase
> effective_wal_level to 'logical' upon the logical slot creation but
> don't automatically decrease it when dropping the last slot. To
> decrease the effective_wal_level to 'replica', users would need to do
> that explicitly for example using a SQL function,
> pg_disable_logical_decoding().

Yeah that could be an idea (and then we don't add the new wal_level_action
to the drop slot command).

> We might want to have a GUC parameter
> for users to turn on/off this automatic behavior.

You mean a GUC to both automatically set effective_wal_level to logical at slot creation
and also decrease effective_wal_level to replica when the last replication slot is dropped?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



On Mon, Jun 23, 2025 at 1:41 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Given the discussion so far, it seems we might want to have a
> safeguard to prevent the effective_wal_level from being dropped to
> 'replica' if the last logical slot is accidentally dropped.

Yes, this is needed for cases where a standby or cascaded standbys
require logical decoding.

> Another
> idea we can consider is that we automatically increase
> effective_wal_level to 'logical' upon the logical slot creation but
> don't automatically decrease it when dropping the last slot. To
> decrease the effective_wal_level to 'replica', users would need to do
> that explicitly for example using a SQL function,
> pg_disable_logical_decoding().

Okay. Seems a good solution so far.

> We might want to have a GUC parameter
> for users to turn on/off this automatic behavior.
>

Yes. Agreed.

thanks
Shveta



On Mon, Jun 23, 2025 at 7:01 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Mon, Jun 23, 2025 at 05:10:37PM +0900, Masahiko Sawada wrote:
> > On Thu, Jun 19, 2025 at 6:00 PM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > - pg_activate_logical_decoding() is needed only if there is not already a logical
> > > slot on the primary
> > > - the drop requires the user to think twice if this is the last logical slot
> > > - we don't ask the user to create a logical slot if he does not want to use it
> > > on the primary
> > >
> > > Thoughts?
> >
> > If there is no logical slot on the primary, how can the user disable
> > logical decoding that has been enabled via
> > pg_activate_logical_decoding()?
>
> I was thinking to keep the pg_deactivate_logical_decoding() API proposed
> in this thread.

Okay. One approach that combines your idea and Shveta's idea is:

- a special (empty) logical slot with the reserved slot name can be
created and deleted only by SQL functions,
pg_activate_logical_decoding() and pg_deactivate_logical_decoding().
- this special slot cannot be used by logical decoding.
- effective_wal_level is increased and decreased when creating and
dropping a slot (i.e., either a normal logical slot or the special
logical slot).

That way, users who enabled logical decoding via
pg_activate_logical_decoding() can keep effective_wal_level at
'logical', since the special slot remains until they execute
pg_deactivate_logical_decoding(). And users who don't use these new
APIs can enable and disable logical decoding through normal logical
slot creation and deletion.
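
A toy Python model of the combined approach (not PostgreSQL code).
effective_wal_level is derived purely from whether any logical slot exists,
counting the special one. The reserved slot name below is made up, as are the
class and method names; only the two SQL function names come from this thread.

```python
RESERVED_SLOT = "pg_logical_decoding_placeholder"  # hypothetical name


class ReservedSlotPrimary:
    """Toy model: a special empty slot pins effective_wal_level."""

    def __init__(self):
        self.logical_slots = set()

    @property
    def effective_wal_level(self):
        # 'logical' while any logical slot (normal or special) exists.
        return "logical" if self.logical_slots else "replica"

    def create_logical_slot(self, name):
        if name == RESERVED_SLOT:
            raise ValueError("reserved slot name")
        self.logical_slots.add(name)

    def drop_logical_slot(self, name):
        if name == RESERVED_SLOT:
            raise ValueError("use deactivate_logical_decoding() instead")
        self.logical_slots.remove(name)

    def activate_logical_decoding(self):
        # Stand-in for pg_activate_logical_decoding().
        self.logical_slots.add(RESERVED_SLOT)

    def deactivate_logical_decoding(self):
        # Stand-in for pg_deactivate_logical_decoding().
        self.logical_slots.discard(RESERVED_SLOT)

    def start_decoding(self, name):
        # The special slot can never be consumed, so it reserves no WAL.
        if name == RESERVED_SLOT:
            raise ValueError("special slot cannot be used for decoding")
        if name not in self.logical_slots:
            raise KeyError(name)
        return f"decoding from {name}"
```

Because the level is computed from the slot set, no separate flag has to be kept
in sync: normal slots and the special slot go through the same rule.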

>
> > Given the discussion so far, it seems we might want to have a
> > safeguard to prevent the effective_wal_level from being dropped to
> > 'replica' if the last logical slot is accidentally dropped. Another
> > idea we can consider is that we automatically increase
> > effective_wal_level to 'logical' upon the logical slot creation but
> > don't automatically decrease it when dropping the last slot. To
> > decrease the effective_wal_level to 'replica', users would need to do
> > that explicitly for example using a SQL function,
> > pg_disable_logical_decoding().
>
> Yeah that could be an idea (and then we don't add the new wal_level_action
> to the drop slot command).
>
> > We might want to have a GUC parameter
> > for users to turn on/off this automatic behavior.
>
> You mean a GUC to both automatically set effective_wal_level to logical at slot creation
> and also decrease effective_wal_level to replica if last replication slot is dropped?

What I imagined was controlling only the decreasing behavior, which could
be more problematic than the increase case. But it might be rather
confusing (e.g., what if we turn off that behavior and restart the
server?).

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Bertrand Drouvot
Date:
Hi,

On Tue, Jun 24, 2025 at 12:13:32AM +0900, Masahiko Sawada wrote:
> On Mon, Jun 23, 2025 at 7:01 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > Hi,
> >
> > On Mon, Jun 23, 2025 at 05:10:37PM +0900, Masahiko Sawada wrote:
> > > On Thu, Jun 19, 2025 at 6:00 PM Bertrand Drouvot
> > > <bertranddrouvot.pg@gmail.com> wrote:
> > > >
> > > > - pg_activate_logical_decoding() is needed only if there is not already a logical
> > > > slot on the primary
> > > > - the drop requires the user to think twice if this is the last logical slot
> > > > - we don't ask the user to create a logical slot if he does not want to use it
> > > > on the primary
> > > >
> > > > Thoughts?
> > >
> > > If there is no logical slot on the primary, how can the user disable
> > > logical decoding that has been enabled via
> > > pg_activate_logical_decoding()?
> >
> > I was thinking to keep the pg_deactivate_logical_decoding() API proposed
> > in this thread.
> 
> Okay. One approach that combines your idea and Shveta's idea is:
> 
> - a special (empty) logical slot with the reserved slot name can be
> created and deleted only by SQL functions,
> pg_activate_logical_decoding() and pg_deactivate_logical_decoding().
> - this special slot cannot be used by logical decoding.
> - effective_wal_level is increased and decreased when creating and
> dropping a slot (i.e., either a normal logical slot or the special
> logical slot).

Yeah, I think that sounds reasonable, and it would prevent users from
mistakenly using a slot created with immediately_reserve set to false.

> > You mean a GUC to both automatically set effective_wal_level to logical at slot creation
> > and also decrease effective_wal_level to replica if last replication slot is dropped?
> 
> What I imagined was to control only the decreasing behavior that could
> be more problematic than the increase case. But it might be rather
> confusing (e.g., what if we turn off that behavior and restart the
> server?).

Right... So I'm not sure we need such a GUC. What about always using the
automatic behavior?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



On Tue, Jun 24, 2025 at 2:12 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Tue, Jun 24, 2025 at 12:13:32AM +0900, Masahiko Sawada wrote:
> > On Mon, Jun 23, 2025 at 7:01 PM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > On Mon, Jun 23, 2025 at 05:10:37PM +0900, Masahiko Sawada wrote:
> > > > On Thu, Jun 19, 2025 at 6:00 PM Bertrand Drouvot
> > > > <bertranddrouvot.pg@gmail.com> wrote:
> > > > >
> > > > > - pg_activate_logical_decoding() is needed only if there is not already a logical
> > > > > slot on the primary
> > > > > - the drop requires the user to think twice if this is the last logical slot
> > > > > - we don't ask the user to create a logical slot if he does not want to use it
> > > > > on the primary
> > > > >
> > > > > Thoughts?
> > > >
> > > > If there is no logical slot on the primary, how can the user disable
> > > > logical decoding that has been enabled via
> > > > pg_activate_logical_decoding()?
> > >
> > > I was thinking to keep the pg_deactivate_logical_decoding() API proposed
> > > in this thread.
> >
> > Okay. One approach that combines your idea and Shveta's idea is:
> >
> > - a special (empty) logical slot with the reserved slot name can be
> > created and deleted only by SQL functions,
> > pg_activate_logical_decoding() and pg_deactivate_logical_decoding().
> > - this special slot cannot be used by logical decoding.
> > - effective_wal_level is increased and decreased when creating and
> > dropping a slot (i.e., either a normal logical slot or the special
> > logical slot).
>
> Yeah, I think that sounds reasonable and that would avoid users to use
> the slot created with immediately_reserve set to false by mistake.
>

+1.
I think we do need to provide 'immediately_reserve' as a new argument
now for logical slots creation. If the slot is a special one with a
reserved name, it can internally be created with WALs not reserved for
our purpose.

> > > You mean a GUC to both automatically set effective_wal_level to logical at slot creation
> > > and also decrease effective_wal_level to replica if last replication slot is dropped?
> >
> > What I imagined was to control only the decreasing behavior that could
> > be more problematic than the increase case. But it might be rather
> > confusing (e.g., what if we turn off that behavior and restart the
> > server?).
>
> Right...So not sure we need such a GUC. What about always behave with the
> automatic behavior?
>

Does it make sense to provide a GUC whose default is automatic, so that
a user who is not interested in the new behaviour, or has issues with
it, can switch the GUC off, making the new functions no-ops as well?
In the absence of such a GUC, users will have absolutely no way to switch
back to the old behaviour. Will that be okay?

thanks
Shveta



On Wed, Jun 25, 2025 at 9:12 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Jun 24, 2025 at 2:12 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > Hi,
> >
> > On Tue, Jun 24, 2025 at 12:13:32AM +0900, Masahiko Sawada wrote:
> > > On Mon, Jun 23, 2025 at 7:01 PM Bertrand Drouvot
> > > <bertranddrouvot.pg@gmail.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Mon, Jun 23, 2025 at 05:10:37PM +0900, Masahiko Sawada wrote:
> > > > > On Thu, Jun 19, 2025 at 6:00 PM Bertrand Drouvot
> > > > > <bertranddrouvot.pg@gmail.com> wrote:
> > > > > >
> > > > > > - pg_activate_logical_decoding() is needed only if there is not already a logical
> > > > > > slot on the primary
> > > > > > - the drop requires the user to think twice if this is the last logical slot
> > > > > > - we don't ask the user to create a logical slot if he does not want to use it
> > > > > > on the primary
> > > > > >
> > > > > > Thoughts?
> > > > >
> > > > > If there is no logical slot on the primary, how can the user disable
> > > > > logical decoding that has been enabled via
> > > > > pg_activate_logical_decoding()?
> > > >
> > > > I was thinking to keep the pg_deactivate_logical_decoding() API proposed
> > > > in this thread.
> > >
> > > Okay. One approach that combines your idea and Shveta's idea is:
> > >
> > > - a special (empty) logical slot with the reserved slot name can be
> > > created and deleted only by SQL functions,
> > > pg_activate_logical_decoding() and pg_deactivate_logical_decoding().
> > > - this special slot cannot be used by logical decoding.
> > > - effective_wal_level is increased and decreased when creating and
> > > dropping a slot (i.e., either a normal logical slot or the special
> > > logical slot).
> >
> > Yeah, I think that sounds reasonable and that would avoid users to use
> > the slot created with immediately_reserve set to false by mistake.
> >
>
> +1.
> I think we do need to provide 'immediately_reserve' as a new argument
> now for logical slots creation. If the slot is a special one with a
> reserved name, it can internally be created with WALs not reserved for
> our purpose.
>

One correction here.
I think we do NOT need to provide 'immediately_reserve' as a new
argument now for logical slot creation. ...

> > > > You mean a GUC to both automatically set effective_wal_level to logical at slot creation
> > > > and also decrease effective_wal_level to replica if last replication slot is dropped?
> > >
> > > What I imagined was to control only the decreasing behavior that could
> > > be more problematic than the increase case. But it might be rather
> > > confusing (e.g., what if we turn off that behavior and restart the
> > > server?).
> >
> > Right...So not sure we need such a GUC. What about always behave with the
> > automatic behavior?
> >
>
> Does it make sense to provide a GUC which will have the default set to
> automatic but if the user is not interested or having some issues with
> new behaviour, he can switch off the GUC, making the new functions
> no-op as well?
> In absence of such a GUC, users will have absolutely no way to switch
> back to old behaviour. Will that be okay?
>
> thanks
> Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Bertrand Drouvot
Date:
Hi,

On Wed, Jun 25, 2025 at 09:15:04AM +0530, shveta malik wrote:
> On Wed, Jun 25, 2025 at 9:12 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, Jun 24, 2025 at 2:12 PM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > Yeah, I think that sounds reasonable and that would avoid users to use
> > > the slot created with immediately_reserve set to false by mistake.
> > >
> >
> > +1.
> > I think we do need to provide 'immediately_reserve' as a new argument
> > now for logical slots creation. If the slot is a special one with a
> > reserved name, it can internally be created with WALs not reserved for
> > our purpose.
> >
> 
> One correction here.
> I think we do NOT need to provide 'immediately_reserve' as a new
> argument  now for logical slots creation. ...

Agree.

> > > Right...So not sure we need such a GUC. What about always behave with the
> > > automatic behavior?
> > >
> >
> > Does it make sense to provide a GUC which will have the default set to
> > automatic but if the user is not interested or having some issues with
> > new behaviour, he can switch off the GUC, making the new functions
> > no-op as well?
> > In absence of such a GUC, users will have absolutely no way to switch
> > back to old behaviour. Will that be okay?

Since it will be possible to switch back to logical without a restart, I do
think that it could make sense to avoid a new GUC. Unless there is a use case
for keeping the wal level at logical (outside of the "logical decoding from
standby" context)?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



On Wed, Jun 25, 2025 at 12:20 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
>
> Hi,
>
> On Wed, Jun 25, 2025 at 09:15:04AM +0530, shveta malik wrote:
> > On Wed, Jun 25, 2025 at 9:12 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Tue, Jun 24, 2025 at 2:12 PM Bertrand Drouvot
> > > <bertranddrouvot.pg@gmail.com> wrote:
> > > >
> > > > Yeah, I think that sounds reasonable and that would avoid users to use
> > > > the slot created with immediately_reserve set to false by mistake.
> > > >
> > >
> > > +1.
> > > I think we do need to provide 'immediately_reserve' as a new argument
> > > now for logical slots creation. If the slot is a special one with a
> > > reserved name, it can internally be created with WALs not reserved for
> > > our purpose.
> > >
> >
> > One correction here.
> > I think we do NOT need to provide 'immediately_reserve' as a new
> > argument  now for logical slots creation. ...
>
> Agree.
>
> > > > Right...So not sure we need such a GUC. What about always behave with the
> > > > automatic behavior?
> > > >
> > >
> > > Does it make sense to provide a GUC which will have the default set to
> > > automatic but if the user is not interested or having some issues with
> > > new behaviour, he can switch off the GUC, making the new functions
> > > no-op as well?
> > > In absence of such a GUC, users will have absolutely no way to switch
> > > back to old behaviour. Will that be okay?
>
> Since it will be possible to switch back to logical without a restart I do
> think that it could make sense to avoid a new GUC. Unless there is a use case
> to keep the wal level to logical (outside of the "logical decoding from
> standby" context)?
>

I don’t currently see a specific use case for this, but I’m somewhat
inclined to include the GUC because it can serve as a safety
mechanism. If issues arise with the new behavior, the GUC allows users
to revert to manually controlling wal_level. The GUC would only manage
the automatic aspect of the feature, where slot creation and deletion
internally adjust wal_level. But I don’t have a strong preference and
am open to omitting the GUC if others believe it is unnecessary.

thanks
Shveta



On Thu, Jun 26, 2025 at 5:50 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, Jun 25, 2025 at 12:20 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > Hi,
> >
> > On Wed, Jun 25, 2025 at 09:15:04AM +0530, shveta malik wrote:
> > > On Wed, Jun 25, 2025 at 9:12 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > On Tue, Jun 24, 2025 at 2:12 PM Bertrand Drouvot
> > > > <bertranddrouvot.pg@gmail.com> wrote:
> > > > >
> > > > > Yeah, I think that sounds reasonable and that would avoid users to use
> > > > > the slot created with immediately_reserve set to false by mistake.
> > > > >
> > > >
> > > > +1.
> > > > I think we do need to provide 'immediately_reserve' as a new argument
> > > > now for logical slots creation. If the slot is a special one with a
> > > > reserved name, it can internally be created with WALs not reserved for
> > > > our purpose.
> > > >
> > >
> > > One correction here.
> > > I think we do NOT need to provide 'immediately_reserve' as a new
> > > argument  now for logical slots creation. ...
> >
> > Agree.

Agreed.

> >
> > > > > Right...So not sure we need such a GUC. What about always behave with the
> > > > > automatic behavior?
> > > > >
> > > >
> > > > Does it make sense to provide a GUC which will have the default set to
> > > > automatic but if the user is not interested or having some issues with
> > > > new behaviour, he can switch off the GUC, making the new functions
> > > > no-op as well?
> > > > In absence of such a GUC, users will have absolutely no way to switch
> > > > back to old behaviour. Will that be okay?
> >
> > Since it will be possible to switch back to logical without a restart I do
> > think that it could make sense to avoid a new GUC. Unless there is a use case
> > to keep the wal level to logical (outside of the "logical decoding from
> > standby" context)?
> >
>
> I don’t currently see a specific use case for this, but I’m somewhat
> inclined to include the GUC because it can serve as a safety
> mechanism. If issues arise with the new behavior, the GUC allows users
> to revert to manually controlling wal_level. The GUC would only manage
> the automatic aspect of the feature, where slot creation and deletion
> internally adjust wal_level. But I don’t have a strong preference and
> am open to omitting the GUC if others believe it is unnecessary.

I think the new SQL API to enable logical decoding would give users a
way to enable logical decoding for standbys without creating a slot.
With it, the user can enable or disable logical decoding by calling
the SQL function. Also, it does not replace the current usage (i.e.,
changing wal_level and restarting the server). The GUC parameter we're
discussing sounds like a way to keep only the current behavior, in
which users can enable or disable logical decoding only by restarting
the server. I'm not sure there are users who want to disable the new
behavior and use only the current behavior. I think we can focus on
the new API and automatic behavior at this stage. If we find that
there are concrete use cases, we can revisit this idea.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Mon, Jun 30, 2025 at 11:16 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Jun 26, 2025 at 5:50 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Wed, Jun 25, 2025 at 12:20 PM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > On Wed, Jun 25, 2025 at 09:15:04AM +0530, shveta malik wrote:
> > > > On Wed, Jun 25, 2025 at 9:12 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > > >
> > > > > On Tue, Jun 24, 2025 at 2:12 PM Bertrand Drouvot
> > > > > <bertranddrouvot.pg@gmail.com> wrote:
> > > > > >
> > > > > > Yeah, I think that sounds reasonable and that would avoid users to use
> > > > > > the slot created with immediately_reserve set to false by mistake.
> > > > > >
> > > > >
> > > > > +1.
> > > > > I think we do need to provide 'immediately_reserve' as a new argument
> > > > > now for logical slots creation. If the slot is a special one with a
> > > > > reserved name, it can internally be created with WALs not reserved for
> > > > > our purpose.
> > > > >
> > > >
> > > > One correction here.
> > > > I think we do NOT need to provide 'immediately_reserve' as a new
> > > > argument  now for logical slots creation. ...
> > >
> > > Agree.
>
> Agreed.
>
> > >
> > > > > > Right...So not sure we need such a GUC. What about always behave with the
> > > > > > automatic behavior?
> > > > > >
> > > > >
> > > > > Does it make sense to provide a GUC which will have the default set to
> > > > > automatic but if the user is not interested or having some issues with
> > > > > new behaviour, he can switch off the GUC, making the new functions
> > > > > no-op as well?
> > > > > In absence of such a GUC, users will have absolutely no way to switch
> > > > > back to old behaviour. Will that be okay?
> > >
> > > Since it will be possible to switch back to logical without a restart I do
> > > think that it could make sense to avoid a new GUC. Unless there is a use case
> > > to keep the wal level to logical (outside of the "logical decoding from
> > > standby" context)?
> > >
> >
> > I don’t currently see a specific use case for this, but I’m somewhat
> > inclined to include the GUC because it can serve as a safety
> > mechanism. If issues arise with the new behavior, the GUC allows users
> > to revert to manually controlling wal_level. The GUC would only manage
> > the automatic aspect of the feature, where slot creation and deletion
> > internally adjust wal_level. But I don’t have a strong preference and
> > am open to omitting the GUC if others believe it is unnecessary.
>
> I think the new SQL API to enable the logical decoding would provide a
> new way for users who want to enable the logical decoding for standbys
> without creating a slot. With that, the user can enable/disable the
> logical decoding by calling the SQL function. Also, it's not a
> replacement of the current usage (i.e., changing wal_level with
> restarting the server). The GUC parameter we're discussing sounds like
> a way to serve the current behavior that allows users to
> enable/disable the logical decoding only with restarting the server.
> I'm not sure if there are users who want to disable the new behavior
> and use only the current behavior. I think we can focus on the new API
> and automatic behavior at this stage. If we find out there are certain
> use cases, we can revisit this idea.
>

Okay, sounds good to me.

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Bertrand Drouvot
Date:
Hi,

On Mon, Jun 30, 2025 at 12:21:44PM +0530, shveta malik wrote:
> On Mon, Jun 30, 2025 at 11:16 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I'm not sure if there are users who want to disable the new behavior
> > and use only the current behavior. I think we can focus on the new API
> > and automatic behavior at this stage. If we find out there are certain
> > use cases, we can revisit this idea.
> >
> 
> Okay, sounds good to me.

Same here.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



On Wed, Jun 18, 2025 at 1:07 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, Jun 18, 2025 at 6:06 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Thank you for the comments!
> >
> > >
> > > 2)
> > > I see that when primary switches back its effective wal_level to
> > > replica while standby has wal_level=logical in conf file, then standby
> > > has this status:
> > >
> > > postgres=# show wal_level;
> > >  wal_level
> > > -----------
> > >  logical
> > >
> > > postgres=# show effective_wal_level;
> > >  effective_wal_level
> > > ---------------------
> > >  replica
> > >
> > > Is this correct? Can effective_wal_level be < wal_level anytime? I
> > > feel it can be greater but never lesser.
> >
> > Hmm, I think we need to define what value we should show in
> > effective_wal_level on standbys because the standbys actually are not
> > writing any WALs and whether or not the logical decoding is enabled on
> > the standbys depends on the primary.
> >
> > In the previous version patch, the standby's effective_wal_level value
> > depended solely on the standby's wal_level value. However, it was
> > confusing in a sense because it's possible that the logical decoding
> > could be available even though effective_wal_level is 'replica' if the
> > primary already enables it. One idea is that given that the logical
> > decoding availability and effective_wal_level value are independent in
> > principle, it's better to provide a SQL function to get the logical
> > decoding status so that users can check the logical decoding
> > availability without checking effective_wal_level. With that function,
> > it might make sense to revert back the behavior to the previous one.
> > That is, on the primary the effective_wal_level value is always
> > greater than or equal to wal_level whereas on the standbys it's always
> > the same as wal_level, and users would be able to check the logical
> > decoding availability using the SQL function. Or it might also be
> > worth considering to show effective_wal_level as NULL on standbys.
>
> Yes, that is one idea. It will resolve the confusion.
> But I was thinking, instead of having one new GUC + a SQL function,
> can we have a GUC alone, which shows logical_decoding status plus the
> cause of that. The new GUC will be applicable on both primary and
> standby. As an example, let's say we name it as
> logical_decoding_status, then it can have these values (
> <status>_<cause>):
>
> enabled_wal_level_logical:                                  valid both
> for primary, standby
> enabled_effective_wal_level_logical:                   valid only for primary
> enabled_cascaded_logical_decoding                   valid only for standby
> disabled :
>   valid both for primary, standby
>
> 'enabled_cascaded_logical_decoding'  will indicate that logical
> decoding is enabled on standby (even when its own wal_level=replica)
> as a cascaded effect from primary. It can be possible either due to
> primary's wal_level=logical or logical slot being present on primary.

I'm not sure it's a good idea to combine two values into one GUC,
because tools would have to parse the string whenever they want
either piece of information.
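
To illustrate the parsing burden, here is a minimal sketch of what a
client tool would need in order to split a combined "<status>_<cause>"
value such as the ones proposed above. The function name and the
splitting rule are assumptions made for illustration only; nothing like
this exists in the patch.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: split a combined "<status>_<cause>" string into
 * a boolean status and a pointer to the cause part. Returns false if
 * the value does not match the assumed format.
 */
static bool
parse_decoding_status(const char *value, bool *enabled, const char **cause)
{
    static const char prefix[] = "enabled_";
    size_t      prefix_len = sizeof(prefix) - 1;

    if (strcmp(value, "disabled") == 0)
    {
        *enabled = false;
        *cause = "";            /* no cause is attached to "disabled" */
        return true;
    }

    if (strncmp(value, prefix, prefix_len) == 0 && value[prefix_len] != '\0')
    {
        *enabled = true;
        *cause = value + prefix_len;    /* text after "enabled_" */
        return true;
    }

    return false;               /* unknown format */
}
```

With two separate values (for example a status and a cause exposed
independently), a tool could read either one directly without this kind
of string handling.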

As for the effective_wal_level shown on the standby, if it is meant to
show the effective WAL level, it might make sense to show 'replica'
even if the standby's wal_level is 'logical', because the standby
cannot write any WAL and needs to follow the primary. While it might
be worth accepting the case of effective_wal_level (replica) <
wal_level (logical) only on standbys, we need to keep the principle
that logical decoding is available only when
effective_wal_level = 'logical'.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Wed, Jul 2, 2025 at 9:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Jun 18, 2025 at 1:07 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Wed, Jun 18, 2025 at 6:06 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > Thank you for the comments!
> > >
> > > >
> > > > 2)
> > > > I see that when primary switches back its effective wal_level to
> > > > replica while standby has wal_level=logical in conf file, then standby
> > > > has this status:
> > > >
> > > > postgres=# show wal_level;
> > > >  wal_level
> > > > -----------
> > > >  logical
> > > >
> > > > postgres=# show effective_wal_level;
> > > >  effective_wal_level
> > > > ---------------------
> > > >  replica
> > > >
> > > > Is this correct? Can effective_wal_level be < wal_level anytime? I
> > > > feel it can be greater but never lesser.
> > >
> > > Hmm, I think we need to define what value we should show in
> > > effective_wal_level on standbys because the standbys actually are not
> > > writing any WALs and whether or not the logical decoding is enabled on
> > > the standbys depends on the primary.
> > >
> > > In the previous version patch, the standby's effective_wal_level value
> > > depended solely on the standby's wal_level value. However, it was
> > > confusing in a sense because it's possible that the logical decoding
> > > could be available even though effective_wal_level is 'replica' if the
> > > primary already enables it. One idea is that given that the logical
> > > decoding availability and effective_wal_level value are independent in
> > > principle, it's better to provide a SQL function to get the logical
> > > decoding status so that users can check the logical decoding
> > > availability without checking effective_wal_level. With that function,
> > > it might make sense to revert back the behavior to the previous one.
> > > That is, on the primary the effective_wal_level value is always
> > > greater than or equal to wal_level whereas on the standbys it's always
> > > the same as wal_level, and users would be able to check the logical
> > > decoding availability using the SQL function. Or it might also be
> > > worth considering to show effective_wal_level as NULL on standbys.
> >
> > Yes, that is one idea. It will resolve the confusion.
> > But I was thinking, instead of having one new GUC + a SQL function,
> > can we have a GUC alone, which shows logical_decoding status plus the
> > cause of that. The new GUC will be applicable on both primary and
> > standby. As an example, let's say we name it as
> > logical_decoding_status, then it can have these values (
> > <status>_<cause>):
> >
> > enabled_wal_level_logical:                                  valid both
> > for primary, standby
> > enabled_effective_wal_level_logical:                   valid only for primary
> > enabled_cascaded_logical_decoding                   valid only for standby
> > disabled :
> >   valid both for primary, standby
> >
> > 'enabled_cascaded_logical_decoding'  will indicate that logical
> > decoding is enabled on standby (even when its own wal_level=replica)
> > as a cascaded effect from primary. It can be possible either due to
> > primary's wal_level=logical or logical slot being present on primary.
>
> I'm not sure it's a good idea to combine two values into one GUC
> because the tools would have to parse the string in order to know when
> they want to know either information.

Okay. Agreed.

> As for the effective_wal_level shown on the standby, if it shows the
> effective WAL level it might make sense to show as 'replica' even if
> the standby's wal_level is 'logical'

Alright. It depends on the definition we choose to assign to
effective_wal_level.

> because the standby cannot write
> any WAL and need to follow the primary.

When the standby’s wal_level is set to 'logical', the requirement for
logical decoding is already fulfilled. Or do you mean that the
effective_wal_level on standby should not be shown as logical until
both the primary and standby have wal_level set to logical and we also
have a logical slot present on standby?

> While it might be worth
> considering to accept the case of effective_wal_level (replica) <
> wal_level (logical) only on the standbys, we need to keep the
> principle that the logical decoding is available only when
> effective_wal_level = 'logical'.
>

Back to the previous question: when will effective_wal_level be
displayed as 'logical' on the standby? Which criteria need to be met?

thanks
Shveta



On Thu, Jul 3, 2025 at 3:32 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, Jul 2, 2025 at 9:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Jun 18, 2025 at 1:07 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Wed, Jun 18, 2025 at 6:06 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > Thank you for the comments!
> > > >
> > > > >
> > > > > 2)
> > > > > I see that when primary switches back its effective wal_level to
> > > > > replica while standby has wal_level=logical in conf file, then standby
> > > > > has this status:
> > > > >
> > > > > postgres=# show wal_level;
> > > > >  wal_level
> > > > > -----------
> > > > >  logical
> > > > >
> > > > > postgres=# show effective_wal_level;
> > > > >  effective_wal_level
> > > > > ---------------------
> > > > >  replica
> > > > >
> > > > > Is this correct? Can effective_wal_level be < wal_level anytime? I
> > > > > feel it can be greater but never lesser.
> > > >
> > > > Hmm, I think we need to define what value we should show in
> > > > effective_wal_level on standbys because the standbys actually are not
> > > > writing any WALs and whether or not the logical decoding is enabled on
> > > > the standbys depends on the primary.
> > > >
> > > > In the previous version patch, the standby's effective_wal_level value
> > > > depended solely on the standby's wal_level value. However, it was
> > > > confusing in a sense because it's possible that the logical decoding
> > > > could be available even though effective_wal_level is 'replica' if the
> > > > primary already enables it. One idea is that given that the logical
> > > > decoding availability and effective_wal_level value are independent in
> > > > principle, it's better to provide a SQL function to get the logical
> > > > decoding status so that users can check the logical decoding
> > > > availability without checking effective_wal_level. With that function,
> > > > it might make sense to revert back the behavior to the previous one.
> > > > That is, on the primary the effective_wal_level value is always
> > > > greater than or equal to wal_level whereas on the standbys it's always
> > > > the same as wal_level, and users would be able to check the logical
> > > > decoding availability using the SQL function. Or it might also be
> > > > worth considering to show effective_wal_level as NULL on standbys.
> > >
> > > Yes, that is one idea. It will resolve the confusion.
> > > But I was thinking, instead of having one new GUC + a SQL function,
> > > can we have a GUC alone, which shows logical_decoding status plus the
> > > cause of that. The new GUC will be applicable on both primary and
> > > standby. As an example, let's say we name it as
> > > logical_decoding_status, then it can have these values (
> > > <status>_<cause>):
> > >
> > > enabled_wal_level_logical:                                  valid both
> > > for primary, standby
> > > enabled_effective_wal_level_logical:                   valid only for primary
> > > enabled_cascaded_logical_decoding                   valid only for standby
> > > disabled :
> > >   valid both for primary, standby
> > >
> > > 'enabled_cascaded_logical_decoding'  will indicate that logical
> > > decoding is enabled on standby (even when its own wal_level=replica)
> > > as a cascaded effect from primary. It can be possible either due to
> > > primary's wal_level=logical or logical slot being present on primary.
> >
> > I'm not sure it's a good idea to combine two values into one GUC
> > because the tools would have to parse the string in order to know when
> > they want to know either information.
>
> Okay. Agreed.
>
> > As for the effective_wal_level shown on the standby, if it shows the
> > effective WAL level it might make sense to show as 'replica' even if
> > the standby's wal_level is 'logical'
>
> Alright. It depends on the definition we choose to assign to
> effective_wal_level.
>
> > because the standby cannot write
> > any WAL and need to follow the primary.
>
> When the standby’s wal_level is set to 'logical', the requirement for
> logical decoding is already fulfilled. Or do you mean that the
> effective_wal_level on standby should not be shown as logical until
> both the primary and standby have wal_level set to logical and we also
> have a logical slot present on standby?

Even when the standby's wal_level is 'logical', logical decoding
cannot be used on the standby if the primary doesn't enable it.
Conversely, even if the standby's wal_level is 'replica', logical
decoding is available on the standby if the primary enables it.

>
> > While it might be worth
> > considering to accept the case of effective_wal_level (replica) <
> > wal_level (logical) only on the standbys, we need to keep the
> > principle that the logical decoding is available only when
> > effective_wal_level = 'logical'.
> >
>
> Back to the previous question, when will the effective_wal_level be
> displayed as 'logical' on standby? Which criterias need to be met?

The criterion is whether the primary enables logical decoding or
not. If the primary enables logical decoding by either setting
wal_level='logical' or creating a logical slot, the
effective_wal_level on the standby will be displayed as 'logical'.
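
As a rough illustration of this rule, it could be sketched as below.
This is only a model of the behavior described in this thread: the enum
and function names are hypothetical and are not taken from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the two WAL levels relevant here. */
typedef enum
{
    SKETCH_WAL_LEVEL_REPLICA,
    SKETCH_WAL_LEVEL_LOGICAL
} SketchWalLevel;

/*
 * The primary enables logical decoding either via wal_level = 'logical'
 * or by having at least one logical slot.
 */
static bool
primary_enables_logical_decoding(SketchWalLevel primary_wal_level,
                                 int primary_logical_slots)
{
    return primary_wal_level == SKETCH_WAL_LEVEL_LOGICAL ||
        primary_logical_slots > 0;
}

/*
 * On the standby, the displayed effective_wal_level follows the primary;
 * the standby's own wal_level setting is deliberately ignored.
 */
static SketchWalLevel
standby_effective_wal_level(SketchWalLevel standby_wal_level,
                            bool primary_enabled)
{
    (void) standby_wal_level;   /* does not influence the result */

    return primary_enabled ? SKETCH_WAL_LEVEL_LOGICAL
        : SKETCH_WAL_LEVEL_REPLICA;
}
```

Under this model, a standby with wal_level = 'logical' still shows
'replica' when the primary has neither wal_level = 'logical' nor a
logical slot, which is the effective_wal_level < wal_level case
discussed above.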

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Sun, Jul 6, 2025 at 8:33 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Jul 3, 2025 at 3:32 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Wed, Jul 2, 2025 at 9:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Jun 18, 2025 at 1:07 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > On Wed, Jun 18, 2025 at 6:06 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > Thank you for the comments!
> > > > >
> > > > > >
> > > > > > 2)
> > > > > > I see that when primary switches back its effective wal_level to
> > > > > > replica while standby has wal_level=logical in conf file, then standby
> > > > > > has this status:
> > > > > >
> > > > > > postgres=# show wal_level;
> > > > > >  wal_level
> > > > > > -----------
> > > > > >  logical
> > > > > >
> > > > > > postgres=# show effective_wal_level;
> > > > > >  effective_wal_level
> > > > > > ---------------------
> > > > > >  replica
> > > > > >
> > > > > > Is this correct? Can effective_wal_level be < wal_level anytime? I
> > > > > > feel it can be greater but never lesser.
> > > > >
> > > > > Hmm, I think we need to define what value we should show in
> > > > > effective_wal_level on standbys because the standbys actually are not
> > > > > writing any WALs and whether or not the logical decoding is enabled on
> > > > > the standbys depends on the primary.
> > > > >
> > > > > In the previous version patch, the standby's effective_wal_level value
> > > > > depended solely on the standby's wal_level value. However, it was
> > > > > confusing in a sense because it's possible that the logical decoding
> > > > > could be available even though effective_wal_level is 'replica' if the
> > > > > primary already enables it. One idea is that given that the logical
> > > > > decoding availability and effective_wal_level value are independent in
> > > > > principle, it's better to provide a SQL function to get the logical
> > > > > decoding status so that users can check the logical decoding
> > > > > availability without checking effective_wal_level. With that function,
> > > > > it might make sense to revert back the behavior to the previous one.
> > > > > That is, on the primary the effective_wal_level value is always
> > > > > greater than or equal to wal_level whereas on the standbys it's always
> > > > > the same as wal_level, and users would be able to check the logical
> > > > > decoding availability using the SQL function. Or it might also be
> > > > > worth considering to show effective_wal_level as NULL on standbys.
> > > >
> > > > Yes, that is one idea. It will resolve the confusion.
> > > > But I was thinking, instead of having one new GUC + a SQL function,
> > > > can we have a GUC alone, which shows logical_decoding status plus the
> > > > cause of that. The new GUC will be applicable on both primary and
> > > > standby. As an example, let's say we name it as
> > > > logical_decoding_status, then it can have these values (
> > > > <status>_<cause>):
> > > >
> > > > enabled_wal_level_logical:                                  valid both
> > > > for primary, standby
> > > > enabled_effective_wal_level_logical:                   valid only for primary
> > > > enabled_cascaded_logical_decoding                   valid only for standby
> > > > disabled :
> > > >   valid both for primary, standby
> > > >
> > > > 'enabled_cascaded_logical_decoding'  will indicate that logical
> > > > decoding is enabled on standby (even when its own wal_level=replica)
> > > > as a cascaded effect from primary. It can be possible either due to
> > > > primary's wal_level=logical or logical slot being present on primary.
> > >
> > > I'm not sure it's a good idea to combine two values into one GUC
> > > because the tools would have to parse the string in order to know when
> > > they want to know either information.
> >
> > Okay. Agreed.
> >
> > > As for the effective_wal_level shown on the standby, if it shows the
> > > effective WAL level it might make sense to show as 'replica' even if
> > > the standby's wal_level is 'logical'
> >
> > Alright. It depends on the definition we choose to assign to
> > effective_wal_level.
> >
> > > because the standby cannot write
> > > any WAL and need to follow the primary.
> >
> > When the standby’s wal_level is set to 'logical', the requirement for
> > logical decoding is already fulfilled. Or do you mean that the
> > effective_wal_level on standby should not be shown as logical until
> > both the primary and standby have wal_level set to logical and we also
> > have a logical slot present on standby?
>
> Even when the standby's wal_level is 'logical', the logical decoding
> cannot be used on the standby if the primary doesn't enable it. IOW,
> even if the standby's wal_level is 'replica', the logical decoding is
> available on the standby if the primary enables it.
>
> >
> > > While it might be worth
> > > considering to accept the case of effective_wal_level (replica) <
> > > wal_level (logical) only on the standbys, we need to keep the
> > > principle that the logical decoding is available only when
> > > effective_wal_level = 'logical'.
> > >
> >
> > Back to the previous question, when will the effective_wal_level be
> > displayed as 'logical' on standby? Which criterias need to be met?
>
> The criteria is whether the primary enables the logical decoding or
> not. If the primary enables the logical decoding by either setting
> wal_level='logical' or creating a logical slot, the
> effective_wal_level on the standby will be displayed as 'logical'.
>

Okay, I understand it now. Thanks for explaining. The proposed
behaviour looks reasonable.

thanks
Shveta



On Tue, Jul 8, 2025 at 7:41 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Sun, Jul 6, 2025 at 8:33 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Jul 3, 2025 at 3:32 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Wed, Jul 2, 2025 at 9:46 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Wed, Jun 18, 2025 at 1:07 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > > >
> > > > > On Wed, Jun 18, 2025 at 6:06 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > > >
> > > > > > Thank you for the comments!
> > > > > >
> > > > > > >
> > > > > > > 2)
> > > > > > > I see that when primary switches back its effective wal_level to
> > > > > > > replica while standby has wal_level=logical in conf file, then standby
> > > > > > > has this status:
> > > > > > >
> > > > > > > postgres=# show wal_level;
> > > > > > >  wal_level
> > > > > > > -----------
> > > > > > >  logical
> > > > > > >
> > > > > > > postgres=# show effective_wal_level;
> > > > > > >  effective_wal_level
> > > > > > > ---------------------
> > > > > > >  replica
> > > > > > >
> > > > > > > Is this correct? Can effective_wal_level be < wal_level anytime? I
> > > > > > > feel it can be greater but never lesser.
> > > > > >
> > > > > > Hmm, I think we need to define what value we should show in
> > > > > > effective_wal_level on standbys because the standbys actually are not
> > > > > > writing any WALs and whether or not the logical decoding is enabled on
> > > > > > the standbys depends on the primary.
> > > > > >
> > > > > > In the previous version patch, the standby's effective_wal_level value
> > > > > > depended solely on the standby's wal_level value. However, it was
> > > > > > confusing in a sense because it's possible that the logical decoding
> > > > > > could be available even though effective_wal_level is 'replica' if the
> > > > > > primary already enables it. One idea is that given that the logical
> > > > > > decoding availability and effective_wal_level value are independent in
> > > > > > principle, it's better to provide a SQL function to get the logical
> > > > > > decoding status so that users can check the logical decoding
> > > > > > availability without checking effective_wal_level. With that function,
> > > > > > it might make sense to revert back the behavior to the previous one.
> > > > > > That is, on the primary the effective_wal_level value is always
> > > > > > greater than or equal to wal_level whereas on the standbys it's always
> > > > > > the same as wal_level, and users would be able to check the logical
> > > > > > decoding availability using the SQL function. Or it might also be
> > > > > > worth considering to show effective_wal_level as NULL on standbys.
> > > > >
> > > > > Yes, that is one idea. It will resolve the confusion.
> > > > > But I was thinking, instead of having one new GUC + a SQL function,
> > > > > can we have a GUC alone, which shows logical_decoding status plus the
> > > > > cause of that. The new GUC will be applicable on both primary and
> > > > > standby. As an example, let's say we name it as
> > > > > logical_decoding_status, then it can have these values (
> > > > > <status>_<cause>):
> > > > >
> > > > > enabled_wal_level_logical:                                  valid both
> > > > > for primary, standby
> > > > > enabled_effective_wal_level_logical:                   valid only for primary
> > > > > enabled_cascaded_logical_decoding                   valid only for standby
> > > > > disabled :
> > > > >   valid both for primary, standby
> > > > >
> > > > > 'enabled_cascaded_logical_decoding'  will indicate that logical
> > > > > decoding is enabled on standby (even when its own wal_level=replica)
> > > > > as a cascaded effect from primary. It can be possible either due to
> > > > > primary's wal_level=logical or logical slot being present on primary.
> > > >
> > > > I'm not sure it's a good idea to combine two values into one GUC
> > > > because the tools would have to parse the string in order to know when
> > > > they want to know either information.
> > >
> > > Okay. Agreed.
> > >
> > > > As for the effective_wal_level shown on the standby, if it shows the
> > > > effective WAL level it might make sense to show as 'replica' even if
> > > > the standby's wal_level is 'logical'
> > >
> > > Alright. It depends on the definition we choose to assign to
> > > effective_wal_level.
> > >
> > > > because the standby cannot write
> > > > any WAL and need to follow the primary.
> > >
> > > When the standby’s wal_level is set to 'logical', the requirement for
> > > logical decoding is already fulfilled. Or do you mean that the
> > > effective_wal_level on standby should not be shown as logical until
> > > both the primary and standby have wal_level set to logical and we also
> > > have a logical slot present on standby?
> >
> > Even when the standby's wal_level is 'logical', the logical decoding
> > cannot be used on the standby if the primary doesn't enable it. IOW,
> > even if the standby's wal_level is 'replica', the logical decoding is
> > available on the standby if the primary enables it.
> >
> > >
> > > > While it might be worth
> > > > considering to accept the case of effective_wal_level (replica) <
> > > > wal_level (logical) only on the standbys, we need to keep the
> > > > principle that the logical decoding is available only when
> > > > effective_wal_level = 'logical'.
> > > >
> > >
> > > Back to the previous question, when will the effective_wal_level be
> > > displayed as 'logical' on standby? Which criteria need to be met?
> >
> > The criterion is whether the primary enables the logical decoding or
> > not. If the primary enables the logical decoding by either setting
> > wal_level='logical' or creating a logical slot, the
> > effective_wal_level on the standby will be displayed as 'logical'.
> >
>
> Okay, I understand it now. Thanks for explaining. The proposed
> behaviour looks reasonable.
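
To make the rule agreed on above concrete, here is a minimal sketch in plain C. It is not code from the patch: every name prefixed with Sketch/sketch_ is hypothetical, and only the 'replica' and 'logical' levels are modeled. It assumes, as discussed, that the standby's own wal_level does not influence what it reports.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the WAL levels discussed in this thread. */
typedef enum SketchWalLevel
{
    SKETCH_WAL_LEVEL_REPLICA,
    SKETCH_WAL_LEVEL_LOGICAL
} SketchWalLevel;

/*
 * The primary enables logical decoding if its wal_level is 'logical'
 * or if at least one logical slot exists on it.
 */
static bool
sketch_primary_enables_logical_decoding(SketchWalLevel primary_wal_level,
                                        int primary_logical_slots)
{
    return primary_wal_level == SKETCH_WAL_LEVEL_LOGICAL ||
           primary_logical_slots > 0;
}

/*
 * Value a standby would show as effective_wal_level.  The standby's own
 * wal_level is accepted only to make explicit that it is ignored.
 */
static SketchWalLevel
sketch_standby_effective_wal_level(SketchWalLevel standby_wal_level,
                                   SketchWalLevel primary_wal_level,
                                   int primary_logical_slots)
{
    (void) standby_wal_level;

    if (sketch_primary_enables_logical_decoding(primary_wal_level,
                                                primary_logical_slots))
        return SKETCH_WAL_LEVEL_LOGICAL;
    return SKETCH_WAL_LEVEL_REPLICA;
}
```

With this rule, a standby configured with wal_level = 'logical' still reports 'replica' while the primary has neither wal_level = 'logical' nor a logical slot.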

I've attached updated patches that implement the idea we've discussed.
The patches still need to be polished but the implemented ideas seem
good. Feedback is very welcome.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments
On Tue, Jul 15, 2025 at 10:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached updated patches that implement the idea we've discussed.
> The patches still need to be polished but the implemented ideas seem
> good. Feedback is very welcome.
>

Thank you for the patches. I have only tried patch 0001 so far; a few concerns:

1)
+ else if (xlrec.wal_level == WAL_LEVEL_REPLICA &&
+          pg_atomic_read_u32(&ReplicationSlotCtl->n_inuse_logical_slots) == 0)
+ {
+     /*
+      * Disable the logical decoding if there is no in-use logical slot
+      * on the standby.
+      */
+     UpdateLogicalDecodingStatus(false);
+ }

Due to the above logic, changing wal_level to replica on the primary may
end up disabling logical decoding on the standby, even if logical decoding
is still enabled on the primary because a logical slot exists there.

Steps:
a) Create a slot on primary, but no slots on standby.
b) Switch wal_level to logical on primary by doing a restart.
c) Now switch wal_level back to replica on primary. This will end up
disabling logical decoding on standby and slot creation will fail on
standby as well.
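
The steps above can be replayed against a toy model of the quoted branch. This is only a sketch in plain C under stated assumptions: the Sketch* types and functions are hypothetical, the standby's status is assumed to have been switched on when the primary created its slot, and nothing here is taken from the patch beyond the condition in the quoted hunk.

```c
#include <stdbool.h>

typedef enum SketchWalLevel
{
    SKETCH_WAL_LEVEL_REPLICA,
    SKETCH_WAL_LEVEL_LOGICAL
} SketchWalLevel;

/* Hypothetical, minimal view of the standby's state. */
typedef struct SketchStandby
{
    bool     logical_decoding_enabled;
    unsigned n_inuse_logical_slots;  /* logical slots on the standby itself */
} SketchStandby;

typedef struct SketchOutcome
{
    bool primary_decoding_enabled;
    bool standby_decoding_enabled;
} SketchOutcome;

/*
 * Models only the quoted branch: the primary reports wal_level = replica
 * and the standby has no in-use logical slot, so decoding is disabled.
 * Whether the primary still has a logical slot is never consulted.
 */
static void
sketch_redo_wal_level_change(SketchStandby *standby,
                             SketchWalLevel primary_wal_level)
{
    if (primary_wal_level == SKETCH_WAL_LEVEL_REPLICA &&
        standby->n_inuse_logical_slots == 0)
        standby->logical_decoding_enabled = false;
}

/* Replays steps (a) to (c) and reports the final state on both nodes. */
static SketchOutcome
sketch_replay_reported_steps(void)
{
    SketchStandby  standby = {false, 0};
    SketchWalLevel primary_wal_level = SKETCH_WAL_LEVEL_REPLICA;
    int            primary_logical_slots = 0;
    SketchOutcome  outcome;

    /* (a) slot created on the primary; assume the standby follows */
    primary_logical_slots = 1;
    standby.logical_decoding_enabled = true;

    /* (b) primary restarted with wal_level = logical */
    primary_wal_level = SKETCH_WAL_LEVEL_LOGICAL;
    sketch_redo_wal_level_change(&standby, primary_wal_level);

    /* (c) primary restarted with wal_level = replica; slot still exists */
    primary_wal_level = SKETCH_WAL_LEVEL_REPLICA;
    sketch_redo_wal_level_change(&standby, primary_wal_level);

    outcome.primary_decoding_enabled =
        primary_wal_level == SKETCH_WAL_LEVEL_LOGICAL ||
        primary_logical_slots > 0;
    outcome.standby_decoding_enabled = standby.logical_decoding_enabled;
    return outcome;
}
```

In this model the primary ends with logical decoding still enabled (its slot exists) while the standby ends with it disabled, which is the mismatch described in concern 1.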

2)
In the same code, why don't we invalidate slots as we do when we
receive XLOG_LOGICAL_DECODING_STATUS_CHANGE?

3)
+ EnsureLogicalDecodingEnabled();

I do not understand the usage of above in synchronize_one_slot().
Since 'EnsureLogicalDecodingEnabled' is a no-op for standby, it will
do nothing here.

4)
- if (wal_level < WAL_LEVEL_LOGICAL)
-     ereport(ERROR,
-             errcode(ERRCODE_INVALID_PARAMETER_VALUE),
-             errmsg("replication slot synchronization requires \"wal_level\" >= \"logical\""));
+ if (!IsLogicalDecodingEnabled())
+     ereport(elevel,
+             errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+

Is the change from 'ERROR' to 'elevel' intentional? With this change,
the slotsync worker will keep running even if logical decoding is not
yet enabled on the standby (or primary).


thanks
Shveta



On Wed, Jun 18, 2025 at 3:23 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, Jun 18, 2025 at 2:39 PM Bertrand Drouvot
> <bertranddrouvot.pg@gmail.com> wrote:
> >
> > Hi,
> >
> > On Tue, Jun 10, 2025 at 02:00:55PM -0700, Masahiko Sawada wrote:
> > > > > > > > 0001 patch allows us to create a logical slot without WAL reservation.
> >
> > Thanks for the patch and sorry to be late in this conversation.
> >
> > The thing that worry me a bit with this is that that could be easy to attempt
> > to use the slot "by mistake" and then (as a consequence) trigger WAL reservation
> > by mistake on the primary. I think that this mistake is more likely to happen
> > with a logical slot as compared to a physical slot.
> >
>
> Yes, agreed. Another concern is the possibility of someone
> intentionally using it and associating it with a subscription. If the
> subscription is later dropped, it could also cause the slot to be
> removed.
>

IIUC, your main concern here is that the last slot on
primary/publisher could be dropped unintentionally by the user leading
to invalidating the logical slots on any physical-standby connected
with publisher, right?

What if we make DROP SUBSCRIPTION fail if it can lead to removal of
the last slot on publisher and allow DROP to succeed when the
subscription's drop_slot_force (a new subscription option) is set?
Users would still be able to drop the subscription if they first
disassociate it from the slot using the method explained
in the docs [1] (see the Notes section). Similarly, when a user tries to
drop the last logical slot via pg_drop_replication_slot, we will allow
it only with the force option. This should ensure that the user is
aware of the consequences of dropping the last slot.

--
With Regards,
Amit Kapila.



On Thu, Jul 17, 2025 at 3:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jun 18, 2025 at 3:23 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Wed, Jun 18, 2025 at 2:39 PM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > On Tue, Jun 10, 2025 at 02:00:55PM -0700, Masahiko Sawada wrote:
> > > > > > > > > 0001 patch allows us to create a logical slot without WAL reservation.
> > >
> > > Thanks for the patch and sorry to be late in this conversation.
> > >
> > > The thing that worry me a bit with this is that that could be easy to attempt
> > > to use the slot "by mistake" and then (as a consequence) trigger WAL reservation
> > > by mistake on the primary. I think that this mistake is more likely to happen
> > > with a logical slot as compared to a physical slot.
> > >
> >
> > Yes, agreed. Another concern is the possibility of someone
> > intentionally using it and associating it with a subscription. If the
> > subscription is later dropped, it could also cause the slot to be
> > removed.
> >
>
> IIUC, your main concern here is that the last slot on
> primary/publisher could be dropped unintentionally by the user leading
> to invalidating the logical slots on any physical-standby connected
> with publisher, right?
>
> What if we make DROP SUBSCRIPTION fail if it can lead to removal of
> the last slot on publisher and allow DROP to succeed when the
> subscription's drop_slot_force (a new subscription option) is set?
> Now, users can still be allowed to Drop the subscription, if it
> disassociates the subscription from the slot by using method explained
> in docs [1] (See Notes section).
>

Forgot to specify link. Done now [1].

[1] - https://www.postgresql.org/docs/devel/sql-dropsubscription.html

--
With Regards,
Amit Kapila.



On Tue, Jul 15, 2025 at 10:55 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Jul 15, 2025 at 10:37 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached updated patches that implement the idea we've discussed.
> > The patches still need to be polished but the implemented ideas seem
> > good. Feedback is very welcome.
> >
>
> Thank you for the patches. I have only tried patch 0001 so far; a few concerns:
>
> 1)
> + else if (xlrec.wal_level == WAL_LEVEL_REPLICA &&
> +          pg_atomic_read_u32(&ReplicationSlotCtl->n_inuse_logical_slots) == 0)
> + {
> +     /*
> +      * Disable the logical decoding if there is no in-use logical slot
> +      * on the standby.
> +      */
> +     UpdateLogicalDecodingStatus(false);
> + }
>
> Due to above logic, the change in wal_level to replica on primary may
> end up disabling logical decoding on standby, even if logical decoding
> is still enabled on primary due to existence of slot.
>
> Steps:
> a) Create a slot on primary, but no slots on standby.
> b) Switch wal_level to logical on primary by doing a restart.
> c) Now switch wal_level back to replica on primary. This will end up
> disabling logical decoding on standby and slot creation will fail on
> standby as well.
>
> 2)
> In the same code, why don't we invalidate slots as we do when we
> receive XLOG_LOGICAL_DECODING_STATUS_CHANGE?

Good catch. I think decreasing wal_level to 'replica' should not
directly involve logical decoding status.

Related to this issue, I've considered the possibility of getting rid
of 'logical' from wal_level. Given the effective WAL level is
increased and decreased automatically upon the slot creation and
deletion, I think we would be able to get rid of 'logical' from
wal_level. One scenario where users would need to take additional
action is that users offload logical replication to the standby
server. In this case, the user would have to enable the logical
decoding on the primary server before creating a logical slot on the
standby. If such additional work is acceptable, we can remove it, and
I think it would be reasonable.


> 3)
> + EnsureLogicalDecodingEnabled();
>
> I do not understand the usage of above in synchronize_one_slot().
> Since 'EnsureLogicalDecodingEnabled' is a no-op for standby, it will
> do nothing here.

Right, will remove it.

>
> 4)
> - if (wal_level < WAL_LEVEL_LOGICAL)
> -     ereport(ERROR,
> -             errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> -             errmsg("replication slot synchronization requires \"wal_level\" >= \"logical\""));
> + if (!IsLogicalDecodingEnabled())
> +     ereport(elevel,
> +             errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> +
>
> Is the change from 'ERROR' to  'elevel' intentional? With this change,
> slotsync worker will keep running even if logical decoding is not
> enabled on standby (or primary) yet.

Yes, this is because this function is called by the postmaster. But I
can see your point so I will deal with it in the next version patch. I
think the slotsync worker needs to exit when the logical decoding gets
disabled.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Thu, Jul 17, 2025 at 2:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> What if we make DROP SUBSCRIPTION fail if it can lead to removal of
> the last slot on publisher and allow DROP to succeed when the
> subscription's drop_slot_force (a new subscription option) is set?
> Now, users can still be allowed to Drop the subscription, if it
> disassociates the subscription from the slot by using method explained
> in docs [1] (See Notes section). Similarly when a user is trying to
> drop the last logical slot via pg_drop_replication_slot, we will allow
> it only with the force option. This should ensure that the user is
> aware of the consequences of dropping the last slot.

I think even if we prevent the last logical slot that was used
for logical replication from being removed, the primary would continue
to accumulate WAL for that slot. I think it actually doesn't need to
hold WAL and dead catalog tuples in order to keep the effective WAL
level 'logical'. So it's probably better to 'reset' the slot, meaning
to clear the last slot's restart_lsn and catalog_xmin? Using such an
option would work in the interactive case, but I'm not sure how it would
work in tools like shell scripts.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Fri, Jul 18, 2025 at 12:26 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Jul 17, 2025 at 2:36 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > What if we make DROP SUBSCRIPTION fail if it can lead to removal of
> > the last slot on publisher and allow DROP to succeed when the
> > subscription's drop_slot_force (a new subscription option) is set?
> > Now, users can still be allowed to Drop the subscription, if it
> > disassociates the subscription from the slot by using method explained
> > in docs [1] (See Notes section). Similarly when a user is trying to
> > drop the last logical slot via pg_drop_replication_slot, we will allow
> > it only with the force option. This should ensure that the user is
> > aware of the consequences of dropping the last slot.
>
> I think even if we prevent the last logical slot that was used
> for logical replication from being removed, the primary would continue
> to accumulate WAL for that slot. I think it actually doesn't need to
> hold WAL and dead catalog tuples in order to keep the effective WAL
> level 'logical'. So it's probably better to 'reset' the slot, meaning
> to clear the last slot's restart_lsn and catalog_xmin? Using such an
> option would work in the interactive case, but I'm not sure how it would
> work in tools like shell scripts.
>

We can reset the slot's properties like catalog_xmin or restart_lsn at
the time of DROP SUBSCRIPTION even if it is the last logical slot on
publisher. We can probably use ALTER_REPLICATION_SLOT to do it. If we
want to do this, the steps during DROP SUBSCRIPTION would be to check
if the slot to be dropped is the last logical slot, if so, then we
will simply reset some of its elements, otherwise drop it. If we can't
drop it, then we can give a WARNING/NOTICE to the user. If a user has
set a new option like force_drop_slot then we will drop the slot
irrespective of whether it is a last slot or not. We can additionally
check wal_level on publisher as well before deciding whether to drop
the slot or not. If the wal_level is logical then we can probably drop
the slot.
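
The DROP SUBSCRIPTION flow described above could be sketched roughly as follows. This is plain C for illustration only, not patch code: the Sketch* names are hypothetical, force_drop_slot is the option proposed in this thread (it does not exist today), and 0 stands in for an invalid restart_lsn and catalog_xmin.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum SketchSlotAction
{
    SKETCH_DROP_SLOT,   /* remove the slot on the publisher */
    SKETCH_RESET_SLOT   /* keep the slot but release what it holds back */
} SketchSlotAction;

/* Hypothetical, reduced view of a logical slot. */
typedef struct SketchSlot
{
    uint64_t restart_lsn;   /* 0 means no WAL is reserved */
    uint32_t catalog_xmin;  /* 0 means no catalog tuples are held back */
} SketchSlot;

/*
 * Decide what DROP SUBSCRIPTION does with the publisher-side slot:
 * the proposed force option and wal_level = logical both permit a drop;
 * otherwise the last logical slot is only reset.
 */
static SketchSlotAction
sketch_choose_slot_action(bool is_last_logical_slot,
                          bool force_drop_slot,
                          bool publisher_wal_level_is_logical)
{
    if (force_drop_slot || publisher_wal_level_is_logical)
        return SKETCH_DROP_SLOT;
    if (is_last_logical_slot)
        return SKETCH_RESET_SLOT;
    return SKETCH_DROP_SLOT;
}

/* Clear the fields that make the slot retain WAL and dead catalog tuples. */
static void
sketch_reset_slot(SketchSlot *slot)
{
    slot->restart_lsn = 0;
    slot->catalog_xmin = 0;
}
```

A reset slot would keep the effective WAL level at 'logical' without retaining WAL or dead catalog tuples. The sketch leaves out where the WARNING/NOTICE would be raised.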

--
With Regards,
Amit Kapila.



On Thu, Jul 17, 2025 at 3:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Jun 18, 2025 at 3:23 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Wed, Jun 18, 2025 at 2:39 PM Bertrand Drouvot
> > <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > On Tue, Jun 10, 2025 at 02:00:55PM -0700, Masahiko Sawada wrote:
> > > > > > > > > 0001 patch allows us to create a logical slot without WAL reservation.
> > >
> > > Thanks for the patch and sorry to be late in this conversation.
> > >
> > > The thing that worry me a bit with this is that that could be easy to attempt
> > > to use the slot "by mistake" and then (as a consequence) trigger WAL reservation
> > > by mistake on the primary. I think that this mistake is more likely to happen
> > > with a logical slot as compared to a physical slot.
> > >
> >
> > Yes, agreed. Another concern is the possibility of someone
> > intentionally using it and associating it with a subscription. If the
> > subscription is later dropped, it could also cause the slot to be
> > removed.
> >
>
> IIUC, your main concern here is that the last slot on
> primary/publisher could be dropped unintentionally by the user leading
> to invalidating the logical slots on any physical-standby connected
> with publisher, right?
>

Right.

> What if we make DROP SUBSCRIPTION fail if it can lead to removal of
> the last slot on publisher and allow DROP to succeed when the
> subscription's drop_slot_force (a new subscription option) is set?
> Now, users can still be allowed to Drop the subscription, if it
> disassociates the subscription from the slot by using method explained
> in docs [1] (See Notes section). Similarly when a user is trying to
> drop the last logical slot via pg_drop_replication_slot, we will allow
> it only with the force option. This should ensure that the user is
> aware of the consequences of dropping the last slot.
>

One concern I have is regarding the default setting of
'force_slot_drop'. I assume the default value of this new DROP-SUB
argument will be 'false' to prevent customers from inadvertently
dropping the last slot on the publisher. But would this be
acceptable, considering that users may have DROP SUBSCRIPTION commands
in their scripts which would suddenly stop dropping the slot? OTOH, if
we keep the default as 'true', users might overlook changing it to
'false' when dropping the last subscription, which would defeat the
purpose.

thanks
Shveta



On Fri, Jul 18, 2025 at 2:27 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Thu, Jul 17, 2025 at 3:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Wed, Jun 18, 2025 at 3:23 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Wed, Jun 18, 2025 at 2:39 PM Bertrand Drouvot
> > > <bertranddrouvot.pg@gmail.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Tue, Jun 10, 2025 at 02:00:55PM -0700, Masahiko Sawada wrote:
> > > > > > > > > > 0001 patch allows us to create a logical slot without WAL reservation.
> > > >
> > > > Thanks for the patch and sorry to be late in this conversation.
> > > >
> > > > The thing that worry me a bit with this is that that could be easy to attempt
> > > > to use the slot "by mistake" and then (as a consequence) trigger WAL reservation
> > > > by mistake on the primary. I think that this mistake is more likely to happen
> > > > with a logical slot as compared to a physical slot.
> > > >
> > >
> > > Yes, agreed. Another concern is the possibility of someone
> > > intentionally using it and associating it with a subscription. If the
> > > subscription is later dropped, it could also cause the slot to be
> > > removed.
> > >
> >
> > IIUC, your main concern here is that the last slot on
> > primary/publisher could be dropped unintentionally by the user leading
> > to invalidating the logical slots on any physical-standby connected
> > with publisher, right?
> >
>
> Right.
>
> > What if we make DROP SUBSCRIPTION fail if it can lead to removal of
> > the last slot on publisher and allow DROP to succeed when the
> > subscription's drop_slot_force (a new subscription option) is set?
> > Now, users can still be allowed to Drop the subscription, if it
> > disassociates the subscription from the slot by using method explained
> > in docs [1] (See Notes section). Similarly when a user is trying to
> > drop the last logical slot via pg_drop_replication_slot, we will allow
> > it only with the force option. This should ensure that the user is
> > aware of the consequences of dropping the last slot.
> >
>
> One concern I have is regarding the default setting of
> 'force_slot_drop' . I assume the default value of this new DROP-SUB
> argument will be 'false' to prevent customers from inadvertently
> dropping the last slot on the publisher. But, would this be
> acceptable, considering that users may have DROP-SUBSCRIPTION commands
> in their scripts which would suddenly stop dropping slot now?
>

That would only happen when users rely on this new idea of raising
wal_level to 'logical' on the fly. I think users with existing
pub-sub setups would already have wal_level set to 'logical'
on the publisher.

--
With Regards,
Amit Kapila.



On Fri, Jul 18, 2025 at 3:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Jul 18, 2025 at 2:27 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Thu, Jul 17, 2025 at 3:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Wed, Jun 18, 2025 at 3:23 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > On Wed, Jun 18, 2025 at 2:39 PM Bertrand Drouvot
> > > > <bertranddrouvot.pg@gmail.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > On Tue, Jun 10, 2025 at 02:00:55PM -0700, Masahiko Sawada wrote:
> > > > > > > > > > > 0001 patch allows us to create a logical slot without WAL reservation.
> > > > >
> > > > > Thanks for the patch and sorry to be late in this conversation.
> > > > >
> > > > > The thing that worry me a bit with this is that that could be easy to attempt
> > > > > to use the slot "by mistake" and then (as a consequence) trigger WAL reservation
> > > > > by mistake on the primary. I think that this mistake is more likely to happen
> > > > > with a logical slot as compared to a physical slot.
> > > > >
> > > >
> > > > Yes, agreed. Another concern is the possibility of someone
> > > > intentionally using it and associating it with a subscription. If the
> > > > subscription is later dropped, it could also cause the slot to be
> > > > removed.
> > > >
> > >
> > > IIUC, your main concern here is that the last slot on
> > > primary/publisher could be dropped unintentionally by the user leading
> > > to invalidating the logical slots on any physical-standby connected
> > > with publisher, right?
> > >
> >
> > Right.
> >
> > > What if we make DROP SUBSCRIPTION fail if it can lead to removal of
> > > the last slot on publisher and allow DROP to succeed when the
> > > subscription's drop_slot_force (a new subscription option) is set?
> > > Now, users can still be allowed to Drop the subscription, if it
> > > disassociates the subscription from the slot by using method explained
> > > in docs [1] (See Notes section). Similarly when a user is trying to
> > > drop the last logical slot via pg_drop_replication_slot, we will allow
> > > it only with the force option. This should ensure that the user is
> > > aware of the consequences of dropping the last slot.
> > >
> >
> > One concern I have is regarding the default setting of
> > 'force_slot_drop' . I assume the default value of this new DROP-SUB
> > argument will be 'false' to prevent customers from inadvertently
> > dropping the last slot on the publisher. But, would this be
> > acceptable, considering that users may have DROP-SUBSCRIPTION commands
> > in their scripts which would suddenly stop dropping slot now?
> >
>
> That would only happen when users use this new idea of enabling
> wal_level to 'logical' on the fly. I think the users having existing
> setups with pub-sub would have kept the default wal_level to 'logical'
> on publisher.
>

Okay, but then we will have to avoid doing the enhancement of getting
rid of wal_level='logical' as suggested in [1].

Even if we do so, I am not very convinced about this argument and its value.
-- The value of 'force_slot_drop' will carry meaning only in a
conditional scenario. Assuming the default is false, it will still
drop slots unless the slot is the last one and wal_level < logical on
the primary. This behavior can seem a bit unintuitive or confusing from
the user's perspective.
-- If the user is actually trying to retain the slot by giving
force_slot_drop=false, then how are we going to track that, i.e.
distinguish it from the default?

Bertrand has proposed a similar design in [2]. We can revisit that as well.

I'm continuing to think it through and will share any further thoughts
if something comes to mind.

[1]: https://www.postgresql.org/message-id/CAD21AoD5aONyxZHGG5-gQhQnAMuF9dByLn0%2BtreF8cRT06bqkA%40mail.gmail.com
[2]: https://www.postgresql.org/message-id/aFPRqXR41xOhE597%40ip-10-97-1-34.eu-west-3.compute.internal

thanks
Shveta



On Mon, Jul 21, 2025 at 10:48 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Jul 18, 2025 at 3:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Jul 18, 2025 at 2:27 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Thu, Jul 17, 2025 at 3:06 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > On Wed, Jun 18, 2025 at 3:23 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > > >
> > > > > On Wed, Jun 18, 2025 at 2:39 PM Bertrand Drouvot
> > > > > <bertranddrouvot.pg@gmail.com> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Tue, Jun 10, 2025 at 02:00:55PM -0700, Masahiko Sawada wrote:
> > > > > > > > > > > > 0001 patch allows us to create a logical slot without WAL reservation.
> > > > > >
> > > > > > Thanks for the patch and sorry to be late in this conversation.
> > > > > >
> > > > > > The thing that worry me a bit with this is that that could be easy to attempt
> > > > > > to use the slot "by mistake" and then (as a consequence) trigger WAL reservation
> > > > > > by mistake on the primary. I think that this mistake is more likely to happen
> > > > > > with a logical slot as compared to a physical slot.
> > > > > >
> > > > >
> > > > > Yes, agreed. Another concern is the possibility of someone
> > > > > intentionally using it and associating it with a subscription. If the
> > > > > subscription is later dropped, it could also cause the slot to be
> > > > > removed.
> > > > >
> > > >
> > > > IIUC, your main concern here is that the last slot on
> > > > primary/publisher could be dropped unintentionally by the user leading
> > > > to invalidating the logical slots on any physical-standby connected
> > > > with publisher, right?
> > > >
> > >
> > > Right.
> > >
> > > > What if we make DROP SUBSCRIPTION fail if it can lead to removal of
> > > > the last slot on publisher and allow DROP to succeed when the
> > > > subscription's drop_slot_force (a new subscription option) is set?
> > > > Now, users can still be allowed to Drop the subscription, if it
> > > > disassociates the subscription from the slot by using method explained
> > > > in docs [1] (See Notes section). Similarly when a user is trying to
> > > > drop the last logical slot via pg_drop_replication_slot, we will allow
> > > > it only with the force option. This should ensure that the user is
> > > > aware of the consequences of dropping the last slot.
> > > >
> > >
> > > One concern I have is regarding the default setting of
> > > 'force_slot_drop' . I assume the default value of this new DROP-SUB
> > > argument will be 'false' to prevent customers from inadvertently
> > > dropping the last slot on the publisher. But, would this be
> > > acceptable, considering that users may have DROP-SUBSCRIPTION commands
> > > in their scripts which would suddenly stop dropping slot now?
> > >
> >
> > That would only happen when users use this new idea of enabling
> > wal_level to 'logical' on the fly. I think the users having existing
> > setups with pub-sub would have kept the default wal_level to 'logical'
> > on publisher.
> >
>
> Okay, but then we will have to avoid doing the enhancement of getting
> rid of wal_level='logical' as suggested in [1].
>
> Even if we do so, I am not very much convinced for this argument and its value.
> --The value of ''force_slot_drop" will hold its meaning only in a
> conditional scenario. Assuming default is false,  then it will still
> drop the slots until it is last slot and wal_level < logical on
> primary. This behavior can seem a bit unintuitive or confusing from
> the user's perspective.
> --If the user is trying to actually retain the slot by giving
> force_slot_drop=false , then how are we going to track that i.e.
> distinguish from its default.
>
> Bertrand has proposed a similar design in [2]. We can revisit that as well once.
>
> I'm continuing to think it through and will share any further thoughts
> if something comes to mind.
>

How about a parameter named 'on_last_logical_slot' with possible
values: 'error', 'warn', 'drop', 'retain'?
Alternatively, we could use 'last_logical_slot_drop_policy' with
values: 'error', 'warn', 'allow'.

These parameters could be supported by both DROP SUBSCRIPTION and
pg_drop_replication_slot(), or alternatively implemented as a GUC on
the primary server. The default value should be either 'warn' or,
preferably, 'error' for safer behavior.

It seems more logical to me for this to be a GUC on the primary since
it falls within the primary’s scope.

With this, when the user attempts to drop a subscription associated
with the last slot on the primary, and the GUC is set to 'error', then
the DROP SUBSCRIPTION command should fail with a message like:

ERROR: cannot drop last logical slot; logical decoding would be disabled on primary.
HINT:  Disassociate subscription from the slot by executing ALTER SUBSCRIPTION ... SET (slot_name = NONE)

And if the user tries to do pg_drop_replication_slot() on primary and
if it is the last logical slot, then error should be:

ERROR: cannot drop last logical slot; logical decoding would be disabled
HINT: Set last_logical_slot_drop_policy = 'allow' to override.

Thoughts?
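
For illustration, the three-valued 'last_logical_slot_drop_policy' variant could behave like the sketch below. It is plain C with hypothetical Sketch* names, not an existing GUC or API, and it leaves open whether the check should be skipped when wal_level = logical on the primary.

```c
#include <stdbool.h>

/* Hypothetical values of the proposed policy parameter. */
typedef enum SketchLastSlotPolicy
{
    SKETCH_POLICY_ERROR,
    SKETCH_POLICY_WARN,
    SKETCH_POLICY_ALLOW
} SketchLastSlotPolicy;

typedef enum SketchDropResult
{
    SKETCH_DROP_REFUSED,       /* corresponds to raising the ERROR above */
    SKETCH_DROP_WITH_WARNING,  /* slot is dropped, user gets a WARNING */
    SKETCH_DROP_SILENTLY       /* slot is dropped without any message */
} SketchDropResult;

/*
 * Decide how a slot drop is handled.  The policy only matters when the
 * slot being dropped is the last logical slot on the primary.
 */
static SketchDropResult
sketch_check_slot_drop(bool is_last_logical_slot,
                       SketchLastSlotPolicy policy)
{
    if (!is_last_logical_slot)
        return SKETCH_DROP_SILENTLY;

    switch (policy)
    {
        case SKETCH_POLICY_ERROR:
            return SKETCH_DROP_REFUSED;
        case SKETCH_POLICY_WARN:
            return SKETCH_DROP_WITH_WARNING;
        case SKETCH_POLICY_ALLOW:
            return SKETCH_DROP_SILENTLY;
    }

    /* Unknown value: fall back to the safest behaviour. */
    return SKETCH_DROP_REFUSED;
}
```

Both DROP SUBSCRIPTION and pg_drop_replication_slot() could consult the same check, differing only in the HINT they attach.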

thanks
Shveta



On Mon, Jul 21, 2025 at 10:48 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Jul 18, 2025 at 3:03 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > >
> > > One concern I have is regarding the default setting of
> > > 'force_slot_drop' . I assume the default value of this new DROP-SUB
> > > argument will be 'false' to prevent customers from inadvertently
> > > dropping the last slot on the publisher. But, would this be
> > > acceptable, considering that users may have DROP-SUBSCRIPTION commands
> > > in their scripts which would suddenly stop dropping slot now?
> > >
> >
> > That would only happen when users use this new idea of enabling
> > wal_level to 'logical' on the fly. I think the users having existing
> > setups with pub-sub would have kept the default wal_level to 'logical'
> > on publisher.
> >
>
> Okay, but then we will have to avoid doing the enhancement of getting
> rid of wal_level='logical' as suggested in [1].
>
> Even if we do so, I am not very much convinced for this argument and its value.
> --The value of ''force_slot_drop" will hold its meaning only in a
> conditional scenario. Assuming default is false,  then it will still
> drop the slots until it is last slot and wal_level < logical on
> primary. This behavior can seem a bit unintuitive or confusing from
> the user's perspective.
> --If the user is trying to actually retain the slot by giving
> force_slot_drop=false , then how are we going to track that i.e.
> distinguish from its default.
>
> Bertrand has proposed a similar design in [2]. We can revisit that as well once.
>

I am slightly hesitant to introduce multiple ways to enable logical
decoding/replication unless that is the only path as giving multiple
options to achieve the same thing can confuse users as to which one is
preferable and pros/cons of each.

--
With Regards,
Amit Kapila.



On Mon, Jul 21, 2025 at 11:24 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Mon, Jul 21, 2025 at 10:48 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > I'm continuing to think it through and will share any further thoughts
> > if something comes to mind.
> >
>
> How about a parameter named 'on_last_logical_slot' with possible
> values: 'error', 'warn', 'drop', 'retain'?
> Alternatively, we could use 'last_logical_slot_drop_policy' with
> values: 'error', 'warn', 'allow'.
>
> These parameters could be supported by both DROP SUBSCRIPTION and
> pg_drop_replication_slot(), or alternatively implemented as a GUC on
> the primary server. The default value should be either 'warn' or,
> preferably, 'error' for safer behavior.
>
> It seems more logical to me for this to be a GUC on the primary since
> it falls within the primary’s scope.
>

This is worth considering. OTOH, it is possible that we are overly
worried about users accidentally dropping the slot required for
continuing the logical decoding on the physical standby. In the
publisher-subscriber model, it seems quite intuitive that as soon as
the first subscription is created, we enable logical decoding on the
primary and when the last subscription is dropped, the logical
decoding on the publisher gets disabled. The case we are worrying
about is that of users who enable logical decoding/replication on a
physical standby based on the presence of a logical slot on the
primary. I think if we document clearly that it is the
responsibility of users to either (a) keep wal_level as
logical on the primary, or (b) preserve a slot on the primary, it should
be sufficient. There could be multiple ways to preserve the slot: one
is that users always create a special slot on the primary for this
purpose, or we can provide a slot_option which users can specify/alter
so that they get an ERROR/WARNING when the last such slot is dropped. I
feel we should choose the simplest option and rely on users to use the
feature appropriately. We can always enhance the feature in future
versions based on feedback from the field.

--
With Regards,
Amit Kapila.



On Mon, Jul 21, 2025 at 12:23 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> I am slightly hesitant to introduce multiple ways to enable logical
> decoding/replication unless that is the only path as giving multiple
> options to achieve the same thing can confuse users as to which one is
> preferable and pros/cons of each.

Okay I understand your concern and it is a valid one.

> >
> > How about a parameter named 'on_last_logical_slot' with possible
> > values: 'error', 'warn', 'drop', 'retain'?
> > Alternatively, we could use 'last_logical_slot_drop_policy' with
> > values: 'error', 'warn', 'allow'.
> >
> > These parameters could be supported by both DROP SUBSCRIPTION and
> > pg_drop_replication_slot(), or alternatively implemented as a GUC on
> > the primary server. The default value should be either 'warn' or,
> > preferably, 'error' for safer behavior.
> >
> > It seems more logical to me for this to be a GUC on the primary since
> > it falls within the primary’s scope.
> >
>
> This is worth considering. OTOH, it is possible that we are over
> worried about users accidentally dropping the slot required for
> continuing the logical decoding on the physical standby. In the
> publisher-subscriber model, it seems quite intuitive that as soon as
> the first subscription is created, we enable logical decoding on the
> primary and when the last subscription is dropped, the logical
> decoding on the publisher gets disabled. The case we are worrying
> about is, for users, that enable logical decoding/replication on
> physical standby based on the presence of a logical slot on the
> primary. I think if we have documented clearly it is the
> responsibility of users that they need to either (a) keep wal_level as
> logical on primary, or (b) preserve a slot on the primary, it should
> be sufficient. There could be multiple ways to preserve the slot, one
> is users always create a special slot on the primary for this purpose
> or we can provide a slot_option which users can specify/alter so that
> they get ERROR/WARNING on the last such slot being dropped. I feel we
> should choose the simplest option and rely on users to use the feature
> appropriately. We can always enhance the feature in future versions
> based on feedback from the field.
>

I feel introducing a GUC is the simplest approach, as it provides
users with some control over the behavior when handling the last
logical slot. With this safeguard in place, we can be more confident
about eventually removing wal_level = logical, either now or in the
future.

That said, if we decide it's acceptable to proceed without this
additional ERROR/WARNING mechanism, I'm fine with that as well. But it
does leave users with a small risk of unintentionally disabling
logical decoding, even with proper documentation. Let's see what
others think here.

thanks
Shveta



On Sun, Jul 20, 2025 at 11:53 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Jul 21, 2025 at 11:24 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Mon, Jul 21, 2025 at 10:48 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > I'm continuing to think it through and will share any further thoughts
> > > if something comes to mind.
> > >
> >
> > How about a parameter named 'on_last_logical_slot' with possible
> > values: 'error', 'warn', 'drop', 'retain'?
> > Alternatively, we could use 'last_logical_slot_drop_policy' with
> > values: 'error', 'warn', 'allow'.
> >
> > These parameters could be supported by both DROP SUBSCRIPTION and
> > pg_drop_replication_slot(), or alternatively implemented as a GUC on
> > the primary server. The default value should be either 'warn' or,
> > preferably, 'error' for safer behavior.
> >
> > It seems more logical to me for this to be a GUC on the primary since
> > it falls within the primary’s scope.
> >
>
> This is worth considering. OTOH, it is possible that we are over
> worried about users accidentally dropping the slot required for
> continuing the logical decoding on the physical standby. In the
> publisher-subscriber model, it seems quite intuitive that as soon as
> the first subscription is created, we enable logical decoding on the
> primary and when the last subscription is dropped, the logical
> decoding on the publisher gets disabled. The case we are worrying
> about is, for users, that enable logical decoding/replication on
> physical standby based on the presence of a logical slot on the
> primary. I think if we have documented clearly it is the
> responsibility of users that they need to either (a) keep wal_level as
> logical on primary, or (b) preserve a slot on the primary, it should
> be sufficient. There could be multiple ways to preserve the slot, one
> is users always create a special slot on the primary for this purpose
> or we can provide a slot_option which users can specify/alter so that
> they get ERROR/WARNING on the last such slot being dropped. I feel we
> should choose the simplest option and rely on users to use the feature
> appropriately. We can always enhance the feature in future versions
> based on feedback from the field.

Yes, I agree. The main patch focuses on the part where we
automatically change the effective WAL level upon the logical slot
creation and deletion (and potentially remove 'logical' from
wal_level), and other things are implemented as additional features in
a separate patch. In the case where users are using logical decoding
only on the standbys, we might want to have a concept like empty
logical slots as we have discussed because users would not want to let
a logical slot on the primary preserve anything. But it's a separate
discussion whether we provide a way to protect such a slot from being
dropped or used mistakenly.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Tue, Jul 22, 2025 at 5:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Yes, I agree. The main patch focuses on the part where we
> automatically change the effective WAL level upon the logical slot
> creation and deletion (and potentially remove 'logical' from
> wal_level), and other things are implemented as additional features in
> a separate patch.

I am keeping my focus on patch001 until we decide further on how to
protect the slot. Apart from a few comments in [1], please find one
more concern:

There is a race condition between creating and dropping a replication
slot when enabling or disabling logical decoding. We might end up with
logical decoding disabled even when a logical slot is present.

Steps:
1) Set wal_level=replica on primary.
2) Create logical_slot1 which will enable logical decoding, causing
effective_wal_level to become logical.
3) Drop logical_slot1 and pause execution inside
DisableLogicalDecodingIfNecessary() right after the
'n_inuse_logical_slots' check using a debugger.
4) In another session, create logical_slot2. It will attempt to enable
logical-decoding but since it is already enabled,
EnsureLogicalDecodingEnabled() will be a no-op.
5) Release debugger of drop-slot, it will disable logical decoding.

Ultimately, logical_slot2 is present while logical decoding is disabled
and thus we see this:

postgres=# select slot_name from pg_replication_slots;
   slot_name
---------------
 logical_slot2

postgres=# show effective_wal_level;
 effective_wal_level
---------------------
 replica
(1 row)

postgres=# select pg_logical_slot_get_changes('logical_slot2', NULL,
NULL, 'proto_version', '4', 'publication_names', 'pub');
ERROR:  logical decoding is not enabled
HINT:  Set "wal_level" >= "logical" or create at least one logical slot.

Should we acquire LogicalDecodingControlLock in exclusive mode a
little earlier? Currently we acquire it after the
IsLogicalDecodingEnabled() check. I think we should acquire it before
this check in both the enable and disable flows.
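
To illustrate the suggestion, here is a minimal standalone sketch that takes the lock before the check, so that the check and the status change share one critical section. It is not code from the patch: a pthread mutex stands in for LogicalDecodingControlLock, and the struct and function names (DecodingCtlSketch, sketch_create_logical_slot, and so on) are hypothetical. It also folds the slot count and the status into one struct, which is a simplification.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the shared logical decoding control state. */
typedef struct DecodingCtlSketch
{
    pthread_mutex_t lock;            /* plays the role of LogicalDecodingControlLock */
    int             n_logical_slots; /* logical slots currently in use */
    bool            enabled;         /* is logical decoding enabled? */
} DecodingCtlSketch;

static DecodingCtlSketch ctl = {PTHREAD_MUTEX_INITIALIZER, 0, false};

/* Create path: take the lock first, then check and change the status. */
void
sketch_create_logical_slot(void)
{
    pthread_mutex_lock(&ctl.lock);
    ctl.n_logical_slots++;
    if (!ctl.enabled)               /* checked while holding the lock */
        ctl.enabled = true;
    pthread_mutex_unlock(&ctl.lock);
}

/* Drop path: the slot-count check and the disable share one critical section. */
void
sketch_drop_logical_slot(void)
{
    pthread_mutex_lock(&ctl.lock);
    assert(ctl.n_logical_slots > 0);
    ctl.n_logical_slots--;
    if (ctl.n_logical_slots == 0)
        ctl.enabled = false;
    pthread_mutex_unlock(&ctl.lock);
}

/* Read the status under the same lock. */
bool
sketch_decoding_enabled(void)
{
    bool        result;

    pthread_mutex_lock(&ctl.lock);
    result = ctl.enabled;
    pthread_mutex_unlock(&ctl.lock);
    return result;
}
```

With this ordering, a drop that is paused between its check and its disable still holds the lock, so a concurrent create cannot slip in between the two. The real code also has to write WAL and wait for other processes, which this sketch ignores.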

[1]: https://www.postgresql.org/message-id/CAJpy0uC0e%3DJ7L4q9RnQ3pbSAtvWy40r9qp3tr41zoogHQmDO8g%40mail.gmail.com

thanks
Shveta



On Tue, Jul 22, 2025 at 11:44 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Jul 22, 2025 at 5:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Yes, I agree. The main patch focuses on the part where we
> > automatically change the effective WAL level upon the logical slot
> > creation and deletion (and potentially remove 'logical' from
> > wal_level), and other things are implemented as additional features in
> > a separate patch.
>
>  I am keeping my focus on patch001 until we decide further on how to
> protect the slot.

Yeah, I also dropped the additional feature patch from the patch set for now.

> Apart from few comments in [1], please find one
> more concern:
>
> There is a race condition between creating and dropping a replication
> slot when enabling or disabling logical decoding. We might end up with
> logical decoding disabled even when a logical slot is present.
>
> Steps:
> 1) Set wal_level=replica on primary.
> 2) Create logical_slot1 which will enable logical decoding, causing
> effective_wal_level to become logical.
> 3) Drop logical_slot1 and pause execution inside
> DisableLogicalDecodingIfNecessary() right after the
> 'n_inuse_logical_slots' check using a debugger.
> 4) In another session, create logical_slot2. It will attempt to enable
> logical-decoding but since it is already enabled,
> EnsureLogicalDecodingEnabled() will be a no-op.
> 5) Release debugger of drop-slot, it will disable logical decoding.
>
> Ultimately, logical_slot2 is present while logical decoding is disabled
> and thus we see this:
>
> postgres=# select slot_name from pg_replication_slots;
>    slot_name
> ---------------
>  logical_slot2
>
> postgres=# show effective_wal_level;
>  effective_wal_level
> ---------------------
>  replica
> (1 row)
>
> postgres=# select pg_logical_slot_get_changes('logical_slot2', NULL,
> NULL, 'proto_version', '4', 'publication_names', 'pub');
> ERROR:  logical decoding is not enabled
> HINT:  Set "wal_level" >= "logical" or create at least one logical slot.
>
> Shall we acquire LogicalDecodingControlLock in exclusive mode at a
> little earlier stage? Currently we acquire it after
> IsLogicalDecodingEnabled() check. I think we shall acquire it before
> this check in both enable and disable flow?

Thank you for testing the patch!

I've reworked the locking part in the patch. The attached v4 patch
should address all review comments including your previous
comments[1].

Regards,

[1] https://www.postgresql.org/message-id/CAJpy0uC0e%3DJ7L4q9RnQ3pbSAtvWy40r9qp3tr41zoogHQmDO8g%40mail.gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

RE: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Sawada-san,

> Thank you for testing the patch!
> 
> I've reworked the locking part in the patch. The attached v4 patch
> should address all review comments including your previous
> comments[1].

Thanks for updating the patch! I have resumed spending time on this project.
Here are my comments.

1.
Just in case - can you modify xlogdesc.c based on your fix?

2.
Currently pg_upgrade has the following check:
```
    if (nslots_on_old > 0 && strcmp(wal_level, "logical") != 0)
        pg_fatal("\"wal_level\" must be \"logical\" but is set to \"%s\"",
                 wal_level);
```

But this can be relaxed because wal_level can be adjusted appropriately. IIUC it
is enough for wal_level to be higher than "minimal". Is that right?

3.
Currently pg_createsubscriber has the following check:
```
    if (strcmp(wal_level, "logical") != 0)
    {
        pg_log_error("publisher requires \"wal_level\" >= \"logical\"");
        failed = true;
    }
```

I feel this check is not needed at all, because pg_createsubscriber requires
a streaming standby, and wal_level = minimal cannot be set in such a configuration.
Thoughts?

4.
We should update PG_CONTROL_VERSION and pg_controldata as well.

5.
I'm wondering how pg_resetwal should handle this. Since none of the replication
slots can be used after the command, logicalDecodingEnabled can be set to false, right?
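
For comment 2 above, the relaxed pg_upgrade check could look roughly like the following standalone sketch. It is only a model of the idea, not the actual pg_upgrade code: the helper name wal_level_allows_logical_slots is hypothetical, and it returns a bool where the real check would call pg_fatal().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: with the proposed feature, migrating logical slots
 * only requires wal_level to be higher than "minimal", on the assumption
 * that the effective level is raised once a logical slot exists.
 */
bool
wal_level_allows_logical_slots(int nslots_on_old, const char *wal_level)
{
    assert(wal_level != NULL);

    /* The existing check rejects anything other than "logical" here. */
    if (nslots_on_old > 0 && strcmp(wal_level, "minimal") == 0)
        return false;           /* the real code would report a fatal error */

    return true;
}
```

Under this assumption, "replica" and "logical" are both accepted when slots exist, and any level is accepted when there are no logical slots to migrate.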

Best regards,
Hayato Kuroda
FUJITSU LIMITED


On Mon, Jul 28, 2025 at 5:34 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Sawada-san,
>
> > Thank you for testing the patch!
> >
> > I've reworked the locking part in the patch. The attached v4 patch
> > should address all review comments including your previous
> > comments[1].
>
> Thanks for making the patch! I resumed to spending time for the project.

Thank you for reviewing the patch!

> Here are my comments.
>
> 1.
> Just in case - can you modify xlogdesc.c based on your fix?

Will fix.

>
> 2.
> Currently pg_upgrade has below checking:
> ```
>         if (nslots_on_old > 0 && strcmp(wal_level, "logical") != 0)
>                 pg_fatal("\"wal_level\" must be \"logical\" but is set to \"%s\"",
>                                  wal_level);
> ```
>
> But this can be relaxed because wal_level can be adjusted appropriately. IIUC it
> is enough to be higher than "minimal". Is it right?

Right, will fix.

>
> 3.
> Currently pg_createsubscriber has below checking:
> ```
>         if (strcmp(wal_level, "logical") != 0)
>         {
>                 pg_log_error("publisher requires \"wal_level\" >= \"logical\"");
>                 failed = true;
>         }
> ```
>
> I feel the checking is completely not needed, because pg_createsubscriber needs
> a streaming standby and wal_level = minimal cannot be set with this node placement.
> Thought?

Yes, we can get rid of this check.

>
> 4.
> We should update PG_CONTROL_VERSION and pg_controldata as well.

Right, I'll update pg_controldata. For PG_CONTROL_VERSION, I'm going
to update it before the push.

>
> 5.
> I'm wondering how pg_resetwal handles. Since all the replication slot cannot be
> used after the command, logicalDecodingEnabled can be set to false, right?

I think that logical decoding remains enabled as long as logical slots
are present. For example, it remains enabled even if the sole logical
slot is invalidated.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



On Fri, 25 Jul 2025 at 11:45, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Jul 22, 2025 at 11:44 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, Jul 22, 2025 at 5:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > Yes, I agree. The main patch focuses on the part where we
> > > automatically change the effective WAL level upon the logical slot
> > > creation and deletion (and potentially remove 'logical' from
> > > wal_level), and other things are implemented as additional features in
> > > a separate patch.
> >
> >  I am keeping my focus on patch001 until we decide further on how to
> > protect the slot.
>
> Yeah, I also dropped the additional feature patch from the patch set for now.
>
> > Apart from few comments in [1], please find one
> > more concern:
> >
> > There is a race condition between creating and dropping a replication
> > slot when enabling or disabling logical decoding. We might end up with
> > logical decoding disabled even when a logical slot is present.
> >
> > Steps:
> > 1) Set wal_level=replica on primary.
> > 2) Create logical_slot1 which will enable logical decoding, causing
> > effective_wal_level to become logical.
> > 3) Drop logical_slot1 and pause execution inside
> > DisableLogicalDecodingIfNecessary() right after the
> > 'n_inuse_logical_slots' check using a debugger.
> > 4) In another session, create logical_slot2. It will attempt to enable
> > logical-decoding but since it is already enabled,
> > EnsureLogicalDecodingEnabled() will be a no-op.
> > 5) Release debugger of drop-slot, it will disable logical decoding.
> >
> > Ultimately, logical_slot2 is present while logical decoding is disabled
> > and thus we see this:
> >
> > postgres=# select slot_name from pg_replication_slots;
> >    slot_name
> > ---------------
> >  logical_slot2
> >
> > postgres=# show effective_wal_level;
> >  effective_wal_level
> > ---------------------
> >  replica
> > (1 row)
> >
> > postgres=# select pg_logical_slot_get_changes('logical_slot2', NULL,
> > NULL, 'proto_version', '4', 'publication_names', 'pub');
> > ERROR:  logical decoding is not enabled
> > HINT:  Set "wal_level" >= "logical" or create at least one logical slot.
> >
> > Shall we acquire LogicalDecodingControlLock in exclusive mode at a
> > little earlier stage? Currently we acquire it after
> > IsLogicalDecodingEnabled() check. I think we shall acquire it before
> > this check in both enable and disable flow?
>
> Thank you for testing the patch!
>
> I've reworked the locking part in the patch. The attached v4 patch
> should address all review comments including your previous
> comments[1].

Few comments:
1) pg_waldump does not handle the newly added WAL record
XLOG_LOGICAL_DECODING_STATUS_CHANGE:
+               XLogRegisterData(&logical_decoding, sizeof(bool));
+               recptr = XLogInsert(RM_XLOG_ID, XLOG_LOGICAL_DECODING_STATUS_CHANGE);
+               XLogFlush(recptr);

rmgr: XLOG        len (rec/tot):     54/    54, tx:          0, lsn: 0/017633D8, prev 0/01763360, desc: PARAMETER_CHANGE max_connections=100 max_worker_processes=8 max_wal_senders=10 max_prepared_xacts=10 max_locks_per_xact=64 wal_level=replica wal_log_hints=off track_commit_timestamp=off
rmgr: XLOG        len (rec/tot):     27/    27, tx:          0, lsn: 0/01763410, prev 0/017633D8, desc: UNKNOWN (f0)
rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/01763430, prev 0/01763410, desc: RUNNING_XACTS nextXid 754 latestCompletedXid 753 oldestRunningXid 754

2) Similarly, pg_walinspect should also handle the new WAL record:
postgres=# SELECT * FROM pg_get_wal_records_info('0/017633D8', '0/01763468');
 start_lsn  |  end_lsn   |  prev_lsn  | xid | resource_manager |   record_type    | record_length | main_data_length | fpi_length | description | block_ref
------------+------------+------------+-----+------------------+------------------+---------------+------------------+------------+-------------+-----------
 0/017633D8 | 0/01763410 | 0/01763360 |   0 | XLOG             | PARAMETER_CHANGE |            54 |               28 |          0 | max_connections=100 max_worker_processes=8 max_wal_senders=10 max_prepared_xacts=10 max_locks_per_xact=64 wal_level=replica wal_log_hints=off track_commit_timestamp=off |
 0/01763410 | 0/01763430 | 0/017633D8 |   0 | XLOG             | UNKNOWN (f0)     |            27 |                1 |          0 |  |
 0/01763430 | 0/01763468 | 0/01763410 |   0 | Standby          | RUNNING_XACTS    |            50 |               24 |          0 | nextXid 754 latestCompletedXid 753 oldestRunningXid 754 |
(3 rows)

3) Should this be the other way around? Would it be better to throw
the error earlier, instead of waiting for the running transactions to
finish?
@@ -136,6 +137,9 @@ create_logical_replication_slot(char *name, char *plugin,
                                                  temporary ? RS_TEMPORARY : RS_EPHEMERAL, two_phase,
                                                  failover, false);

+       EnsureLogicalDecodingEnabled();
+       CheckLogicalDecodingRequirements();

4) The includes xlog_internal, xlogutils, atomics, lwlock, procsignal,
shmem, standby and guc are not required; I was able to compile without
them:
+ *       src/backend/replication/logical/logicalctl.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "access/xlog_internal.h"
+#include "access/xlogutils.h"
+#include "access/xloginsert.h"
+#include "catalog/pg_control.h"
+#include "port/atomics.h"
+#include "miscadmin.h"
+#include "storage/lwlock.h"
+#include "storage/procarray.h"
+#include "storage/procsignal.h"
+#include "storage/ipc.h"
+#include "storage/lmgr.h"
+#include "storage/shmem.h"
+#include "storage/standby.h"
+#include "replication/logicalctl.h"
+#include "replication/slot.h"
+#include "utils/guc.h"
+#include "utils/wait_event_types.h"

5) I felt this change is not related to this patch:
@@ -1144,7 +1152,7 @@ slotsync_reread_config(void)
        if (old_sync_replication_slots != sync_replication_slots)
        {
                ereport(LOG,
-               /* translator: %s is a GUC variable name */
+                               /* translator: %s is a GUC variable name */
                                errmsg("replication slot synchronization worker will shut down because \"%s\" is disabled", "sync_replication_slots"));
                proc_exit(0);

6) Can we include the high-level design in the commit message, along
with the other designs that were considered before settling on this
one? It will help new reviewers get a head start, as this is a long
thread.

7) I did not see any documentation added. Can we add the required
documentation for this?

8) The newly added test file should be included in the meson.build file.
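
Regarding comments 1 and 2 above: as far as I know, both pg_waldump and pg_walinspect take the record name from the resource manager's identify callback and print "UNKNOWN (%x)" when it returns NULL, so a single new case there should cover both tools. The following is a standalone sketch of that idea, not the actual xlogdesc.c change. The SKETCH_ constants and function names are hypothetical, the value 0xF0 is inferred from the "UNKNOWN (f0)" output above, and the one-byte bool payload follows the XLogRegisterData() call quoted in comment 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_XLR_INFO_MASK                        0x0F
#define SKETCH_XLOG_PARAMETER_CHANGE                0x60
#define SKETCH_XLOG_LOGICAL_DECODING_STATUS_CHANGE  0xF0    /* assumed value */

/* Map the rmgr-specific info bits to a record name, or NULL if unknown. */
const char *
sketch_xlog_identify(uint8_t info)
{
    switch (info & ~SKETCH_XLR_INFO_MASK)
    {
        case SKETCH_XLOG_PARAMETER_CHANGE:
            return "PARAMETER_CHANGE";
        case SKETCH_XLOG_LOGICAL_DECODING_STATUS_CHANGE:
            return "LOGICAL_DECODING_STATUS_CHANGE";
        default:
            return NULL;        /* callers fall back to "UNKNOWN (%x)" */
    }
}

/*
 * Describe the record payload. Writes "true" or "false" for the new record
 * type and an empty string otherwise; returns the number of characters.
 */
int
sketch_xlog_desc(char *buf, size_t buflen, uint8_t info, const char *rec)
{
    assert(buf != NULL && buflen > 0 && rec != NULL);
    buf[0] = '\0';

    if ((info & ~SKETCH_XLR_INFO_MASK) == SKETCH_XLOG_LOGICAL_DECODING_STATUS_CHANGE)
    {
        bool        enabled;

        memcpy(&enabled, rec, sizeof(bool));
        return snprintf(buf, buflen, "%s", enabled ? "true" : "false");
    }
    return 0;
}
```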

Regards,
Vignesh



On Fri, Jul 25, 2025 at 11:45 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Thank you for testing the patch!
>
> I've reworked the locking part in the patch. The attached v4 patch
> should address all review comments including your previous
> comments[1].
>

Thank you for the patch. I have not reviewed it fully, but please find
a few comments:

1)
CreateReplicationSlot():

  Assert(cmd->kind == REPLICATION_KIND_LOGICAL);
 + EnsureLogicalDecodingEnabled();
  CheckLogicalDecodingRequirements();
  ReplicationSlotCreate(...);

We may have another race condition here. We have
EnsureLogicalDecodingEnabled() before ReplicationSlotCreate(), which
means we are enabling logical decoding before incrementing
LogicalDecodingCtl->n_logical_slots. So before we increment
LogicalDecodingCtl->n_logical_slots through ReplicationSlotCreate(),
another session may drop the only other logical slot, and thus it may
end up disabling logical decoding because it will find
n_logical_slots to be 0.

Steps:
a) Create logical slot logical_slot1 on primary.
b) Create  publication pub1.
c) During Create-sub on subscriber, stop walsender after
EnsureLogicalDecodingEnabled() by attaching debugger.
d) Drop logical_slot1 on primary.
e) Release the walsender debugger.


2)
create_logical_replication_slot:

ReplicationSlotCreate(name, true
....
+ EnsureLogicalDecodingEnabled();
+ CheckLogicalDecodingRequirements();

Earlier we had CheckLogicalDecodingRequirements() before we actually
created the slot. Now we have it after slot creation. It makes sense to
do the logical-decoding-related checks after EnsureLogicalDecodingEnabled(),
but 'CheckSlotRequirements' should be done prior to slot creation.
Otherwise we will end up creating the slot and later dropping it when
it should not have been created in the first place (say, for wal_level
< replica).


3)
+ EnsureLogicalDecodingEnabled();
+

We can get rid of this from slotsync as this is a no-op on a standby.


4)
pg_sync_replication_slots()
        if (!RecoveryInProgress())
                ereport(ERROR,
                                errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                                errmsg("replication slots can only be synchronized to a standby server"));

+ EnsureLogicalDecodingEnabled();

This API is called only on a standby, so EnsureLogicalDecodingEnabled()
is not needed here either.
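
The interleaving in comment 1 above can be replayed deterministically with a small standalone model. This is only an illustration under simplified assumptions, not the patch's code: the struct and function names are hypothetical, there is no locking, and the replay covers just the one interleaving from steps a) to e). It shows that enabling first and counting the slot afterwards leaves a slot present with decoding disabled, while counting the slot first does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the shared state: slot count plus status. */
typedef struct CtlSketch
{
    int         n_logical_slots;
    bool        enabled;
} CtlSketch;

static void
ensure_enabled(CtlSketch *c)
{
    if (!c->enabled)
        c->enabled = true;
}

static void
count_new_slot(CtlSketch *c)
{
    c->n_logical_slots++;
}

static void
drop_slot(CtlSketch *c)
{
    assert(c->n_logical_slots > 0);
    if (--c->n_logical_slots == 0)
        c->enabled = false;     /* last slot gone: disable decoding */
}

/*
 * Replay "session A creates a slot while session B drops logical_slot1".
 * Returns whether decoding is still enabled once both have finished.
 */
bool
replay_create_vs_drop(bool count_first)
{
    CtlSketch   c = {1, true};  /* logical_slot1 exists, decoding enabled */

    if (count_first)
    {
        count_new_slot(&c);     /* A: new slot is counted first */
        drop_slot(&c);          /* B: count goes 2 -> 1, stays enabled */
        ensure_enabled(&c);     /* A: no-op */
    }
    else
    {
        ensure_enabled(&c);     /* A: no-op, already enabled */
        drop_slot(&c);          /* B: count goes 1 -> 0, disables decoding */
        count_new_slot(&c);     /* A: slot exists but decoding is off */
    }

    assert(c.n_logical_slots == 1);
    return c.enabled;
}
```

Reordering alone is not a proof of correctness; it only removes this particular window.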

thanks
Shveta



RE: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Sawada-san,

While reading more, I found a race condition. In this case effective_wal_level
can be logical even when there is no logical slot.
UpdateLogicalDecodingStatusEndOfRecovery() checks the number of logical slots
and then releases the lock. The startup process then acquires the lock again,
compares the result with IsLogicalDecodingEnabled(), and updates the status if needed.
So the effective WAL level can be inconsistent if the slots change after
n_logical_slots is read.

Steps:
a) constructed a primary-standby system
b) created a logical slot on the primary
c) created a logical slot on the standby
d) sent a promote signal to standby
e) dropped a logical slot on standby, just after startup process released
   LogicalDecodingControlLock in UpdateLogicalDecodingStatusEndOfRecovery().

After the above, effective_wal_level stayed at logical. Is this the expected behavior?
```
postgres=# SELECT slot_name FROM pg_replication_slots ;
 slot_name 
-----------
(0 rows)

postgres=# show effective_wal_level ;
 effective_wal_level 
---------------------
 logical
(1 row)

postgres=# SELECT pg_is_in_recovery ();
 pg_is_in_recovery 
-------------------
 f
(1 row)
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED


On Wed, Jul 30, 2025 at 12:22 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Sawada-san,
>
> While reading more, I found a race condition.

Thank you for reviewing the patch!

> In this case the effective_wal_level
> can be logical even when there is no logical slot.
> UpdateLogicalDecodingStatusEndOfRecovery() checks the number of slots of the logical
> slot then release the lock once. Then startup process acquires the lock once and
> compare with IsLogicalDecodingEnabled(), then update the status afterward if needed.
> So, wal_level can be inconsistent if the status is changed after the n_logical_slots
> is read.
>
> Steps:
> a) constructed a primary-standby system
> b) created a logical slot on the primary
> c) created a logical slot on the standby
> d) sent a promote signal to standby
> e) dropped a logical slot on standby, just after startup process released
>    LogicalDecodingControlLock in UpdateLogicalDecodingStatusEndOfRecovery().
>
> After the above, effective_wal_level was keep turning on. Is it the expected behavior?

No, we need to fix it.

I thought we could fix this issue by checking the number of in-use
logical slots while holding ReplicationSlotControlLock and
LogicalDecodingControlLock, but it seems we need to deal with another
race condition too, between backends and the startup process at the end
of recovery.

Currently the backend skips controlling logical decoding status if the
server is in recovery (by checking RecoveryInProgress()), but it's
possible that a backend process tries to drop a logical slot after the
startup process calls UpdateLogicalDecodingStatusEndOfRecovery() and
before the server accepts writes. In this case, the backend ends up not
disabling logical decoding and it remains enabled. I think we would
somehow need to delay the logical decoding status change in this
period until the recovery completes.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



RE: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Sawada-san,

> I thought we could fix this issue by checking the number of in-use
> logical slots while holding ReplicationSlotControlLock and
> LogicalDecodingControlLock, but it seems we need to deal with another
> race condition too between backends and startup processes at the end
> of recovery.
> 
> Currently the backend skips controlling logical decoding status if the
> server is in recovery (by checking RecoveryInProgress()), but it's
> possible that a backend process tries to drop a logical slot after the
> startup process calling UpdateLogicalDecodingStatusEndOfRecovery() and
> before accepting writes.

Right. I also verified locally and found that
ReplicationSlotDropAcquired()->DisableLogicalDecodingIfNecessary() sometimes
skips modifying the status because RecoveryInProgress() still returns true.

> In this case, the backend ends up not
> disabling logical decoding and it remains enabled. I think we would
> somehow need to delay the logical decoding status change in this
> period until the recovery completes.

My rough idea was to 1) have the startup process hold the lock until the end of
recovery and 2) have DisableLogicalDecodingIfNecessary() acquire the lock before
checking the recovery status, but it did not work well. I'm not sure why, but
WaitForProcSignalBarrier() got stuck when the process held LogicalDecodingControlLock....

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Thu, Jul 31, 2025 at 5:00 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Sawada-san,
>
> > I thought we could fix this issue by checking the number of in-use
> > logical slots while holding ReplicationSlotControlLock and
> > LogicalDecodingControlLock, but it seems we need to deal with another
> > race condition too between backends and startup processes at the end
> > of recovery.
> >
> > Currently the backend skips controlling logical decoding status if the
> > server is in recovery (by checking RecoveryInProgress()), but it's
> > possible that a backend process tries to drop a logical slot after the
> > startup process calling UpdateLogicalDecodingStatusEndOfRecovery() and
> > before accepting writes.
>
> Right. I also verified on local and found that
> ReplicationSlotDropAcquired()->DisableLogicalDecodingIfNecessary() sometimes
> skips to modify the status because RecoveryInProgress is still false.
>
> > In this case, the backend ends up not
> > disabling logical decoding and it remains enabled. I think we would
> > somehow need to delay the logical decoding status change in this
> > period until the recovery completes.
>
> My primitive idea was to 1) keep startup acquiring the lock till end of recovery
> and 2) DisableLogicalDecodingIfNecessary() acquires lock before checking the
> recovery status, but it could not work well. Not sure but WaitForProcSignalBarrier()
> stucked if the process acquired LogicalDecodingControlLock lock....

I think that it's not realistic to keep holding a lwlock until the
recovery actually completes because we perform a checkpoint after
that.

In the latest version of the patch, attached, I introduced a flag in
shared memory to delay any logical decoding status change until the
recovery completes. The implementation got more complex than I expected
but I don't have a better idea. I'm open to other approaches. Also, I
incorporated all comments I got so far[1][2][3] and updated the
documentation.
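
As a rough illustration of the "delay until recovery completes" idea, here is a minimal standalone model. It is not how the attached patch implements it: the struct, field, and function names are all hypothetical, and locking, WAL logging, and process signalling are left out. The only point is that a slot drop arriving in the window after the startup process has decided the status is recorded and re-evaluated once recovery is done, instead of being skipped.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the shared state, including the delay flag. */
typedef struct RecoveryCtlSketch
{
    int         n_logical_slots;
    bool        enabled;
    bool        delay_status_change;    /* set at end of recovery */
    bool        pending_recheck;        /* a change was requested in the window */
} RecoveryCtlSketch;

/* Startup process: decide the status and open the "delay" window. */
void
startup_update_status(RecoveryCtlSketch *c)
{
    c->enabled = (c->n_logical_slots > 0);
    c->delay_status_change = true;
}

/* Backend: drop a slot; defer the status change while the window is open. */
void
backend_drop_slot(RecoveryCtlSketch *c)
{
    assert(c->n_logical_slots > 0);
    c->n_logical_slots--;

    if (c->delay_status_change)
        c->pending_recheck = true;      /* re-evaluated after recovery */
    else if (c->n_logical_slots == 0)
        c->enabled = false;
}

/* Startup process: recovery is complete, apply any deferred change. */
void
startup_recovery_completed(RecoveryCtlSketch *c)
{
    c->delay_status_change = false;
    if (c->pending_recheck)
    {
        c->enabled = (c->n_logical_slots > 0);
        c->pending_recheck = false;
    }
}
```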

Regards,

[1] https://www.postgresql.org/message-id/CALDaNm3BfG1hpWVEaqwBgXpcEGSQXDi536OzB2%3D8SFTz-v%2B3CA%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CAJpy0uDxap0YKLx5N45_Vz49QARjioUaOb1qpaiV0PBkYoivRg%40mail.gmail.com
[3] https://www.postgresql.org/message-id/OSCPR01MB149663D242F6E97630758DD6EF55AA%40OSCPR01MB14966.jpnprd01.prod.outlook.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Mon, Jul 28, 2025 at 9:44 PM vignesh C <vignesh21@gmail.com> wrote:
>
> On Fri, 25 Jul 2025 at 11:45, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Jul 22, 2025 at 11:44 PM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Tue, Jul 22, 2025 at 5:03 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > Yes, I agree. The main patch focuses on the part where we
> > > > automatically change the effective WAL level upon the logical slot
> > > > creation and deletion (and potentially remove 'logical' from
> > > > wal_level), and other things are implemented as additional features in
> > > > a separate patch.
> > >
> > >  I am keeping my focus on patch001 until we decide further on how to
> > > protect the slot.
> >
> > Yeah, I also dropped the additional feature patch from the patch set for now.
> >
> > > Apart from few comments in [1], please find one
> > > more concern:
> > >
> > > There is a race condition between creating and dropping a replication
> > > slot when enabling or disabling logical decoding. We might end up with
> > > logical decoding disabled even when a logical slot is present.
> > >
> > > Steps:
> > > 1) Set wal_level=replica on primary.
> > > 2) Create logical_slot1 which will enable logical decoding, causing
> > > effective_wal_level to become logical.
> > > 3) Drop logical_slot1 and pause execution inside
> > > DisableLogicalDecodingIfNecessary() right after the
> > > 'n_inuse_logical_slots' check using a debugger.
> > > 4) In another session, create logical_slot2. It will attempt to enable
> > > logical-decoding but since it is already enabled,
> > > EnsureLogicalDecodingEnabled() will be a no-op.
> > > 5) Release debugger of drop-slot, it will disable logical decoding.
> > >
> > > Ultimately, logical_slot2 is present while logical decoding is disabled,
> > > and thus we see this:
> > >
> > > postgres=# select slot_name from pg_replication_slots;
> > >    slot_name
> > > ---------------
> > >  logical_slot2
> > >
> > > postgres=# show effective_wal_level;
> > >  effective_wal_level
> > > ---------------------
> > >  replica
> > > (1 row)
> > >
> > > postgres=# select pg_logical_slot_get_changes('logical_slot2', NULL,
> > > NULL, 'proto_version', '4', 'publication_names', 'pub');
> > > ERROR:  logical decoding is not enabled
> > > HINT:  Set "wal_level" >= "logical" or create at least one logical slot.
> > >
> > > Shall we acquire LogicalDecodingControlLock in exclusive mode at an
> > > earlier stage? Currently we acquire it after the
> > > IsLogicalDecodingEnabled() check. I think we should acquire it before
> > > this check in both the enable and disable flows.
> >
> > Thank you for testing the patch!
> >
> > I've reworked the locking part in the patch. The attached v4 patch
> > should address all review comments including your previous
> > comments[1].
>
> Few comments:
> 1) pg_waldump not handled for the new WAL record added
> XLOG_LOGICAL_DECODING_STATUS_CHANGE:
> +               XLogRegisterData(&logical_decoding, sizeof(bool));
> +               recptr = XLogInsert(RM_XLOG_ID,
> XLOG_LOGICAL_DECODING_STATUS_CHANGE);
> +               XLogFlush(recptr);
>
> rmgr: XLOG        len (rec/tot):     54/    54, tx:          0, lsn: 0/017633D8, prev 0/01763360, desc: PARAMETER_CHANGE max_connections=100 max_worker_processes=8 max_wal_senders=10 max_prepared_xacts=10 max_locks_per_xact=64 wal_level=replica wal_log_hints=off track_commit_timestamp=off
> rmgr: XLOG        len (rec/tot):     27/    27, tx:          0, lsn: 0/01763410, prev 0/017633D8, desc: UNKNOWN (f0)
> rmgr: Standby     len (rec/tot):     50/    50, tx:          0, lsn: 0/01763430, prev 0/01763410, desc: RUNNING_XACTS nextXid 754 latestCompletedXid 753 oldestRunningXid 754
>
> 2) Similarly pg_walinspect also should be handled for the new WAL record added:
> postgres=# SELECT * FROM pg_get_wal_records_info('0/017633D8', '0/01763468');
>  start_lsn  |  end_lsn   |  prev_lsn  | xid | resource_manager |   record_type    | record_length | main_data_length | fpi_length | description | block_ref
> ------------+------------+------------+-----+------------------+------------------+---------------+------------------+------------+-------------+-----------
>  0/017633D8 | 0/01763410 | 0/01763360 |   0 | XLOG             | PARAMETER_CHANGE |            54 |               28 |          0 | max_connections=100 max_worker_processes=8 max_wal_senders=10 max_prepared_xacts=10 max_locks_per_xact=64 wal_level=replica wal_log_hints=off track_commit_timestamp=off |
>  0/01763410 | 0/01763430 | 0/017633D8 |   0 | XLOG             | UNKNOWN (f0)     |            27 |                1 |          0 |             |
>  0/01763430 | 0/01763468 | 0/01763410 |   0 | Standby          | RUNNING_XACTS    |            50 |               24 |          0 | nextXid 754 latestCompletedXid 753 oldestRunningXid 754 |
> (3 rows)
>
> 3) Should this be the other way around? Would it be better to throw
> the error earlier, instead of waiting for the running transactions to
> finish?
> @@ -136,6 +137,9 @@ create_logical_replication_slot(char *name, char *plugin,
>                                                   temporary ?
> RS_TEMPORARY : RS_EPHEMERAL, two_phase,
>                                                   failover, false);
>
> +       EnsureLogicalDecodingEnabled();
> +       CheckLogicalDecodingRequirements();
>
> 4) The includes of xlog_internal, xlogutils, atomics, lwlock, procsignal,
> shmem, standby and guc are not required; I was able to compile without
> them:
> + *       src/backend/replication/logical/logicalctl.c
> + *
> + *-------------------------------------------------------------------------
> + */
> +#include "postgres.h"
> +
> +#include "access/xlog_internal.h"
> +#include "access/xlogutils.h"
> +#include "access/xloginsert.h"
> +#include "catalog/pg_control.h"
> +#include "port/atomics.h"
> +#include "miscadmin.h"
> +#include "storage/lwlock.h"
> +#include "storage/procarray.h"
> +#include "storage/procsignal.h"
> +#include "storage/ipc.h"
> +#include "storage/lmgr.h"
> +#include "storage/shmem.h"
> +#include "storage/standby.h"
> +#include "replication/logicalctl.h"
> +#include "replication/slot.h"
> +#include "utils/guc.h"
> +#include "utils/wait_event_types.h"
>
> 5) I felt this change is not related to this patch:
> @@ -1144,7 +1152,7 @@ slotsync_reread_config(void)
>         if (old_sync_replication_slots != sync_replication_slots)
>         {
>                 ereport(LOG,
> -               /* translator: %s is a GUC variable name */
> +                               /* translator: %s is a GUC variable name */
>                                 errmsg("replication slot
> synchronization worker will shut down because \"%s\" is disabled",
> "sync_replication_slots"));
>                 proc_exit(0);
>
> 6) Can we include the high-level design in the commit message, along
> with the other designs that were considered before settling on this
> one? It would help new reviewers get a head start, as the thread is
> long.
>
> 7) I did not see any documentation added; can we add the required
> documentation for this?
>
> 8) The new test file should be added to the meson.build file.
>

Thank you for reviewing the patch! These comments have been addressed
in the latest patch I've just submitted[1].

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoB%3DRf-SASOJR2WqvWcrA5Q3S2oUBACVLdJPaA8x6EchBA%40mail.gmail.com


--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Mon, Jul 28, 2025 at 10:02 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Jul 25, 2025 at 11:45 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > Thank you for testing the patch!
> >
> > I've reworked the locking part in the patch. The attached v4 patch
> > should address all review comments including your previous
> > comments[1].
> >
>
> Thank you for the patch. I have not reviewed it fully yet, but please
> find a few comments:
>
> 1)
> CreateReplicationSlot():
>
>   Assert(cmd->kind == REPLICATION_KIND_LOGICAL);
>  + EnsureLogicalDecodingEnabled();
>   CheckLogicalDecodingRequirements();
>   ReplicationSlotCreate(...);
>
> We may have another race condition here. We call
> EnsureLogicalDecodingEnabled() before ReplicationSlotCreate(), which
> means we enable logical decoding before incrementing
> LogicalDecodingCtl->n_logical_slots. So before we increment
> LogicalDecodingCtl->n_logical_slots through ReplicationSlotCreate(),
> another session may drop a different logical slot that happens to be
> the last one, and thus end up disabling logical decoding because it
> finds n_logical_slots to be 0.
>
> Steps:
> a) Create logical slot logical_slot1 on primary.
> b) Create  publication pub1.
> c) During Create-sub on subscriber, stop walsender after
> EnsureLogicalDecodingEnabled() by attaching debugger.
> d) Drop logical_slot1 on primary.
> e) Release the walsender debugger.

True. EnsureLogicalDecodingEnabled() has to be called after creating a
logical replication slot in order to reliably enable logical decoding.
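
To illustrate why the order matters, here is a minimal single-threaded sketch that replays the interleaving from the steps above. It is only a toy model under stated assumptions: `ToyCtl` and all `toy_*` functions are hypothetical stand-ins, not the patch's code, and the real implementation also involves locking and WAL records that are left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model of the shared state; not the patch's struct. */
typedef struct ToyCtl
{
    int  n_logical_slots;   /* number of existing logical slots */
    bool decoding_enabled;  /* whether logical decoding is currently on */
} ToyCtl;

/* Stand-in for EnsureLogicalDecodingEnabled(): turn decoding on. */
static void
toy_ensure_enabled(ToyCtl *ctl)
{
    ctl->decoding_enabled = true;
}

/* Stand-in for slot creation: the new slot becomes visible in the count. */
static void
toy_create_slot(ToyCtl *ctl)
{
    ctl->n_logical_slots++;
}

/* Stand-in for slot drop: disable decoding when the last slot goes away. */
static void
toy_drop_slot(ToyCtl *ctl)
{
    ctl->n_logical_slots--;
    if (ctl->n_logical_slots == 0)
        ctl->decoding_enabled = false;
}

/* Racy order: enable first, create later; a drop sneaks in between. */
static bool
toy_enable_then_create(void)
{
    ToyCtl ctl = {1, true};     /* logical_slot1 already exists */

    toy_ensure_enabled(&ctl);   /* session A: no-op, already enabled */
    toy_drop_slot(&ctl);        /* session B: drops the last slot, disables */
    toy_create_slot(&ctl);      /* session A: slot exists, decoding is off */
    return ctl.decoding_enabled;
}

/* Safe order: create first, so the concurrent drop never sees zero slots. */
static bool
toy_create_then_enable(void)
{
    ToyCtl ctl = {1, true};

    toy_create_slot(&ctl);      /* session A: count becomes 2 */
    toy_drop_slot(&ctl);        /* session B: count becomes 1, stays enabled */
    toy_ensure_enabled(&ctl);   /* session A: no-op */
    return ctl.decoding_enabled;
}
```

With the same interleaving, `toy_enable_then_create()` ends with a slot present but decoding disabled, while `toy_create_then_enable()` keeps decoding enabled.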

>
>
> 2)
> create_logical_replication_slot:
>
> ReplicationSlotCreate(name, true
> ....
> + EnsureLogicalDecodingEnabled();
> + CheckLogicalDecodingRequirements();
>
> Earlier we had CheckLogicalDecodingRequirements() before we actually
> created the slot. Now we have it after slot creation. It makes sense to
> do the logical-decoding-related checks after EnsureLogicalDecodingEnabled(),
> but CheckSlotRequirements() should be done prior to slot creation.
> Otherwise we will end up creating the slot and later dropping it when
> it should not have been created in the first place (say, for wal_level
> < replica).
>
>
> 3)
> + EnsureLogicalDecodingEnabled();
> +
>
> We can get rid of this from slotsync as this is no-op on standby
>
>
> 4)
> pg_sync_replication_slots()
>         if (!RecoveryInProgress())
>                 ereport(ERROR,
>
> errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>                                 errmsg("replication slots can only be
> synchronized to a standby server"));
>
> + EnsureLogicalDecodingEnabled();
>
> This API is called on standby alone, so EnsureLogicalDecodingEnabled
> is not needed here either.

Thank you for reviewing the patch! I agree with these comments. They
have been addressed in the latest patch I've just submitted[1].

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoB%3DRf-SASOJR2WqvWcrA5Q3S2oUBACVLdJPaA8x6EchBA%40mail.gmail.com
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
On Sat, Aug 2, 2025 at 4:53 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Jul 31, 2025 at 5:00 AM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > Dear Sawada-san,
> >
> > > I thought we could fix this issue by checking the number of in-use
> > > logical slots while holding ReplicationSlotControlLock and
> > > LogicalDecodingControlLock, but it seems we need to deal with another
> > > race condition too between backends and startup processes at the end
> > > of recovery.
> > >
> > > Currently the backend skips controlling logical decoding status if the
> > > server is in recovery (by checking RecoveryInProgress()), but it's
> > > possible that a backend process tries to drop a logical slot after the
> > > startup process calling UpdateLogicalDecodingStatusEndOfRecovery() and
> > > before accepting writes.
> >
> > Right. I also verified this locally and found that
> > ReplicationSlotDropAcquired()->DisableLogicalDecodingIfNecessary() sometimes
> > skips modifying the status because RecoveryInProgress() still returns true.
> >
> > > In this case, the backend ends up not
> > > disabling logical decoding and it remains enabled. I think we would
> > > somehow need to delay the logical decoding status change in this
> > > period until the recovery completes.
> >
> > My primitive idea was to 1) have the startup process keep holding the lock until the end of recovery
> > and 2) have DisableLogicalDecodingIfNecessary() acquire the lock before checking the
> > recovery status, but it did not work well. I'm not sure why, but WaitForProcSignalBarrier()
> > got stuck if the process had acquired LogicalDecodingControlLock....
>
> I think that it's not realistic to keep holding a lwlock until the
> recovery actually completes because we perform a checkpoint after
> that.
>
> In the latest version patch I attached, I introduce a flag on shared
> memory to delay any logical decoding status change until the recovery
> completes. The implementation got more complex than I expected but I
> don't have a better idea. I'm open to other approaches. Also, I
> incorporated all comments I got so far[1][2][3] and updated the
> documentation.
>

Yes, it is slightly complex, I will put more thoughts into it. That
said, I do have a related scenario in mind concerning the recent fix,
where we might still end up with an incorrect effective_wal_level
after promotion.

Say primary has 'wal_level'=replica and standby has
'wal_level'=logical. Since there are no slots on standby
'effective_wal_level' will still be replica. Now I created a slot both
on primary and standby making 'effective_wal_level'=logical. Now, when
the standby is promoted and the slot is dropped immediately after
UpdateLogicalDecodingStatusEndOfRecovery() releases the lock, we still
expect the effective_wal_level on the promoted standby (now the
primary) to remain logical, since its configured 'wal_level' is
logical and it has become the primary. But I think that may not be the
case, because 'DisableLogicalDecodingIfNecessary-->start_logical_decoding_status_change()'
does not consider the configured wal_level of the promoted standby when
it retries. I feel the 'retry' label should be above the 'wal_level ==
WAL_LEVEL_LOGICAL' check in the code snippet below:

+static bool
+start_logical_decoding_status_change(bool new_status)
+{
+ /*
+ * On the primary with 'logical' WAL level, we can skip logical decoding
+ * status change as it's always enabled. On standbys, we need to check the
+ * status on shared memory propagated from the primary and might handle
+ * status change delay.
+ */
+ if (!RecoveryInProgress() && wal_level == WAL_LEVEL_LOGICAL)
+ return false;
+
+retry:
+

Please note that I could not reproduce this scenario because as soon
as I put sleep or injection-point in
UpdateLogicalDecodingStatusEndOfRecovery(), I hit some ProcSignal
Barriers issue i.e. it never completes even when sleep is over. I get
this: 'LOG: still waiting for backend with PID 162838 to accept
ProcSignalBarrier'.

Please let me know if my understanding is not correct above.

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Mon, Aug 4, 2025 at 3:38 AM shveta malik <shveta.malik@gmail.com> wrote:
>
>
> Yes, it is slightly complex, I will put more thoughts into it. That
> said, I do have a related scenario in mind concerning the recent fix,
> where we might still end up with an incorrect effective_wal_level
> after promotion.
>
> Say primary has 'wal_level'=replica and standby has
> 'wal_level'=logical. Since there are no slots on standby
> 'effective_wal_level' will still be replica. Now I created a slot both
> on primary and standby making 'effective_wal_level'=logical. Now, when
> the standby is promoted and the slot is dropped immediately after
> UpdateLogicalDecodingStatusEndOfRecovery() releases the lock, we still
> expect the effective_wal_level on the promoted standby (now the
> primary) to remain logical, since its configured 'wal_level' is
> logical and it has become the primary. But I think that may not be the
> case because 'DisableLogicalDecodingIfNecessary-->start_logical_decoding_status_change()'
> does not consider original wal_level on promoted standby in
> retrial-attempt. I feel 'retry' should be above ' wal_level ==
> WAL_LEVEL_LOGICAL' check in below code snippet:
>
> +static bool
> +start_logical_decoding_status_change(bool new_status)
> +{
> + /*
> + * On the primary with 'logical' WAL level, we can skip logical decoding
> + * status change as it's always enabled. On standbys, we need to check the
> + * status on shared memory propagated from the primary and might handle
> + * status change delay.
> + */
> + if (!RecoveryInProgress() && wal_level == WAL_LEVEL_LOGICAL)
> + return false;
> +
> +retry:
> +
>
> Please note that I could not reproduce this scenario because as soon
> as I put sleep or injection-point in
> UpdateLogicalDecodingStatusEndOfRecovery(), I hit some ProcSignal
> Barriers issue i.e. it never completes even when sleep is over. I get
> this: 'LOG: still waiting for backend with PID 162838 to accept
> ProcSignalBarrier'.

Thank you for the comment! I think you're right. That check should be
done after 'retry'. Will incorporate the change in the next version of
the patch.
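
For readers following along, the effect of moving the check can be sketched with a small toy model. This is not the patch's code: `ToyServer`, `toy_wait_for_recovery_end()` and the `check_after_retry` switch are hypothetical, and the "wait" is simulated by simply flipping the flags. The sketch only shows that a check placed before the `retry` label is evaluated once, while a check placed after it is re-evaluated once recovery has ended.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { TOY_WAL_REPLICA, TOY_WAL_LOGICAL } ToyWalLevel;

/* Hypothetical toy model of the server state relevant to the check. */
typedef struct ToyServer
{
    bool        in_recovery;            /* models RecoveryInProgress() */
    bool        status_change_delayed;  /* models the "delay until recovery ends" flag */
    ToyWalLevel wal_level;              /* configured wal_level */
} ToyServer;

/* Stand-in for waiting until recovery completes (simulated instantly). */
static void
toy_wait_for_recovery_end(ToyServer *s)
{
    s->in_recovery = false;
    s->status_change_delayed = false;
}

/*
 * Returns true if the caller should go ahead and change the logical
 * decoding status, false if the change can be skipped.
 */
static bool
toy_start_status_change(ToyServer *s, bool check_after_retry)
{
    /* Check placed before the label: evaluated only on the first pass. */
    if (!check_after_retry &&
        !s->in_recovery && s->wal_level == TOY_WAL_LOGICAL)
        return false;

retry:
    /* Check placed after the label: evaluated again on every retry. */
    if (check_after_retry &&
        !s->in_recovery && s->wal_level == TOY_WAL_LOGICAL)
        return false;

    if (s->status_change_delayed)
    {
        toy_wait_for_recovery_end(s);
        goto retry;
    }

    return true;
}
```

For a promoted standby with wal_level = logical that is still finishing recovery, the first variant proceeds with the status change after the wait, whereas the second variant notices that the server is now a primary with wal_level = logical and skips it.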

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
On Tue, Aug 5, 2025 at 5:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Mon, Aug 4, 2025 at 3:38 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> >
> > Yes, it is slightly complex, I will put more thoughts into it. That
> > said, I do have a related scenario in mind concerning the recent fix,
> > where we might still end up with an incorrect effective_wal_level
> > after promotion.
> >
> > Say primary has 'wal_level'=replica and standby has
> > 'wal_level'=logical. Since there are no slots on standby
> > 'effective_wal_level' will still be replica. Now I created a slot both
> > on primary and standby making 'effective_wal_level'=logical. Now, when
> > the standby is promoted and the slot is dropped immediately after
> > UpdateLogicalDecodingStatusEndOfRecovery() releases the lock, we still
> > expect the effective_wal_level on the promoted standby (now the
> > primary) to remain logical, since its configured 'wal_level' is
> > logical and it has become the primary. But I think that may not be the
> > case because 'DisableLogicalDecodingIfNecessary-->start_logical_decoding_status_change()'
> > does not consider original wal_level on promoted standby in
> > retrial-attempt. I feel 'retry' should be above ' wal_level ==
> > WAL_LEVEL_LOGICAL' check in below code snippet:
> >
> > +static bool
> > +start_logical_decoding_status_change(bool new_status)
> > +{
> > + /*
> > + * On the primary with 'logical' WAL level, we can skip logical decoding
> > + * status change as it's always enabled. On standbys, we need to check the
> > + * status on shared memory propagated from the primary and might handle
> > + * status change delay.
> > + */
> > + if (!RecoveryInProgress() && wal_level == WAL_LEVEL_LOGICAL)
> > + return false;
> > +
> > +retry:
> > +
> >
> > Please note that I could not reproduce this scenario because as soon
> > as I put sleep or injection-point in
> > UpdateLogicalDecodingStatusEndOfRecovery(), I hit some ProcSignal
> > Barriers issue i.e. it never completes even when sleep is over. I get
> > this: 'LOG: still waiting for backend with PID 162838 to accept
> > ProcSignalBarrier'.
>
> Thank you for the comment! I think you're right. That check should be
> done after 'retry'. WIll incorporate the change in the next version
> patch.
>

Thanks.

1)

start_logical_decoding_status_change():

+ LWLockRelease(LogicalDecodingControlLock);
+
+ /* Mark the state transition is in-progress */
+ LogicalDecodingCtl->transition_in_progress = true;

I think we should set transition_in_progress before releasing lock,
else it may hit a race condition between create and drop slot and can
end up having a slot but with logical decoding disabled.

Steps:
1) create logical_slot1
2) drop logical_slot1; pause in the debugger immediately before
transition_in_progress is set in start_logical_decoding_status_change()
3) create logical_slot2
4) release debugger held during drop.
5) now, we will have a slot but effective_wal_level will be replica.
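
The ordering issue can be sketched with a toy model that counts writes to the flag made without the lock held. Everything here is hypothetical (`ToyCtl`, the `toy_*` helpers, and a plain bool standing in for the LWLock); it is only meant to show the two orderings side by side, not how the patch implements them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy model; a plain bool stands in for the LWLock. */
typedef struct ToyCtl
{
    bool lock_held;                 /* models LogicalDecodingControlLock */
    bool transition_in_progress;    /* flag other sessions look at */
    int  unprotected_writes;        /* flag writes made without the lock */
} ToyCtl;

static void
toy_lock_acquire(ToyCtl *ctl)
{
    assert(!ctl->lock_held);
    ctl->lock_held = true;
}

static void
toy_lock_release(ToyCtl *ctl)
{
    assert(ctl->lock_held);
    ctl->lock_held = false;
}

/* Set the flag and record whether the write was protected by the lock. */
static void
toy_mark_transition(ToyCtl *ctl)
{
    if (!ctl->lock_held)
        ctl->unprotected_writes++;
    ctl->transition_in_progress = true;
}

/* Problematic order: release the lock, then set the flag. */
static void
toy_begin_change_release_first(ToyCtl *ctl)
{
    toy_lock_acquire(ctl);
    /* ... status checks would happen here ... */
    toy_lock_release(ctl);
    /* window: another session can create a slot and see no transition */
    toy_mark_transition(ctl);
}

/* Suggested order: set the flag while still holding the lock. */
static void
toy_begin_change_flag_first(ToyCtl *ctl)
{
    toy_lock_acquire(ctl);
    /* ... status checks would happen here ... */
    toy_mark_transition(ctl);
    toy_lock_release(ctl);
}
```

In the first variant the flag is written outside the lock, which is exactly the window where step 3 can run; in the second variant there is no such window.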


2)
CheckLogicalDecodingRequirements has this change:

- if (wal_level < WAL_LEVEL_LOGICAL)
+ if (wal_level < WAL_LEVEL_REPLICA)
  ereport(ERROR,
  (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("logical decoding requires \"wal_level\" >= \"logical\"")));
+ errmsg("logical decoding requires \"wal_level\" >= \"replica\"")));


But we already have same wal_level check  in CheckSlotRequirements:

        if (wal_level < WAL_LEVEL_REPLICA)
                ereport(ERROR,

(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                                 errmsg("replication slots can only be
used if \"wal_level\" >= \"replica\"")));

Thus the change in CheckLogicalDecodingRequirements for 'wal_level <
WAL_LEVEL_REPLICA' will never be reached. Is it needed?

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Tue, Aug 5, 2025 at 3:11 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Aug 5, 2025 at 5:14 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Mon, Aug 4, 2025 at 3:38 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > >
> > > Yes, it is slightly complex, I will put more thoughts into it. That
> > > said, I do have a related scenario in mind concerning the recent fix,
> > > where we might still end up with an incorrect effective_wal_level
> > > after promotion.
> > >
> > > Say primary has 'wal_level'=replica and standby has
> > > 'wal_level'=logical. Since there are no slots on standby
> > > 'effective_wal_level' will still be replica. Now I created a slot both
> > > on primary and standby making 'effective_wal_level'=logical. Now, when
> > > the standby is promoted and the slot is dropped immediately after
> > > UpdateLogicalDecodingStatusEndOfRecovery() releases the lock, we still
> > > expect the effective_wal_level on the promoted standby (now the
> > > primary) to remain logical, since its configured 'wal_level' is
> > > logical and it has become the primary. But I think that may not be the
> > > case because 'DisableLogicalDecodingIfNecessary-->start_logical_decoding_status_change()'
> > > does not consider original wal_level on promoted standby in
> > > retrial-attempt. I feel 'retry' should be above ' wal_level ==
> > > WAL_LEVEL_LOGICAL' check in below code snippet:
> > >
> > > +static bool
> > > +start_logical_decoding_status_change(bool new_status)
> > > +{
> > > + /*
> > > + * On the primary with 'logical' WAL level, we can skip logical decoding
> > > + * status change as it's always enabled. On standbys, we need to check the
> > > + * status on shared memory propagated from the primary and might handle
> > > + * status change delay.
> > > + */
> > > + if (!RecoveryInProgress() && wal_level == WAL_LEVEL_LOGICAL)
> > > + return false;
> > > +
> > > +retry:
> > > +
> > >
> > > Please note that I could not reproduce this scenario because as soon
> > > as I put sleep or injection-point in
> > > UpdateLogicalDecodingStatusEndOfRecovery(), I hit some ProcSignal
> > > Barriers issue i.e. it never completes even when sleep is over. I get
> > > this: 'LOG: still waiting for backend with PID 162838 to accept
> > > ProcSignalBarrier'.
> >
> > Thank you for the comment! I think you're right. That check should be
> > done after 'retry'. WIll incorporate the change in the next version
> > patch.
> >
>
> Thanks.
>

Thank you for reviewing the patch!

> 1)
>
> start_logical_decoding_status_change():
>
> + LWLockRelease(LogicalDecodingControlLock);
> +
> + /* Mark the state transition is in-progress */
> + LogicalDecodingCtl->transition_in_progress = true;
>
> I think we should set transition_in_progress before releasing lock,
> else it may hit a race condition between create and drop slot and can
> end up having a slot but with logical decoding disabled.

Ugh, you're right. It should be protected by the lwlock.

>
> 2)
> CheckLogicalDecodingRequirements has this change:
>
> - if (wal_level < WAL_LEVEL_LOGICAL)
> + if (wal_level < WAL_LEVEL_REPLICA)
>   ereport(ERROR,
>   (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> - errmsg("logical decoding requires \"wal_level\" >= \"logical\"")));
> + errmsg("logical decoding requires \"wal_level\" >= \"replica\"")));
>
>
> But we already have same wal_level check  in CheckSlotRequirements:
>
>         if (wal_level < WAL_LEVEL_REPLICA)
>                 ereport(ERROR,
>
> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>                                  errmsg("replication slots can only be
> used if \"wal_level\" >= \"replica\"")));
>
> Thus the change in CheckLogicalDecodingRequirements for 'wal_level <
> WAL_LEVEL_REPLICA' will never be reached. Is it needed?

No, I agree to remove it.

I've attached the updated version patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
On Wed, Aug 6, 2025 at 6:18 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
>
> I've attached the updated version patch.
>

Thank you for the patch. The patch does not apply to the latest HEAD
due to a conflict with the slot-sync fix (commit 4614d53d).

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
Please find a few comments on v6:

1)
+/*
+ * Initialize logical decoding status on shmem at server startup. This
+ * must be called ONCE during postmaster or standalone-backend startup,
+ * before initializing replication slots.
+ */
+void
+StartupLogicalDecodingStatus(bool last_status)

The comment says that it needs to be called 'before initializing
replication slots' but instead it is called after initializing
replication slots (i.e. after StartupReplicationSlots).

Also, can you please help me understand the need for
'StartupLogicalDecodingStatus' when we call
'UpdateLogicalDecodingStatusEndOfRecovery' later in StartupXLOG? Why
do we need to set last_status temporarily when the new status, which
will be set in UpdateLogicalDecodingStatusEndOfRecovery, can be
different?


2)
CreatePublication() has this:

+ errmsg("logical decoding needs to be enabled to publish logical changes"),
+ errhint("Set \"wal_level\" to \"logical\" or create a logical replication slot with \"replica\" \"wal_level\" before creating subscriptions.")));

While the rest of the places have this:

+ errhint("Set \"wal_level\" >= \"logical\" or create at least one logical slot on the primary.")));

Shall we make these errhints consistent? Either all of them should
mention the 'wal_level=replica' condition along with the slot-creation
part, or none should.


3)
xlog_decode():

+ case XLOG_LOGICAL_DECODING_STATUS_CHANGE:
  /*
  * This can occur only on a standby, as a primary would
- * not allow to restart after changing wal_level < logical
+ * not allow to restart after changing wal_level < replica
  * if there is pre-existing logical slot.
  */
  Assert(RecoveryInProgress());
  ereport(ERROR,
  (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("logical decoding on standby requires \"wal_level\" >= \"logical\" on the primary")));
+ errmsg("logical decoding must be enabled on the primary")));

Is the comment correct?
a)
XLOG_LOGICAL_DECODING_STATUS_CHANGE can be the result of a logical-slot
drop on the primary, and not necessarily of setting wal_level < replica.

b)
I see that even a standby does not allow a restart after changing
wal_level to below replica, contrary to what the comment says. Have I
understood the intent correctly?

standby LOG:
FATAL:  logical replication slot "failover_slot_st" exists, but
"wal_level" < "replica"
HINT:  Change "wal_level" to be "replica" or higher.


4)
start_logical_decoding_status_change():
+ if (LogicalDecodingCtl->transition_in_progress)
+ {
+ LWLockRelease(LogicalDecodingControlLock);

read_logical_decoding_status_transition() already checks
transition_in_progress, so I think we forgot to remove the above check
from start_logical_decoding_status_change().


5)
+ /* Return if we don't need to change the status */
+ if (LogicalDecodingCtl->logical_decoding_enabled == new_status)
+ {

The same applies to this logic in start_logical_decoding_status_change();
we should remove it.

6)
+ * If we're in recovery and the startup process is still taking
+ * responsibility to update the status, we cannot change.
+ */
+ if (!delay_status_change)
+ return false;
+

This comment is confusing: when in recovery, we cannot change the state
anyway, even if delay_status_change is false. IIUC, this scenario can
arise only during promotion. If so, shall we say:

"If we're in recovery and a state transition (e.g., promotion) is in
progress, wait for the transition to complete and retry on the new
primary. Otherwise, disallow the status change entirely, as a standby
cannot modify the logical decoding status."

7)
The name 'delay_status_change' does not indicate which status is meant
or the intent of the delay. Other name options are:
defer_logical_status_change, wait_for_recovery_transition/completion,
recovery_transition_in_progress

8)
DisableLogicalDecodingIfNecessary():
+
+ /* Write the WAL to disable logical decoding on standbys too */
+ if (XLogStandbyInfoActive() && !recoveryInProgress)
+ {

Do we need the 'recoveryInProgress' check here?
start_logical_decoding_status_change() has already taken care of that.

9)
Comments atop DisableLogicalDecodingIfNecessary:

 * This function expects to be called after dropping a possibly-last logical
 * replication slot. Logical decoding can be disabled only when wal_level is set
 * to 'replica' and there is no logical replication slot on the system.

The comment is not completely true. Shall we amend it to say
something like:

This function is called after a logical slot is dropped, but it only
disables logical decoding on the primary if it was the last remaining
logical slot and wal_level < logical. Otherwise, it performs no
action.

10)
When we try to create or drop a logical slot on a standby and
delay_status_change is false, shall we exit immediately? Currently it
performs a lot of checks, including CheckLogicalSlotExists(), which can
be avoided completely. I think it is worth having a quick
'RecoveryInProgress() && !delay_status_change' check at the beginning.
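
As a rough illustration of comments 6, 9 and 10, the decision logic can be sketched as a small standalone model. This is only one reading of the review comments, not the patch itself: the enum, the function names, and the boolean parameters are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical outcomes of a request to change the logical decoding status. */
typedef enum
{
    STATUS_CHANGE_PROCEED,
    STATUS_CHANGE_WAIT_AND_RETRY,
    STATUS_CHANGE_REJECT
} ModelChangeDecision;

/*
 * Models the early exit suggested in comment 10 and the wording proposed in
 * comment 6: a standby cannot change the status, unless a transition such as
 * promotion is in progress, in which case the caller waits and retries.
 */
ModelChangeDecision
model_status_change_decision(bool recovery_in_progress, bool delay_status_change)
{
    if (recovery_in_progress && !delay_status_change)
        return STATUS_CHANGE_REJECT;
    if (recovery_in_progress)
        return STATUS_CHANGE_WAIT_AND_RETRY;
    return STATUS_CHANGE_PROCEED;
}

/*
 * Models the comment proposed in comment 9: after a slot drop, logical
 * decoding is disabled only on a primary, only if no logical slot remains,
 * and only if wal_level is below logical.
 */
bool
model_should_disable_after_drop(bool on_primary, bool wal_level_is_logical,
                                int remaining_logical_slots)
{
    return on_primary && !wal_level_is_logical && remaining_logical_slots == 0;
}
```

Putting the cheap recovery check first means a standby returns before any slot scan is attempted, which is the saving comment 10 is after.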

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Tue, Aug 5, 2025 at 11:23 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> Please find a few comments on v6:
>
> 1)
> +/*
> + * Initialize logical decoding status on shmem at server startup. This
> + * must be called ONCE during postmaster or standalone-backend startup,
> + * before initializing replication slots.
> + */
> +void
> +StartupLogicalDecodingStatus(bool last_status)
>
> The comment says that it needs to be called 'before initializing
> replication slots' but instead it is called after initializing
> replication slots (i.e. after StartupReplicationSlots).

Removed.

> Also, can you please help me understand the need of
> 'StartupLogicalDecodingStatus' when we are doing
> 'UpdateLogicalDecodingStatusEndOfRecovery' later in StartupXLOG. Why
> do we need to set last_status temporarily when the new status can be
> different which will be set in
> UpdateLogicalDecodingStatusEndOfRecovery

IIUC we need to initialize the logical decoding status with the status
that was in effect when the server shut down or when the base backup was
taken. This status is used during recovery and might be changed
by replaying an XLOG_LOGICAL_DECODING_STATUS_CHANGE record. At the
end of recovery, we update the status based on the server's wal_level
and the number of logical replication slots, so the new status could be
different from the status used during recovery.
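
The three phases described above can be sketched with a small standalone model. It is a simplification under stated assumptions, not the patch: the names are hypothetical, replayed status-change records are reduced to an array of booleans, and the end-of-recovery rule (enabled when wal_level is logical, or when it is replica and at least one logical slot exists) is inferred from the error hints discussed in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real wal_level values. */
typedef enum
{
    LIFECYCLE_WAL_MINIMAL,
    LIFECYCLE_WAL_REPLICA,
    LIFECYCLE_WAL_LOGICAL
} LifecycleWalLevel;

/*
 * Phases 1 and 2: start from the status recorded at shutdown or base backup,
 * then let each replayed status-change record overwrite it in order.
 */
bool
lifecycle_status_during_recovery(bool last_status, const bool *replayed, size_t n)
{
    bool        status = last_status;

    for (size_t i = 0; i < n; i++)
        status = replayed[i];
    return status;
}

/*
 * Phase 3: at the end of recovery, recompute the status from the local
 * wal_level and the number of logical slots (assumed rule, see lead-in).
 */
bool
lifecycle_status_end_of_recovery(LifecycleWalLevel level, int n_logical_slots)
{
    if (level == LIFECYCLE_WAL_LOGICAL)
        return true;
    return level == LIFECYCLE_WAL_REPLICA && n_logical_slots > 0;
}

/* Example: start enabled, then replay a "disable" and an "enable" record. */
bool
lifecycle_example(void)
{
    const bool  replayed[] = {false, true};

    return lifecycle_status_during_recovery(true, replayed, 2);
}
```

The point of the model is that the value carried through recovery and the value computed at the end come from different inputs, so they can legitimately differ.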

>
>
> 2)
> CreatePublication() has this:
>
> + errmsg("logical decoding needs to be enabled to publish logical changes"),
> + errhint("Set \"wal_level\" to \"logical\" or create a logical
> replication slot with \"replica\" \"wal_level\" before creating
> subscriptions.")));
>
> While rest of the places has this:
>
> + errhint("Set \"wal_level\" >= \"logical\" or create at least one
> logical slot on the primary.")));
>
> Shall we make these errhint consistent?  Either all mention
> 'wal_level=replica' condition along with slot-creation part or none.

Fixed.

>
>
> 3)
> xlog_decode():
>
> + case XLOG_LOGICAL_DECODING_STATUS_CHANGE:
>   /*
>   * This can occur only on a standby, as a primary would
> - * not allow to restart after changing wal_level < logical
> + * not allow to restart after changing wal_level < replica
>   * if there is pre-existing logical slot.
>   */
>   Assert(RecoveryInProgress());
>   ereport(ERROR,
>   (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> - errmsg("logical decoding on standby requires \"wal_level\" >=
> \"logical\" on the primary")));
> + errmsg("logical decoding must be enabled on the primary")));
>
> Is the comment correct?
>
> a)
> XLOG_LOGICAL_DECODING_STATUS_CHANGE can be result of logical-slot drop
> on primary and not necessarily making wal_level < replica
>
> b)
> I see that even standby does not allow to restart when changing
> wal_level < replica as against what comment says. Have I understood
> the intent correctly?
>
> standby LOG:
> FATAL:  logical replication slot "failover_slot_st" exists, but
> "wal_level" < "replica"
> HINT:  Change "wal_level" to be "replica" or higher.

I don't think I get your point. The comment looks correct to me.

On a primary server, a logical slot can only decode WAL records that
were generated while that slot existed. An
XLOG_LOGICAL_DECODING_STATUS_CHANGE record with logical_decoding=false
is generated in two cases: after the last logical slot is dropped, or
when the server starts with no logical slot and wal_level<='replica'.
In either case, no logical slots can exist that would be able to
decode these WAL records. However, on standby servers, it is possible
to decode these XLOG_LOGICAL_DECODING_STATUS_CHANGE records with
logical_decoding=false, as standbys can decode WAL records
independently of the primary.
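
The argument can be restated as a tiny standalone model. It is a sketch of the reasoning only: the function names are hypothetical, and wal_level is reduced to a single "is logical" flag.

```c
#include <stdbool.h>

/*
 * A status-change record with logical_decoding=false is written only when
 * wal_level is below logical and no logical slot exists on the primary
 * (after the last slot is dropped, or at server start).
 */
bool
model_writes_disable_record(bool wal_level_is_logical, int primary_logical_slots)
{
    return !wal_level_is_logical && primary_logical_slots == 0;
}

/*
 * A primary slot can decode only WAL generated while it existed, so it would
 * need to exist at the moment the record is written.
 */
bool
model_record_decodable_on_primary(bool wal_level_is_logical, int primary_logical_slots)
{
    if (!model_writes_disable_record(wal_level_is_logical, primary_logical_slots))
        return false;           /* no such record is written at all */
    return primary_logical_slots > 0;   /* always false on this path */
}

/* Standby slots exist independently of the primary's slots. */
bool
model_record_decodable_on_standby(bool record_written, int standby_logical_slots)
{
    return record_written && standby_logical_slots > 0;
}
```

In this model the primary-side function is false for every input, while the standby-side function can be true, which matches the claim that the record can be hit only during decoding on a standby.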

>
>
> 4)
> start_logical_decoding_status_change():
> + if (LogicalDecodingCtl->transition_in_progress)
> + {
> + LWLockRelease(LogicalDecodingControlLock);
>
> read_logical_decoding_status_transition() takes care of checking
> transition_in_progress, I think we missed to remove above from
> start_logical_decoding_status_change().

Fixed.

>
>
> 5)
> + /* Return if we don't need to change the status */
> + if (LogicalDecodingCtl->logical_decoding_enabled == new_status)
> + {
>
> Same with this code-logic in start_logical_decoding_status_change(),
> we shall remove it.

Fixed.

>
> 6)
> + * If we're in recovery and the startup process is still taking
> + * responsibility to update the status, we cannot change.
> + */
> + if (!delay_status_change)
> + return false;
> +
>
> This comment is confusing as when in recovery, we can not change state
> otherwise as well even if delay_status_change is false. IIUC, the
> scenario can arise only during promotion, if so, shall we say:
>
> "If we're in recovery and a state transition (e.g., promotion) is in
> progress, wait for the transition to complete and retry on the new
> primary. Otherwise, disallow the status change entirely, as a standby
> cannot modify the logical decoding status."

Fixed.

>
> 7)
> The name 'delay_status_change' does not indicate which status or the
> intent of delay. More name options are: defer_logical_status_change,
> wait_for_recovery_transition/completion,
> recovery_transition_in_progress

I think 'delay' is used in other similar places in the PostgreSQL code.
For instance, we have DELAY_CHKPT_START/COMPLETE/IN_COMMIT, which are
set by transactions to delay the actual checkpoint process until these
transactions complete certain operations. In our case, the flag is set
by the startup process in order to delay the actual status change
by other processes until recovery completes. This is a very similar
usage, so I believe 'delay' is appropriate here.

Regarding 'status', I guess it's relatively obvious in this context
that it refers to the logical decoding status, so I doubt that readers
would be confused by this name.

>
> 8)
> DisableLogicalDecodingIfNecessary():
> +
> + /* Write the WAL to disable logical decoding on standbys too */
> + if (XLogStandbyInfoActive() && !recoveryInProgress)
> + {
>
> Do we need 'recoveryInProgress' check here?
> start_logical_decoding_status_change() has taken care of that.

Removed.

>
> 9)
> Comments atop DisableLogicalDecodingIfNecessary:
>
>  * This function expects to be called after dropping a possibly-last logical
>  * replication slot. Logical decoding can be disabled only when wal_level is set
>  * to 'replica' and there is no logical replication slot on the system.
>
> The comment is not completely true, shall we amend the comment to say
> something like:
>
> This function is called after a logical slot is dropped, but it only
> disables logical decoding on primary if it was the last remaining
> logical slot and wal_level < logical. Otherwise, it performs no
> action.

Thank you for the suggestion. I modified the comment based on it.

>
> 10)
> When we try to create or drop a logical slot on standby, and if
> delay_status_change is false, shall we immediately exit? Currently it
> does a lot of checks including CheckLogicalSlotExists() which can be
> completely avoided. I think it is worth having a quick
> 'RecoveryInProgress() && !delay_status_change' check in the beginning.

Yeah, we can simplify the start_logical_decoding_status_change() logic more.

I've attached the updated patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
On Fri, Aug 8, 2025 at 3:30 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Aug 5, 2025 at 11:23 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > Please find a few comments on v6:
> >
> > 1)
> > +/*
> > + * Initialize logical decoding status on shmem at server startup. This
> > + * must be called ONCE during postmaster or standalone-backend startup,
> > + * before initializing replication slots.
> > + */
> > +void
> > +StartupLogicalDecodingStatus(bool last_status)
> >
> > The comment says that it needs to be called 'before initializing
> > replication slots' but instead it is called after initializing
> > replication slots (i.e. after StartupReplicationSlots).
>
> Removed.
>
> > Also, can you please help me understand the need of
> > 'StartupLogicalDecodingStatus' when we are doing
> > 'UpdateLogicalDecodingStatusEndOfRecovery' later in StartupXLOG. Why
> > do we need to set last_status temporarily when the new status can be
> > different which will be set in
> > UpdateLogicalDecodingStatusEndOfRecovery
>
> IIUC we need to initialize the logical decoding status with the status
> we used to use when the server shutdown or when the basebackup was
> taken. This status would be used during recovery and might be changed
> by replaying the XLOG_LOGICAL_DECODING_STATUS_CHANGE record. At the
> end of recovery, we update the status based on the server's wal_level
> and the number of logical replication slots so the new status could be
> different from the status used during the recovery.
>

Okay.

> >
> >
> > 2)
> > CreatePublication() has this:
> >
> > + errmsg("logical decoding needs to be enabled to publish logical changes"),
> > + errhint("Set \"wal_level\" to \"logical\" or create a logical
> > replication slot with \"replica\" \"wal_level\" before creating
> > subscriptions.")));
> >
> > While rest of the places has this:
> >
> > + errhint("Set \"wal_level\" >= \"logical\" or create at least one
> > logical slot on the primary.")));
> >
> > Shall we make these errhint consistent?  Either all mention
> > 'wal_level=replica' condition along with slot-creation part or none.
>
> Fixed.
>
> >
> >
> > 3)
> > xlog_decode():
> >
> > + case XLOG_LOGICAL_DECODING_STATUS_CHANGE:
> >   /*
> >   * This can occur only on a standby, as a primary would
> > - * not allow to restart after changing wal_level < logical
> > + * not allow to restart after changing wal_level < replica
> >   * if there is pre-existing logical slot.
> >   */
> >   Assert(RecoveryInProgress());
> >   ereport(ERROR,
> >   (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> > - errmsg("logical decoding on standby requires \"wal_level\" >=
> > \"logical\" on the primary")));
> > + errmsg("logical decoding must be enabled on the primary")));
> >
> > Is the comment correct?
> >
> > a)
> > XLOG_LOGICAL_DECODING_STATUS_CHANGE can be result of logical-slot drop
> > on primary and not necessarily making wal_level < replica
> >
> > b)
> > I see that even standby does not allow to restart when changing
> > wal_level < replica as against what comment says. Have I understood
> > the intent correctly?
> >
> > standby LOG:
> > FATAL:  logical replication slot "failover_slot_st" exists, but
> > "wal_level" < "replica"
> > HINT:  Change "wal_level" to be "replica" or higher.
>
> I think I don't get your point. The comment looks correct to me.
>
> On a primary server, a logical slot can only decode WAL records that
> were generated while that slot existed. A
> XLOG_LOGICAL_DECODING_STATUS_CHANGE record with logical_decoding=false
> is generated in two cases: after the last logical slot is dropped, or
> when the server starts with no logical slot and wal_level<='replica'.
> In either case, no logical slots can exist that would be able to
> decode these WAL records. However, on standby servers, it is possible
> to decode these XLOG_LOGICAL_DECODING_STATUS_CHANGE records with
> logical_decoding=false, as standbys can decode WAL records
> independently of the primary.
>

Okay, I see your point. Thanks for explaining.

> >
> >
> > 4)
> > start_logical_decoding_status_change():
> > + if (LogicalDecodingCtl->transition_in_progress)
> > + {
> > + LWLockRelease(LogicalDecodingControlLock);
> >
> > read_logical_decoding_status_transition() takes care of checking
> > transition_in_progress, I think we missed to remove above from
> > start_logical_decoding_status_change().
>
> Fixed.
>
> >
> >
> > 5)
> > + /* Return if we don't need to change the status */
> > + if (LogicalDecodingCtl->logical_decoding_enabled == new_status)
> > + {
> >
> > Same with this code-logic in start_logical_decoding_status_change(),
> > we shall remove it.
>
> Fixed.
>
> >
> > 6)
> > + * If we're in recovery and the startup process is still taking
> > + * responsibility to update the status, we cannot change.
> > + */
> > + if (!delay_status_change)
> > + return false;
> > +
> >
> > This comment is confusing as when in recovery, we can not change state
> > otherwise as well even if delay_status_change is false. IIUC, the
> > scenario can arise only during promotion, if so, shall we say:
> >
> > "If we're in recovery and a state transition (e.g., promotion) is in
> > progress, wait for the transition to complete and retry on the new
> > primary. Otherwise, disallow the status change entirely, as a standby
> > cannot modify the logical decoding status."
>
> Fixed.
>
> >
> > 7)
> > The name 'delay_status_change' does not indicate which status or the
> > intent of delay. More name options are: defer_logical_status_change,
> > wait_for_recovery_transition/completion,
> > recovery_transition_in_progress
>
> I think 'delay' is used in other similar examples in PostgreSQL code.
> For instance, we have DELAY_CHKPT_START/COMPLETE/IN_COMMIT that are
> set by transactions to delay the actual checkpoint process until these
> transactions complete certain operations. In our case, the flag is set
> by the startup process in order to delay the actual status change
> process by other processes until the recovery completes. Which is a
> very similar usage so I believe 'delay' is appropriate here.
>
> Regarding the 'status', I guess it's relatively obvious in this
> context that the status indicates the logical decoding status so I'm
> not sure that readers would confuse this name.
>

Okay, we can retain the same.




Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Shlok Kyal
Date:
On Fri, 8 Aug 2025 at 03:30, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> I've attached the updated patch.
>

Hi Sawada-san,

I have reviewed the patch and have a few comments:

1. There are some spelling mistakes in logicaldecoding.sgml
+     and requires waiting for any concurrent transactions to finish, ensureing
+     system-wide conistency. Conversely, when the last logical replication slot
ensureing -> ensuring
conistency -> consistency

2. In publicationcmds.c:
+                errmsg("logical decoding needs to be enabled to publish logical changes"),
+                errhint("Set \"wal_level\" >= \"logical\" or create a logical replication slot with \"replica\" WAL level before creating subscriptions.")));

Should we use something like this for the errhint?
errhint("Set \"wal_level\" >= \"logical\" or create a logical replication slot before creating subscriptions when \"wal_level\" = \"replica\".")));

3. In logical.c:
+                errmsg("logical decoding needs to be enabled on the primary"),
+                errhint("Set \"wal_level\" >= \"logical\" or create at least one logical slot with \"replica\" WAL level on the primary.")));

Should we change the errhint message as below?
errhint("Set \"wal_level\" >= \"logical\" or create at least one logical slot on the primary when \"wal_level\" = \"replica\".")));

4. In slotsync.c:
+               errmsg("replication slot synchronization requires logical decoding to be enabled"),
+               errhint("Set \"wal_level\" >= \"logical\" or create at least one logical slot with \"replica\" WAL level on the primary "));

Should we change the errhint message as below?
errhint("Set \"wal_level\" >= \"logical\" or create at least one logical slot on the primary when \"wal_level\" = \"replica\".")));

---------

I have also tested the patch by creating multiple permanent/temporary
slots in concurrent sessions, and I did not find any issues. I also
tested this patch with a physical replication setup.
I have a doubt about this case:
1. Suppose we have a physical replication setup. wal_level on the
primary is set to 'replica'.
2. Now we try to create a logical slot on the standby. An error is
thrown because wal_level is set to 'replica' and the primary does not
have any logical slot.
3. Now we create a temporary logical slot on the primary;
effective_wal_level is set to 'logical'.
4. Now we can create slots on the standby, as effective_wal_level is
'logical'.
5. Now we exit the session on the primary server. The temporary slot
is dropped. This will invalidate the slots on the standby, as
effective_wal_level will be set back to 'replica'.
So we can say that a temporary slot on the primary can indirectly
control the behaviour of permanent slots on standbys.

I checked this behaviour on HEAD, and it is the same there: if we
change wal_level on the primary from 'logical' to 'replica', all slots
are invalidated on the standby.
With the patch, this behaviour can be indirectly controlled by a
temporary slot. Is that fine? Thoughts?
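
The five steps can be replayed in a small standalone simulation. It is only a model of the scenario as described, not of the patch: all type and function names are hypothetical, a temporary slot is treated like any other logical slot, and invalidation on the standby is applied directly instead of through WAL replay.

```c
#include <stdbool.h>

/* Hypothetical primary state: configured wal_level and logical slot count. */
typedef struct
{
    bool        wal_level_is_logical;
    int         logical_slots;
} ModelPrimary;

/* Hypothetical standby state: number of still-valid logical slots. */
typedef struct
{
    int         valid_logical_slots;
} ModelStandby;

/* effective_wal_level is logical if configured so, or if any logical slot exists. */
static bool
model_effective_logical(const ModelPrimary *p)
{
    return p->wal_level_is_logical || p->logical_slots > 0;
}

/* Slot creation on the standby succeeds only while the primary is effectively logical. */
bool
model_standby_create_slot(ModelStandby *s, const ModelPrimary *p)
{
    if (!model_effective_logical(p))
        return false;
    s->valid_logical_slots++;
    return true;
}

void
model_primary_create_slot(ModelPrimary *p)
{
    p->logical_slots++;
}

/* Dropping the last primary slot ends logical decoding and invalidates standby slots. */
void
model_primary_drop_slot(ModelPrimary *p, ModelStandby *s)
{
    if (p->logical_slots > 0)
        p->logical_slots--;
    if (!model_effective_logical(p))
        s->valid_logical_slots = 0;
}

/* Replays steps 1 to 5; returns valid standby slots at the end, or -1 on surprise. */
int
model_run_scenario(void)
{
    ModelPrimary p = {false, 0};                        /* step 1 */
    ModelStandby s = {0};
    bool        first = model_standby_create_slot(&s, &p);  /* step 2: fails */

    model_primary_create_slot(&p);                      /* step 3: temporary slot */
    bool        second = model_standby_create_slot(&s, &p); /* step 4: succeeds */

    model_primary_drop_slot(&p, &s);                    /* step 5: session exits */
    if (first || !second)
        return -1;
    return s.valid_logical_slots;
}
```

In this model the standby ends with no valid logical slots once the primary's only slot goes away, regardless of whether that slot was temporary.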

Thanks,
Shlok Kyal



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Tue, Aug 12, 2025 at 1:26 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
>
> Hi Sawada-san,
>
> I have reviewed the patch and have few comments:

Thank you for reviewing the patch!

>
> 1. There are some spelling mistakes in logicaldecoding.sgml
> +     and requires waiting for any concurrent transactions to finish, ensureing
> +     system-wide conistency. Conversely, when the last logical replication slot
> ensureing -> ensuring
> conistency -> consistency

Will fix.

>
> 2. In publicationcmds.c:
> +                errmsg("logical decoding needs to be enabled to
> publish logical changes"),
> +                errhint("Set \"wal_level\" >= \"logical\" or create a
> logical replication slot with \"replica\" WAL level before creating
> subscriptions.")));
>
> Should we use something like this for errhint ?
> errhint("Set \"wal_level\" >= \"logical\" or create a logical
> replication slot before creating subscriptions when \"wal_level\" =
> \"replica\".")));

The current message is:

Set "wal_level" >= "logical" or create a logical replication slot with
"replica" WAL level before creating subscriptions.

whereas the proposed message is:

Set "wal_level" >= "logical" or create a logical replication slot
before creating subscriptions when "wal_level" = "replica".

I don't see a big difference between the two sentences. Is your point
to use 'when "wal_level" = "replica"' instead of 'with "replica"
WAL level'?

>
>
> 3. In logical.c:
> +                errmsg("logical decoding needs to be enabled on the primary"),
> +                errhint("Set \"wal_level\" >= \"logical\" or create
> at least one logical slot with \"replica\" WAL level on the
> primary.")));
>
> Should we change the errhint message as below?
> errhint("Set \"wal_level\" >= \"logical\" or create at least one
> logical slot on the primary when \"wal_level\" = \"replica\".")));
>
> 4. In slotsync.c:
> +               errmsg("replication slot synchronization requires
> logical decoding to be enabled"),
> +               errhint("Set \"wal_level\" >= \"logical\" or create at
> least one logical slot with \"replica\" WAL level on the primary "));
>
> Should we change the errhint message as below?
> errhint("Set \"wal_level\" >= \"logical\" or create at least one
> logical slot on the primary when \"wal_level\" = \"replica\".")));

Please see the above question.

>
> ---------
>
> I have also tested the patch with creating multiple permanent/
> temporary slots in concurrent sessions and I did not find any issue. I
> also tested this patch with a physical replication setup.
> I have a doubt in this case:
> 1. Suppose we have a physical replication setup. wal_level on primary
> is set to 'replica'
> 2. Now we try to create a logical slot on standby, an error is thrown
> as wal_level is set to 'replica' as primary does not have any logical
> slot
> 3. Now we create a temporary logical slot on primary,
> effective_wal_level is set to logical.
> 4. Now we can create slots on standby as effective_wal_level is logical.
> 5. Now we exit the session of the primary server. The temporary slot
> is dropped. This will invalidate the slots on standby as the
> effective_wal_level will be set to 'replica'.
> So we can say that indirectly a temporary slot on primary can control
> the behaviour of permanent slots on standbys.
>
> I checked this behaviour in HEAD. In HEAD also the behaviour is the
> same. If we change the wal_level on primary from 'logical' to
> 'replica', all slots are invalidated on the standby.
> With patch this behaviour can be indirectly controlled by a temporary
> slot. Is it fine? Thoughts?

Your understanding is correct. I've discussed whether we need a way to
keep the auto-increased 'logical' WAL level on the primary when standbys
have logical slots. You mentioned the temporary logical slot case, but I
think the same is true when users accidentally drop the last logical
slot.

My understanding is that it's fine for logical decoding availability
on standbys to be controlled by the presence of logical slots (including
temporary slots) on the primary. This is essentially the same behavior
as the current one, and users who are concerned about indirectly
invalidating the logical slots on standbys can set wal_level to
'logical' on the primary. Whether we need to provide a way for users to
keep the auto-increased 'logical' WAL level on the primary is a separate
discussion (and patch).
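
As a rough illustration of the rule described above, here is a minimal
sketch in C. It is a simplified model under stated assumptions, not code
from the patch: the SketchWalLevel type and both sketch_* functions are
hypothetical names, and the standby rule simply mirrors the primary, which
is how the behavior is described in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the WAL level setting (illustration only). */
typedef enum SketchWalLevel
{
    SKETCH_WAL_MINIMAL,
    SKETCH_WAL_REPLICA,
    SKETCH_WAL_LOGICAL
} SketchWalLevel;

/*
 * Primary: a configured 'replica' level is raised to 'logical' for as long
 * as at least one logical slot (permanent or temporary) exists.
 */
SketchWalLevel
sketch_effective_on_primary(SketchWalLevel wal_level, int nlogicalslots)
{
    assert(nlogicalslots >= 0);

    if (wal_level == SKETCH_WAL_REPLICA && nlogicalslots > 0)
        return SKETCH_WAL_LOGICAL;

    return wal_level;
}

/*
 * Standby (assumption): logical decoding is available only while the
 * primary's effective level is 'logical', whatever the standby itself has
 * configured.
 */
bool
sketch_standby_can_decode(SketchWalLevel primary_effective)
{
    return primary_effective == SKETCH_WAL_LOGICAL;
}
```

In this model, dropping the last logical slot on a 'replica' primary
(including a temporary slot going away at session exit) makes
sketch_standby_can_decode() return false again, which corresponds to the
slot invalidation on the standby discussed above.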

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
Please find a few comments:

1)
ReplicationSlotsDropDBSlots:
+ bool dropped = false;

We can rename 'dropped' to 'dropped_logical', similar to ReplicationSlotCleanup().

2)
ReplicationSlotsDropDBSlots()
+
+ if (dropped && nlogicalslots == 0)
+ DisableLogicalDecodingIfNecessary();

I could not understand the need for the 'nlogicalslots' condition here.
Once we increment 'nlogicalslots', there is no way we can leave the
loop without dropping the slot, with the only exception of ERROR-ing
out if active_pid is non NULL. So if the loop has completed and we
have reached this stage, won't it essentially mean 'nlogicalslots' is 0
in both cases: a) we actually dropped a slot; b) we did not find
any slot to drop? Or am I missing something?

Same is the case with ReplicationSlotCleanup().

3)
Few typos:

+ /*
+ * Update shmem flags. We don't need to care about the order of setting
+ * global flag and writing the WAL record this case since writes are not
+ * allowed yet.
+ */

this case --> in this case

+ * This is the authoritative value used by the all process to determine

'used by all the processes'


049_effective_wal_level.pl:
4)

Few typos:
+# Initialize standby2 ndoe form the backup 'my_backup'.

ndoe form --> node from

+# Test the race condition between the startup and logical decoding
statuc change.

statuc --> status

5)
+# Promote the standby2 node that has one logical slot. So the logical decoding
+# keeps enabled even after the promotion.
+$standby2->promote;
+test_wal_level($standby2, "replica|logical",
+ "effective_wal_level keeps 'logical' even after the promotion");
+$standby2->safe_psql('postgres',
+ qq[select pg_create_logical_replication_slot('standby2_slot2', 'pgoutput')]
+);
+$standby2->stop;

Do we need 'pg_create_logical_replication_slot' here?

6)

+test_wal_level($primary, "replica|replica",
+ "effective_wal_level got decreased to 'replica' on primary");
+test_wal_level($standby3, "logical|replica",
+ "effective_wal_level got decreased to 'replica' on standby");
+test_wal_level($cascade, "replica|replica",
+ "effective_wal_level got decreased to 'logical' on standby");
+

The last one should also say: decreased to 'replica' (instead of 'logical')

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Shlok Kyal
Date:
On Fri, 15 Aug 2025 at 04:38, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Aug 12, 2025 at 1:26 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> >
> > Hi Sawada-san,
> >
> > I have reviewed the patch and have few comments:
>
> Thank  you for reviewing the patch!
>
> >
> > 1. There are some spelling mistakes in logicaldecoding.sgml
> > +     and requires waiting for any concurrent transactions to finish, ensureing
> > +     system-wide conistency. Conversely, when the last logical replication slot
> > ensureing -> ensuring
> > conistency -> consistency
>
> Will fix.
>
> >
> > 2. In publicationcmds.c:
> > +                errmsg("logical decoding needs to be enabled to
> > publish logical changes"),
> > +                errhint("Set \"wal_level\" >= \"logical\" or create a
> > logical replication slot with \"replica\" WAL level before creating
> > subscriptions.")));
> >
> > Should we use something like this for errhint ?
> > errhint("Set \"wal_level\" >= \"logical\" or create a logical
> > replication slot before creating subscriptions when \"wal_level\" =
> > \"replica\".")));
>
> The current message is:
>
> Set "wal_level" >= "logical" or create a logical replication slot with
> "replica" WAL level before creating subscriptions.
>
> whereas the proposed message is:
>
> Set "wal_level" >= "logical" or create a logical replication slot
> before creating subscriptions when "wal_level" = "replica".
>
> I don't see a big difference between the two sentences but your point
> is to use 'when "wal_level" = "replica"' instead of 'with "replica"
> WAL level'?
>
Yes. After reading the error messages again, I also think there is no
big difference, and I am fine with either.

> >
> >
> > 3. In logical.c:
> > +                errmsg("logical decoding needs to be enabled on the primary"),
> > +                errhint("Set \"wal_level\" >= \"logical\" or create
> > at least one logical slot with \"replica\" WAL level on the
> > primary.")));
> >
> > Should we change the errhint message as below?
> > errhint("Set \"wal_level\" >= \"logical\" or create at least one
> > logical slot on the primary when \"wal_level\" = \"replica\".")));
> >
> > 4. In slotsync.c:
> > +               errmsg("replication slot synchronization requires
> > logical decoding to be enabled"),
> > +               errhint("Set \"wal_level\" >= \"logical\" or create at
> > least one logical slot with \"replica\" WAL level on the primary "));
> >
> > Should we change the errhint message as below?
> > errhint("Set \"wal_level\" >= \"logical\" or create at least one
> > logical slot on the primary when \"wal_level\" = \"replica\".")));
>
> Please see the above question.
>
> >
> > ---------
> >
> > I have also tested the patch with creating multiple permanent/
> > temporary slots in concurrent sessions and I did not find any issue. I
> > also tested this patch with a physical replication setup.
> > I have a doubt in this case:
> > 1. Suppose we have a physical replication setup. wal_level on primary
> > is set to 'replica'
> > 2. Now we try to create a logical slot on standby, an error is thrown
> > as wal_level is set to 'replica' as primary does not have any logical
> > slot
> > 3. Now we create a temporary logical slot on primary,
> > effective_wal_level is set to logical.
> > 4. Now we can create slots on standby as effective_wal_level is logical.
> > 5. Now we exit the session of the primary server. The temporary slot
> > is dropped. This will invalidate the slots on standby as the
> > effective_wal_level will be set to 'replica'.
> > So we can say that indirectly a temporary slot on primary can control
> > the behaviour of permanent slots on standbys.
> >
> > I checked this behaviour in HEAD. In HEAD also the behaviour is the
> > same. If we change the wal_level on primary from 'logical' to
> > 'replica', all slots are invalidated on the standby.
> > With patch this behaviour can be indirectly controlled by a temporary
> > slot. Is it fine? Thoughts?
>
> Your understanding is correct. I've discussed whether we need a way to
> keep auto-increased 'logical' WAL level on the primary when standbys
> have logical slots. You mentioned temporary logical slots cases but I
> think the same is true for the case where users accidentally drop the
> last logical slot.
>
> My understanding is that it's fine that logical decoding availability
> on standbys is controlled by primary's logical slots (including temp
> slots) presence. This essentially is the same behavior as the current
> one and users who are concerned about indirectly invalidating the
> logical slots on standbys can set wal_level to 'logical' on the
> primary. It's a separate discussion (and patch) whether we need to
> provide a way for users to keep auto-increased 'logical' WAL level on
> the primary.
>
Thanks for the explanation. I agree with you.

Also, I did some testing with pg_createsubscriber and it is working
fine. I have some more comments:

1. In 040_pg_createsubscriber.pl we have:
# Check some unmet conditions on node P
$node_p->append_conf(
    'postgresql.conf', q{
wal_level = replica
max_replication_slots = 1
max_wal_senders = 1
max_worker_processes = 2
});
The comment says "Check some unmet conditions on node P", but with this
patch "wal_level = replica" will be a valid configuration, which
contradicts the comment. Should we remove "wal_level = replica" from
the append_conf?

2. If we plan to change the above then we should also remove
"wal_level = logical" in the following:
# Restore default settings here but only apply it after testing standby. Some
# standby settings should not be a lower setting than on the primary.
$node_p->append_conf(
    'postgresql.conf', q{
wal_level = logical
max_replication_slots = 10
max_wal_senders = 10
max_worker_processes = 8
});


Thanks,
Shlok Kyal



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Wed, Aug 20, 2025 at 3:11 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> Please find a few comments:

Thank you for reviewing the patch!

>
> 1)
> ReplicationSlotsDropDBSlots:
> + bool dropped = false;
>
> We can name 'dropped ' as 'dropped_logical' similar to ReplicationSlotCleanup.

I think we don't necessarily need to add 'logical' because this
function attempts to drop only logical slots unlike
ReplicationSlotCleanup().

>
> 2)
> ReplicationSlotsDropDBSlots()
> +
> + if (dropped && nlogicalslots == 0)
> + DisableLogicalDecodingIfNecessary();
>
> I could not understand the need of 'nlogicalslots' condition here?
> Once we increment 'nlogicalslots', there is no way we can skip the
> loop without dropping the slot with the only exception of ERROR-ing
> out if active_pid is non NULL. So if the loop has completed and we
> have reached this sage, won't it essentially mean 'nlogicalslots' is 0
> in both cases: a) we actually dropped any slot;  b) we did not find
> any slot to drop.  Or am I missing something?

I think I should have incremented nlogicalslots even for logical slots
on other databases. What I want to do here is to call
DisableLogicalDecodingIfNecessary() only when we have dropped at least
one logical slot and there are no logical slots left in the whole database
cluster as a result. If we have logical slots only on the current
database, we eventually reach the above 'if' statement with
dropped=true and nlogicalslots=0. On the other hand, if we have
logical slots also on other databases, we reach there with
dropped=true and nlogicalslots>0, meaning we don't want to disable
logical decoding. Does it make sense?
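
To make the intended condition concrete, here is a minimal sketch in C.
It is a simplified model rather than the patch's code: SketchSlot and
sketch_drop_db_slots() are hypothetical names, the scan is a single pass,
and locking and error handling are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a replication slot entry. */
typedef struct SketchSlot
{
    bool        in_use;
    bool        is_logical;
    unsigned    database;   /* owning database; meaningful for logical slots */
} SketchSlot;

/*
 * Drop every logical slot that belongs to database 'dboid'.
 *
 * Returns true only if at least one slot was dropped AND no logical slot
 * remains anywhere in the cluster, i.e. the case in which logical decoding
 * should be disabled.
 */
bool
sketch_drop_db_slots(SketchSlot *slots, size_t nslots, unsigned dboid)
{
    bool        dropped = false;
    int         nlogicalslots = 0;

    assert(slots != NULL || nslots == 0);

    for (size_t i = 0; i < nslots; i++)
    {
        SketchSlot *s = &slots[i];

        /* Unused and physical slots are irrelevant here. */
        if (!s->in_use || !s->is_logical)
            continue;

        if (s->database != dboid)
        {
            /* Logical slot on another database: it survives, so count it. */
            nlogicalslots++;
            continue;
        }

        /* Logical slot on the target database: drop it. */
        s->in_use = false;
        dropped = true;
    }

    return dropped && nlogicalslots == 0;
}
```

With logical slots only on the target database the function returns true;
with an additional logical slot on another database it returns false, so
logical decoding stays enabled.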

>
> Same is the case with ReplicationSlotCleanup().
>
> 3)
> Few typos:
>
> + /*
> + * Update shmem flags. We don't need to care about the order of setting
> + * global flag and writing the WAL record this case since writes are not
> + * allowed yet.
> + */
>
> this case --> in  this case
>
> + * This is the authoritative value used by the all process to determine
>
> 'used by all the processes'

Fixed.

> 049_effective_wal_level.pl:
> 4)
>
> Few typos:
> +# Initialize standby2 ndoe form the backup 'my_backup'.
>
> ndoe form --> node from
>
> +# Test the race condition between the startup and logical decoding
> statuc change.
>
> statuc --> status

Fixed.

>
> 5)
> +# Promote the standby2 node that has one logical slot. So the logical decoding
> +# keeps enabled even after the promotion.
> +$standby2->promote;
> +test_wal_level($standby2, "replica|logical",
> + "effective_wal_level keeps 'logical' even after the promotion");
> +$standby2->safe_psql('postgres',
> + qq[select pg_create_logical_replication_slot('standby2_slot2', 'pgoutput')]
> +);
> +$standby2->stop;
>
> Do we need 'pg_create_logical_replication_slot' here?

Yes, I put it there to check that we can still create a logical slot
after the promotion. I've added a comment to explain it.

>
> 6)
>
> +test_wal_level($primary, "replica|replica",
> + "effective_wal_level got decreased to 'replica' on primary");
> +test_wal_level($standby3, "logical|replica",
> + "effective_wal_level got decreased to 'replica' on standby");
> +test_wal_level($cascade, "replica|replica",
> + "effective_wal_level got decreased to 'logical' on standby");
> +
>
> Last one should also say:  decreased to 'replica' (instead of logical)

Fixed.

I've attached the updated patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Thu, Aug 21, 2025 at 3:50 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> On Fri, 15 Aug 2025 at 04:38, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Aug 12, 2025 at 1:26 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > >
> > >
> > > Hi Sawada-san,
> > >
> > > I have reviewed the patch and have few comments:
> >
> > Thank  you for reviewing the patch!
> >
> > >
> > > 1. There are some spelling mistakes in logicaldecoding.sgml
> > > +     and requires waiting for any concurrent transactions to finish, ensureing
> > > +     system-wide conistency. Conversely, when the last logical replication slot
> > > ensureing -> ensuring
> > > conistency -> consistency
> >
> > Will fix.
> >
> > >
> > > 2. In publicationcmds.c:
> > > +                errmsg("logical decoding needs to be enabled to
> > > publish logical changes"),
> > > +                errhint("Set \"wal_level\" >= \"logical\" or create a
> > > logical replication slot with \"replica\" WAL level before creating
> > > subscriptions.")));
> > >
> > > Should we use something like this for errhint ?
> > > errhint("Set \"wal_level\" >= \"logical\" or create a logical
> > > replication slot before creating subscriptions when \"wal_level\" =
> > > \"replica\".")));
> >
> > The current message is:
> >
> > Set "wal_level" >= "logical" or create a logical replication slot with
> > "replica" WAL level before creating subscriptions.
> >
> > whereas the proposed message is:
> >
> > Set "wal_level" >= "logical" or create a logical replication slot
> > before creating subscriptions when "wal_level" = "replica".
> >
> > I don't see a big difference between the two sentences but your point
> > is to use 'when "wal_level" = "replica"' instead of 'with "replica"
> > WAL level'?
> >
> Yes. After reading the error messages again, I also think there is no
> big difference and I am fine with both.
>
> > >
> > >
> > > 3. In logical.c:
> > > +                errmsg("logical decoding needs to be enabled on the primary"),
> > > +                errhint("Set \"wal_level\" >= \"logical\" or create
> > > at least one logical slot with \"replica\" WAL level on the
> > > primary.")));
> > >
> > > Should we change the errhint message as below?
> > > errhint("Set \"wal_level\" >= \"logical\" or create at least one
> > > logical slot on the primary when \"wal_level\" = \"replica\".")));
> > >
> > > 4. In slotsync.c:
> > > +               errmsg("replication slot synchronization requires
> > > logical decoding to be enabled"),
> > > +               errhint("Set \"wal_level\" >= \"logical\" or create at
> > > least one logical slot with \"replica\" WAL level on the primary "));
> > >
> > > Should we change the errhint message as below?
> > > errhint("Set \"wal_level\" >= \"logical\" or create at least one
> > > logical slot on the primary when \"wal_level\" = \"replica\".")));
> >
> > Please see the above question.
> >
> > >
> > > ---------
> > >
> > > I have also tested the patch with creating multiple permanent/
> > > temporary slots in concurrent sessions and I did not find any issue. I
> > > also tested this patch with a physical replication setup.
> > > I have a doubt in this case:
> > > 1. Suppose we have a physical replication setup. wal_level on primary
> > > is set to 'replica'
> > > 2. Now we try to create a logical slot on standby, an error is thrown
> > > as wal_level is set to 'replica' as primary does not have any logical
> > > slot
> > > 3. Now we create a temporary logical slot on primary,
> > > effective_wal_level is set to logical.
> > > 4. Now we can create slots on standby as effective_wal_level is logical.
> > > 5. Now we exit the session of the primary server. The temporary slot
> > > is dropped. This will invalidate the slots on standby as the
> > > effective_wal_level will be set to 'replica'.
> > > So we can say that indirectly a temporary slot on primary can control
> > > the behaviour of permanent slots on standbys.
> > >
> > > I checked this behaviour in HEAD. In HEAD also the behaviour is the
> > > same. If we change the wal_level on primary from 'logical' to
> > > 'replica', all slots are invalidated on the standby.
> > > With patch this behaviour can be indirectly controlled by a temporary
> > > slot. Is it fine? Thoughts?
> >
> > Your understanding is correct. I've discussed whether we need a way to
> > keep auto-increased 'logical' WAL level on the primary when standbys
> > have logical slots. You mentioned temporary logical slots cases but I
> > think the same is true for the case where users accidentally drop the
> > last logical slot.
> >
> > My understanding is that it's fine that logical decoding availability
> > on standbys is controlled by primary's logical slots (including temp
> > slots) presence. This essentially is the same behavior as the current
> > one and users who are concerned about indirectly invalidating the
> > logical slots on standbys can set wal_level to 'logical' on the
> > primary. It's a separate discussion (and patch) whether we need to
> > provide a way for users to keep auto-increased 'logical' WAL level on
> > the primary.
> >
> Thanks for the explanation. I agree with you.
>
> Also,
> I did some testing with pg_createsubscriber and it is working fine. I
> have some more comments:
>
> 1. in 040_pg_createsubscriber.pl we have:
> # Check some unmet conditions on node P
> $node_p->append_conf(
>     'postgresql.conf', q{
> wal_level = replica
> max_replication_slots = 1
> max_wal_senders = 1
> max_worker_processes = 2
> });
> Comment says: "Check some unmet conditions on node P". But with this
> patch "wal_level =  replica", will be a valid configuration, so it
> will be contradictory to comment. Should we remove "wal_level =
> replica" from the append_conf?
>
> 2. If we plan to change the above then we should also remove
> "wal_level = logical" in the following:
> # Restore default settings here but only apply it after testing standby. Some
> # standby settings should not be a lower setting than on the primary.
> $node_p->append_conf(
>     'postgresql.conf', q{
> wal_level = logical
> max_replication_slots = 10
> max_wal_senders = 10
> max_worker_processes = 8
> });

Thank you for testing and reviewing the patch! I agree with the above
comments, so I've incorporated them into the latest version of the
patch, which I've just submitted[1].

Regards,

[1] https://www.postgresql.org/message-id/CAD21AoBNCf_Yr%3Db7FbVpMPS4Vt6x-uqcLT3ELtATRFB9jUC3QQ%40mail.gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
On Thu, Aug 21, 2025 at 10:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Aug 20, 2025 at 3:11 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > Please find a few comments:
>
> Thank you for reviewing the patch!
>
> >
> > 1)
> > ReplicationSlotsDropDBSlots:
> > + bool dropped = false;
> >
> > We can name 'dropped ' as 'dropped_logical' similar to ReplicationSlotCleanup.
>
> I think we don't necessarily need to add 'logical' because this
> function attempts to drop only logical slots unlike
> ReplicationSlotCleanup().

Okay, I see. I missed that point earlier.

>
> >
> > 2)
> > ReplicationSlotsDropDBSlots()
> > +
> > + if (dropped && nlogicalslots == 0)
> > + DisableLogicalDecodingIfNecessary();
> >
> > I could not understand the need of 'nlogicalslots' condition here?
> > Once we increment 'nlogicalslots', there is no way we can skip the
> > loop without dropping the slot with the only exception of ERROR-ing
> > out if active_pid is non NULL. So if the loop has completed and we
> > have reached this sage, won't it essentially mean 'nlogicalslots' is 0
> > in both cases: a) we actually dropped any slot;  b) we did not find
> > any slot to drop.  Or am I missing something?
>
> I think I should have incremented nlogicalslots even for logical slots
> on other databases. What I want to do here is to call
> DisableLogicalDecodingIfNecessary() only when we have dropped at least
> one logical slots and there is no logical slots on the whole database
> cluster as a result. If we have logical slots only on the current
> database, we eventually reach the above 'if' statement with
> dropped=true and nlogicalslots=0. On the other hand, if we have
> logical slots also on other databases, we reach there with
> dropped=true and nlogicalslots>0, meaning we don't want to disable
> logical decoding. Does it make sense?
>

Yes, it makes sense after incrementing 'nlogicalslots' even for other databases.

> >
> > Same is the case with ReplicationSlotCleanup().
> >
> > 3)
> > Few typos:
> >
> > + /*
> > + * Update shmem flags. We don't need to care about the order of setting
> > + * global flag and writing the WAL record this case since writes are not
> > + * allowed yet.
> > + */
> >
> > this case --> in  this case
> >
> > + * This is the authoritative value used by the all process to determine
> >
> > 'used by all the processes'
>
> Fixed.
>
> > 049_effective_wal_level.pl:
> > 4)
> >
> > Few typos:
> > +# Initialize standby2 ndoe form the backup 'my_backup'.
> >
> > ndoe form --> node from
> >
> > +# Test the race condition between the startup and logical decoding
> > statuc change.
> >
> > statuc --> status
>
> Fixed.
>
> >
> > 5)
> > +# Promote the standby2 node that has one logical slot. So the logical decoding
> > +# keeps enabled even after the promotion.
> > +$standby2->promote;
> > +test_wal_level($standby2, "replica|logical",
> > + "effective_wal_level keeps 'logical' even after the promotion");
> > +$standby2->safe_psql('postgres',
> > + qq[select pg_create_logical_replication_slot('standby2_slot2', 'pgoutput')]
> > +);
> > +$standby2->stop;
> >
> > Do we need 'pg_create_logical_replication_slot' here?
>
> Yes, I put it to check if we can create a logical slot even after the
> promotion. I've added the comment to explain it.
>

Okay, makes sense.

> >
> > 6)
> >
> > +test_wal_level($primary, "replica|replica",
> > + "effective_wal_level got decreased to 'replica' on primary");
> > +test_wal_level($standby3, "logical|replica",
> > + "effective_wal_level got decreased to 'replica' on standby");
> > +test_wal_level($cascade, "replica|replica",
> > + "effective_wal_level got decreased to 'logical' on standby");
> > +
> >
> > Last one should also say:  decreased to 'replica' (instead of logical)
>
> Fixed.
>
> I've attached the updated patch.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Thu, Aug 21, 2025 at 8:11 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Thu, Aug 21, 2025 at 10:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Aug 20, 2025 at 3:11 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > Please find a few comments:
> >
> > Thank you for reviewing the patch!
> >
> > >
> > > 1)
> > > ReplicationSlotsDropDBSlots:
> > > + bool dropped = false;
> > >
> > > We can name 'dropped ' as 'dropped_logical' similar to ReplicationSlotCleanup.
> >
> > I think we don't necessarily need to add 'logical' because this
> > function attempts to drop only logical slots unlike
> > ReplicationSlotCleanup().
>
> Okay, I see. I missed that point earlier.
>
> >
> > >
> > > 2)
> > > ReplicationSlotsDropDBSlots()
> > > +
> > > + if (dropped && nlogicalslots == 0)
> > > + DisableLogicalDecodingIfNecessary();
> > >
> > > I could not understand the need of 'nlogicalslots' condition here?
> > > Once we increment 'nlogicalslots', there is no way we can skip the
> > > loop without dropping the slot with the only exception of ERROR-ing
> > > out if active_pid is non NULL. So if the loop has completed and we
> > > have reached this sage, won't it essentially mean 'nlogicalslots' is 0
> > > in both cases: a) we actually dropped any slot;  b) we did not find
> > > any slot to drop.  Or am I missing something?
> >
> > I think I should have incremented nlogicalslots even for logical slots
> > on other databases. What I want to do here is to call
> > DisableLogicalDecodingIfNecessary() only when we have dropped at least
> > one logical slots and there is no logical slots on the whole database
> > cluster as a result. If we have logical slots only on the current
> > database, we eventually reach the above 'if' statement with
> > dropped=true and nlogicalslots=0. On the other hand, if we have
> > logical slots also on other databases, we reach there with
> > dropped=true and nlogicalslots>0, meaning we don't want to disable
> > logical decoding. Does it make sense?
> >
>
> Yes, it makes sense after incrementing 'nlogicalslots' even for other databases.
>
> > >
> > > Same is the case with ReplicationSlotCleanup().
> > >
> > > 3)
> > > Few typos:
> > >
> > > + /*
> > > + * Update shmem flags. We don't need to care about the order of setting
> > > + * global flag and writing the WAL record this case since writes are not
> > > + * allowed yet.
> > > + */
> > >
> > > this case --> in  this case
> > >
> > > + * This is the authoritative value used by the all process to determine
> > >
> > > 'used by all the processes'
> >
> > Fixed.
> >
> > > 049_effective_wal_level.pl:
> > > 4)
> > >
> > > Few typos:
> > > +# Initialize standby2 ndoe form the backup 'my_backup'.
> > >
> > > ndoe form --> node from
> > >
> > > +# Test the race condition between the startup and logical decoding
> > > statuc change.
> > >
> > > statuc --> status
> >
> > Fixed.
> >
> > >
> > > 5)
> > > +# Promote the standby2 node that has one logical slot. So the logical decoding
> > > +# keeps enabled even after the promotion.
> > > +$standby2->promote;
> > > +test_wal_level($standby2, "replica|logical",
> > > + "effective_wal_level keeps 'logical' even after the promotion");
> > > +$standby2->safe_psql('postgres',
> > > + qq[select pg_create_logical_replication_slot('standby2_slot2', 'pgoutput')]
> > > +);
> > > +$standby2->stop;
> > >
> > > Do we need 'pg_create_logical_replication_slot' here?
> >
> > Yes, I put it to check if we can create a logical slot even after the
> > promotion. I've added the comment to explain it.
> >
>
> Okay, makes sense.
>
> > >
> > > 6)
> > >
> > > +test_wal_level($primary, "replica|replica",
> > > + "effective_wal_level got decreased to 'replica' on primary");
> > > +test_wal_level($standby3, "logical|replica",
> > > + "effective_wal_level got decreased to 'replica' on standby");
> > > +test_wal_level($cascade, "replica|replica",
> > > + "effective_wal_level got decreased to 'logical' on standby");
> > > +
> > >
> > > Last one should also say:  decreased to 'replica' (instead of logical)
> >
> > Fixed.
> >
> > I've attached the updated patch.

I found that we don't need to expose LogicalDecodingCtlData in the
logicalctl.h header file. I've made some cosmetic changes, including
that one.

I think the patch is getting into pretty good shape, and I am aiming to
get it committed during the September commitfest. Are there
any further tests or verifications we need? Of course, further patch
reviews are also welcome.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
On Sat, Aug 23, 2025 at 3:51 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
>
> I found that we don't need to expose LogicalDecodingCtlData in
> logicalctl.h header file. I've updated some cosmetic changes including
> that point.
>
> I think the patch is getting pretty good shape

Yes, I agree.

> and am aiming at
> getting this patch committed during the September commitfest.

Okay

> Is there
> any further tests and verifications we need? Of course further patch
> reviews are also welcome.

I'll spend more time reviewing and identifying the tests that are still pending.

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Shlok Kyal
Date:
On Sat, 23 Aug 2025 at 03:51, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Aug 21, 2025 at 8:11 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Thu, Aug 21, 2025 at 10:34 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Wed, Aug 20, 2025 at 3:11 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > Please find a few comments:
> > >
> > > Thank you for reviewing the patch!
> > >
> > > >
> > > > 1)
> > > > ReplicationSlotsDropDBSlots:
> > > > + bool dropped = false;
> > > >
> > > > We can name 'dropped ' as 'dropped_logical' similar to ReplicationSlotCleanup.
> > >
> > > I think we don't necessarily need to add 'logical' because this
> > > function attempts to drop only logical slots unlike
> > > ReplicationSlotCleanup().
> >
> > Okay, I see. I missed that point earlier.
> >
> > >
> > > >
> > > > 2)
> > > > ReplicationSlotsDropDBSlots()
> > > > +
> > > > + if (dropped && nlogicalslots == 0)
> > > > + DisableLogicalDecodingIfNecessary();
> > > >
> > > > I could not understand the need of 'nlogicalslots' condition here?
> > > > Once we increment 'nlogicalslots', there is no way we can skip the
> > > > loop without dropping the slot with the only exception of ERROR-ing
> > > > out if active_pid is non NULL. So if the loop has completed and we
> > > > have reached this sage, won't it essentially mean 'nlogicalslots' is 0
> > > > in both cases: a) we actually dropped any slot;  b) we did not find
> > > > any slot to drop.  Or am I missing something?
> > >
> > > I think I should have incremented nlogicalslots even for logical slots
> > > on other databases. What I want to do here is to call
> > > DisableLogicalDecodingIfNecessary() only when we have dropped at least
> > > one logical slots and there is no logical slots on the whole database
> > > cluster as a result. If we have logical slots only on the current
> > > database, we eventually reach the above 'if' statement with
> > > dropped=true and nlogicalslots=0. On the other hand, if we have
> > > logical slots also on other databases, we reach there with
> > > dropped=true and nlogicalslots>0, meaning we don't want to disable
> > > logical decoding. Does it make sense?
> > >
> >
> > Yes, it makes sense after incrementing 'nlogicalslots' even for other databases.
> >
> > > >
> > > > Same is the case with ReplicationSlotCleanup().
> > > >
> > > > 3)
> > > > Few typos:
> > > >
> > > > + /*
> > > > + * Update shmem flags. We don't need to care about the order of setting
> > > > + * global flag and writing the WAL record this case since writes are not
> > > > + * allowed yet.
> > > > + */
> > > >
> > > > this case --> in  this case
> > > >
> > > > + * This is the authoritative value used by the all process to determine
> > > >
> > > > 'used by all the processes'
> > >
> > > Fixed.
> > >
> > > > 049_effective_wal_level.pl:
> > > > 4)
> > > >
> > > > Few typos:
> > > > +# Initialize standby2 ndoe form the backup 'my_backup'.
> > > >
> > > > ndoe form --> node from
> > > >
> > > > +# Test the race condition between the startup and logical decoding
> > > > statuc change.
> > > >
> > > > statuc --> status
> > >
> > > Fixed.
> > >
> > > >
> > > > 5)
> > > > +# Promote the standby2 node that has one logical slot. So the logical decoding
> > > > +# keeps enabled even after the promotion.
> > > > +$standby2->promote;
> > > > +test_wal_level($standby2, "replica|logical",
> > > > + "effective_wal_level keeps 'logical' even after the promotion");
> > > > +$standby2->safe_psql('postgres',
> > > > + qq[select pg_create_logical_replication_slot('standby2_slot2', 'pgoutput')]
> > > > +);
> > > > +$standby2->stop;
> > > >
> > > > Do we need 'pg_create_logical_replication_slot' here?
> > >
> > > Yes, I put it to check if we can create a logical slot even after the
> > > promotion. I've added the comment to explain it.
> > >
> >
> > Okay, makes sense.
> >
> > > >
> > > > 6)
> > > >
> > > > +test_wal_level($primary, "replica|replica",
> > > > + "effective_wal_level got decreased to 'replica' on primary");
> > > > +test_wal_level($standby3, "logical|replica",
> > > > + "effective_wal_level got decreased to 'replica' on standby");
> > > > +test_wal_level($cascade, "replica|replica",
> > > > + "effective_wal_level got decreased to 'logical' on standby");
> > > > +
> > > >
> > > > Last one should also say:  decreased to 'replica' (instead of logical)
> > >
> > > Fixed.
> > >
> > > I've attached the updated patch.
>
> I found that we don't need to expose LogicalDecodingCtlData in
> logicalctl.h header file. I've updated some cosmetic changes including
> that point.
>
> I think the patch is getting pretty good shape and am aiming at
> getting this patch committed during the September commitfest. Is there
> any further tests and verifications we need? Of course further patch
> reviews are also welcome.
>

Hi Sawada-san,

I reviewed the latest patch and have the following comments:

1. In the commit message, the word 'slot' is missing:
When the first logical replication is created, the system
automatically increases the effective WAL level to maintain

Instead it should be:
When the first logical replication slot is created, ...

2. In slot.c:
+/*
+ * Returns if there is at least in-use logical replication slot.
+ */

Should we update it to:
Returns true if there is at least one in-use logical replication slot.

3. Due to a recent commit [1], we cannot use "sync_replication_slots" =
on when wal_level < logical.
We get the following error on the standby:
2025-08-25 16:37:04.757 IST [2901542] FATAL:  replication slot synchronization ("sync_replication_slots" = on) requires "wal_level" >= "logical"
If we set wal_level = logical on the standby, this error does not
appear and a slot sync worker is spawned.

With this patch, I think we can allow "sync_replication_slots" = on
when wal_level >= replica, as the standby will depend on
effective_wal_level on the primary. Thoughts?
I also see that, with the patch, pg_sync_replication_slots()
works with wal_level = replica.
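
The proposed relaxation can be sketched as follows. This is only an
illustration under stated assumptions: the enum and all function names
below are hypothetical and are not taken from the patch or from
PostgreSQL. The idea is that the configuration check on the standby
would accept wal_level >= replica, while whether synchronization can
actually run would depend on the primary's effective_wal_level.

```c
#include <stdbool.h>

/* Hypothetical ordering of WAL levels, lowest to highest. */
typedef enum WalLevelSketch
{
    SKETCH_WAL_MINIMAL = 0,
    SKETCH_WAL_REPLICA,
    SKETCH_WAL_LOGICAL
} WalLevelSketch;

/* Check behind the FATAL message quoted above: "wal_level" >= "logical". */
bool
slotsync_allowed_strict(WalLevelSketch standby_wal_level)
{
    return standby_wal_level >= SKETCH_WAL_LOGICAL;
}

/* Relaxed configuration check suggested here: "wal_level" >= "replica". */
bool
slotsync_allowed_relaxed(WalLevelSketch standby_wal_level)
{
    return standby_wal_level >= SKETCH_WAL_REPLICA;
}

/*
 * Assumed run-time condition: the standby passes the relaxed check and
 * the primary's effective level is 'logical'.
 */
bool
slotsync_can_run(WalLevelSketch standby_wal_level,
                 WalLevelSketch primary_effective_wal_level)
{
    return slotsync_allowed_relaxed(standby_wal_level) &&
        primary_effective_wal_level >= SKETCH_WAL_LOGICAL;
}
```

In this sketch a standby with wal_level = replica fails the strict
check but passes the relaxed one, and synchronization runs only while
the primary's effective level is 'logical'.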

[1]: https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=12da45742cfd15d9fab151b25400d96a1febcbde

Thanks,
Shlok Kyal



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Mon, Aug 25, 2025 at 6:02 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
>
> Hi Sawada-san,
>
> I reviewed the latest patch and have following comments:
>
> 1. In commit message, word 'slot' is missing:
> When the first logical replication is created, the system
> automatically increases the effective WAL level to maintain
>
> Instead it should be:
> When the first logical replication slot is created, ...

Fixed.

>
> 2. In slot.c:
> +/*
> + * Returns if there is at least in-use logical replication slot.
> + */
>
> Should we update it to:
> Returns true if there is at least one in-use logical replication slot.

Fixed.

>
> 3. Due to recent commit [1], we cannot use "sync_replication_slots" =
> on when wal_level < logical.
> We get following error on standby:
> 2025-08-25 16:37:04.757 IST [2901542] FATAL:  replication slot
> synchronization ("sync_replication_slots" = on) requires "wal_level"
> >= "logical"
> If we set the wal_level = logical on standby, then this error does not
> appear and a slot sync worker is spawned.
>
> With this patch, I think we can allow use of "sync_replication_slots"
> = on when wal_level >= replica as standby will be dependent on
> effective_wal_level on primary. Thoughts?
> I also see that with patch, the use of pg_sync_replication_slots()
> works with wal_level = replica.

Good point. I agree with you, so fixed.

I've attached the updated patch, which incorporates the comments and is
rebased onto the current HEAD.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
On Tue, Aug 26, 2025 at 12:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached the updated patch that incorporated the comments and is
> rebased to the current HEAD.
>

Thanks for the patch, please find a few comments concerning LOG messages:

1)
slotsync worker gives LOG:
LOG:  replication slot synchronization requires logical decoding to be enabled

By this LOG message, the user might not know how to enable logical
decoding. Shall we add HINT/DETAIL similar to other places:
To enable logical decoding on standby, set "wal_level" >= "logical" or
create at least one logical slot on the primary server.

2)
When we try to create a logical slot on the standby, it takes some time
until running-txns are logged on the primary. During that wait time, if
we drop the logical slot on the primary, thereby disabling logical
decoding on the standby, slot creation fails with:

postgres=# SELECT pg_create_logical_replication_slot('st_slot2',
'pgoutput', false, false, false);
ERROR:  canceling statement due to conflict with recovery
DETAIL:  User was using a logical replication slot that must be invalidated.

Do we need to tweak the message a little, as this new case is
not a case of invalidation?

3)
When a slot is invalidated on the standby, we get the message:

LOG:  invalidating obsolete replication slot "st_slot"
DETAIL:  Logical decoding on standby requires "wal_level" >= "logical"
or to create at least one logical slot on the primary server.

The DETAIL message looks slightly odd. Shall we change it to:
Logical decoding on standby requires "wal_level" >= "logical" or at
least one logical slot on the primary server.

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Tue, Aug 26, 2025 at 2:32 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Aug 26, 2025 at 12:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached the updated patch that incorporated the comments and is
> > rebased to the current HEAD.
> >
>
> Thanks for the patch, please find a few comments concerning LOG messages:

Thank you for reviewing the patch!

>
> 1)
> slotsync worker gives LOG:
> LOG:  replication slot synchronization requires logical decoding to be enabled
>
> By this LOG message, the user might not know how to enable logical
> decoding. Shall we add HINT/DETAIL similar to other places:
> To enable logical decoding on standby, set "wal_level" >= "logical" or
> create at least one logical slot on the primary server.

Sounds good.

>
> 2)
> When we try to create a logical slot on standby, it takes some time
> until runnign-txns are logged on primary. During that wait-time, if we
> drop logical slot on primary disabling logical_deocding on standby,
> then slot-creation fails with:
>
> postgres=# SELECT pg_create_logical_replication_slot('st_slot2',
> 'pgoutput', false, false, false);
> ERROR:  canceling statement due to conflict with recovery
> DETAIL:  User was using a logical replication slot that must be invalidated.
>
> Do we need to tweak the message a little bit as this new case is is
> not the case of invalidation?

I think this is a case of invalidation. Why do you think it's not?

>
> 3)
> When slot is invalidated on standby, we get message:
>
> LOG:  invalidating obsolete replication slot "st_slot"
> DETAIL:  Logical decoding on standby requires "wal_level" >= "logical"
> or to create at least one logical slot on the primary server.
>
> The DETAIL msg looks slightly odd. Shall we make it as:
> Logical decoding on standby requires "wal_level" >= "logical" or at
> least one logical slot on the primary server.

Agreed.

While testing the patch further, I found a race-condition bug that
occurs when the activation process is aborted. In the attached latest
version of the patch, I've fixed the bug and added a test case to the
new TAP test.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

RE: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Sawada-san,

Thanks for updating the patch. Here are my comments.

xlog_desc()
```
    else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
    {
        bool        enabled;

        memcpy(&enabled, rec, sizeof(bool));
        appendStringInfo(buf, enabled ? "true" : "false");
    }
```

Per 2075ba9, appendStringInfoString() can be used if we do not have other messages.
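
To illustrate the difference, here is a minimal, self-contained sketch.
It deliberately does not use PostgreSQL's StringInfo: MiniBuf,
minibuf_append_fmt() and minibuf_append_str() are hypothetical
stand-ins for a string buffer with a printf-style append and a plain
append. When the text is a fixed string with no format arguments, the
plain append skips format parsing and cannot misread a '%' in the text.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical fixed-size buffer; a stand-in, not PostgreSQL's StringInfo. */
typedef struct MiniBuf
{
    char    data[128];
    size_t  len;
} MiniBuf;

void
minibuf_init(MiniBuf *buf)
{
    buf->data[0] = '\0';
    buf->len = 0;
}

/* printf-style append: the format string is parsed on every call. */
void
minibuf_append_fmt(MiniBuf *buf, const char *fmt, ...)
{
    va_list args;
    int     n;

    va_start(args, fmt);
    n = vsnprintf(buf->data + buf->len, sizeof(buf->data) - buf->len, fmt, args);
    va_end(args);

    if (n > 0)
    {
        /* vsnprintf reports the untruncated length; clamp to what fit. */
        size_t  room = sizeof(buf->data) - buf->len - 1;

        buf->len += ((size_t) n < room) ? (size_t) n : room;
    }
}

/* Plain append: copies the string as-is, with no format parsing. */
void
minibuf_append_str(MiniBuf *buf, const char *s)
{
    size_t  room = sizeof(buf->data) - buf->len - 1;
    size_t  n = strlen(s);

    if (n > room)
        n = room;
    memcpy(buf->data + buf->len, s, n);
    buf->len += n;
    buf->data[buf->len] = '\0';
}

/* Describe a boolean status the way the hunk above does, via plain append. */
void
describe_status(MiniBuf *buf, bool enabled)
{
    minibuf_append_str(buf, enabled ? "true" : "false");
}
```

Calling describe_status() on an empty buffer leaves the same text in it
as minibuf_append_fmt(buf, "%s", "true") would, without going through
the formatter.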

logicalctl.h
```
extern void UpdateNumberOfLogicalSlots(bool incr);
```

This function is not implemented.

UpdateLogicalDecodingStatus()
```
    elog(DEBUG1, "update logical decoding status to %d", new_status);
```

I would prefer true/false instead of 1/0. Thoughts?

xlog_redo()
```
        /* Update the status on shared memory */
        memcpy(&logical_decoding, XLogRecGetData(record), sizeof(bool));
        UpdateLogicalDecodingStatus(logical_decoding, true);

        if (InRecovery && InHotStandby)
        {
            if (!logical_decoding)
            {
                /*
                 * Invalidate logical slots if we are in hot standby and the
                 * primary disabled the logical decoding.
                 */
                InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL,
                                                   0, InvalidOid,
                                                   InvalidTransactionId);

```

Assume that the logical_decoding value written in the WAL is false here, and that a logical
replication slot is created just after that. In my experiments, the following happened:

1. The startup process updated logical_decoding_enabled to false, at line 8652.
2. The slotsync worker started to sync. Surprisingly, it created a (second) logical
   slot and started logical decoding in fast_forward mode.
3. The startup process invalidated logical slots due to the wal_level. The slot created at
   step 2 was automatically dropped, because it was not sync-ready yet.
4. The startup process shut down the slotsync worker.
5. The startup process read the STATUS_CHANGE record again, which had the value "true",
   and requested a restart of the sync worker.
6. The restarted sync worker synchronized the slot again...

It works well for me, but it is a bit strange because 1) logical decoding is
started even when effective_wal_level is false, and 2) the synced slot is
dropped once with the message below:

```
LOG:  terminating process 1474448 to release replication slot "test2"
DETAIL:  Logical decoding on standby requires "wal_level" >= "logical" or at least one logical slot on the primary server.
CONTEXT:  WAL redo at 0/030000B8 for XLOG/LOGICAL_DECODING_STATUS_CHANGE: false
ERROR:  canceling statement due to conflict with recovery
DETAIL:  User was using a logical replication slot that must be invalidated.
```

Can we stop the sync worker before updating the status? IIUC this is one
possible solution.

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Wed, Aug 27, 2025 at 5:08 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Sawada-san,
>
> Thanks for updating the patch. Here are my comments.

Thank you for reviewing the patch!

>
> xlog_desc()
> ```
>         else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
>         {
>                 bool            enabled;
>
>                 memcpy(&enabled, rec, sizeof(bool));
>                 appendStringInfo(buf, enabled ? "true" : "false");
>         }
> ```
>
> Per 2075ba9, appendStringInfoString() can be used if we do not have other messages.

Agreed, will fix.

>
> logicalctl.h
> ```
> extern void UpdateNumberOfLogicalSlots(bool incr);
> ```
>
> This function is not implemented.

Removed.

>
> UpdateLogicalDecodingStatus()
> ```
>         elog(DEBUG1, "update logical decoding status to %d", new_status);
> ```
>
> I prefer to use true/false instead of 1/0, thought?

I think we don't necessarily need it as it's a debug log.

> xlog_redo()
> ```
>                 /* Update the status on shared memory */
>                 memcpy(&logical_decoding, XLogRecGetData(record), sizeof(bool));
>                 UpdateLogicalDecodingStatus(logical_decoding, true);
>
>                 if (InRecovery && InHotStandby)
>                 {
>                         if (!logical_decoding)
>                         {
>                                 /*
>                                  * Invalidate logical slots if we are in hot standby and the
>                                  * primary disabled the logical decoding.
>                                  */
>                                 InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL,
>                                                                                                    0, InvalidOid,
>                                                                    InvalidTransactionId);
>
> ```
>
> Assuming that logical_decoding written in the WAL is false here, and a logical
> replication slot is created just after that. In my experiments below happened:
>

Let me clarify each step:

> 1. startup process updated logical_decoding_enabled to false, at line 8652.

I assume that logical_decoding_enabled was true before step 1.

> 2. slotsync worker started to sync. Surprisingly, it created a (second) logical
>    slot and started logical decoding with fast_foward mode.

I guess that the postmaster launched the slotsync worker before the
startup process changed the status, since logical decoding was enabled
as I mentioned above, which seems fine to me.

> 3. startup invalidated logical slots due to the wal_level. the slot created at
>    step2 was automatically dropped, because it was not sync-readly yet.
> 4. startup process shut down the slotsync worker.
> 5. start process read the STATUS_CHANGE record again, which has the value "true".
>    it requested to restart the sync worker.
> 6. restarted sync worker synchronize the slot again...
>
> For me it works well but it is bit a strange because 1) logical decoding is
> started even when effective_wal_level is false,

I think it's a race condition between the postmaster and the startup
process. It could happen even between a backend and the startup
process: the startup process disables logical decoding right after the
backend passes the CheckLogicalDecodingRequirements() check. I think
it's technically okay since all WAL records before the STATUS_CHANGE
should have the logical information. Even if it starts to do logical
decoding, it would end up decoding the STATUS_CHANGE record and fail
with an error (see xlog_decode()).

> and 2) the synced slot is
> dropped once with below message:
>
> ```
> LOG:  terminating process 1474448 to release replication slot "test2"
> DETAIL:  Logical decoding on standby requires "wal_level" >= "logical" or at least one logical slot on the primary server.
> CONTEXT:  WAL redo at 0/030000B8 for XLOG/LOGICAL_DECODING_STATUS_CHANGE: false
> ERROR:  canceling statement due to conflict with recovery
> DETAIL:  User was using a logical replication slot that must be invalidated.
> ```
>
> Can we stop the sync worker before updating the status? IIUC this is one of the
> solution.

I think it would lead to another race condition; the slotsync worker
can start again before the status is updated.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



RE: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Sawada-san,

> > Assuming that logical_decoding written in the WAL is false here, and a logical
> > replication slot is created just after that. In my experiments below happened:
> >
> 
> Let me clarify each step:
> 
> > 1. startup process updated logical_decoding_enabled to false, at line 8652.
> 
> I assume that logical_decoding_enabled was enabled before step 1.

Right. Initially, a logical replication slot exists on both the primary and the standby.
In more detail: the standby slot was created by the slotsync worker.

> > 2. slotsync worker started to sync. Surprisingly, it created a (second) logical
> >    slot and started logical decoding with fast_foward mode.
> 
> I guess that the postmaster launched the slotsync worker before the
> startup changes the status since logical decoding was enabled as I
> mentioned above, which seems fine to me.

As you said, the slotsync worker has already been launched when the status is
changed. I felt that a logical slot should not be created after the status in
shared memory has changed.

> > 3. startup invalidated logical slots due to the wal_level. the slot created at
> >    step2 was automatically dropped, because it was not sync-readly yet.
> > 4. startup process shut down the slotsync worker.
> > 5. start process read the STATUS_CHANGE record again, which has the value
> "true".
> >    it requested to restart the sync worker.
> > 6. restarted sync worker synchronize the slot again...
> >
> > For me it works well but it is bit a strange because 1) logical decoding is
> > started even when effective_wal_level is false,
> 
> I think it's a race condition between the postmaster and the startup,
> it could happen even between the backend and the startup; the startup
> disables logical decoding right after the backend passes
> CheckLogicalDecodingRequirements() check. I think it's technically
> okay since all WAL records before the STATUS_CHANGE should have the
> logical information. Even if it starts to do logical decoding, it
> would end up decoding the STATUS_CHANGE record and with an error (see
> xlog_decode()).

To clarify, you think this does not need to be fixed because the system
eventually reaches the appropriate state, right?

> > and 2) the synced slot is
> > dropped once with below message:
> >
> > ```
> > LOG:  terminating process 1474448 to release replication slot "test2"
> > DETAIL:  Logical decoding on standby requires "wal_level" >= "logical" or at
> least one logical slot on the primary server.
> > CONTEXT:  WAL redo at 0/030000B8 for
> XLOG/LOGICAL_DECODING_STATUS_CHANGE: false
> > ERROR:  canceling statement due to conflict with recovery
> > DETAIL:  User was using a logical replication slot that must be invalidated.
> > ```
> >
> > Can we stop the sync worker before updating the status? IIUC this is one of the
> > solution.
> 
> I think it would lead to another race condition; the slotsync worker
> can start again before updating the status.

Hmm, okay.

Another small comment: this data structure is not used in other files, so there is no need to declare it extern.

```
extern LogicalDecodingCtlData *LogicalDecodingCtl;
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED 


Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
On Wed, Aug 27, 2025 at 12:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Aug 26, 2025 at 2:32 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, Aug 26, 2025 at 12:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I've attached the updated patch that incorporated the comments and is
> > > rebased to the current HEAD.
> > >
> >
> > Thanks for the patch, please find a few comments concerning LOG messages:
>
> Thank you for reviewing the patch!
>
> >
> > 1)
> > slotsync worker gives LOG:
> > LOG:  replication slot synchronization requires logical decoding to be enabled
> >
> > By this LOG message, the user might not know how to enable logical
> > decoding. Shall we add HINT/DETAIL similar to other places:
> > To enable logical decoding on standby, set "wal_level" >= "logical" or
> > create at least one logical slot on the primary server.
>
> Sounds good.
>
> >
> > 2)
> > When we try to create a logical slot on standby, it takes some time
> > until runnign-txns are logged on primary. During that wait-time, if we
> > drop logical slot on primary disabling logical_deocding on standby,
> > then slot-creation fails with:
> >
> > postgres=# SELECT pg_create_logical_replication_slot('st_slot2',
> > 'pgoutput', false, false, false);
> > ERROR:  canceling statement due to conflict with recovery
> > DETAIL:  User was using a logical replication slot that must be invalidated.
> >
> > Do we need to tweak the message a little bit as this new case is is
> > not the case of invalidation?
>
> I think this is the case of invalidation but why do you think it's not?
>

Sorry, I did not get it. Which slot got invalidated? The primary's slot
was dropped and the standby's slot did not even finish creation, so I am
confused by the DETAIL message.

> >
> > 3)
> > When slot is invalidated on standby, we get message:
> >
> > LOG:  invalidating obsolete replication slot "st_slot"
> > DETAIL:  Logical decoding on standby requires "wal_level" >= "logical"
> > or to create at least one logical slot on the primary server.
> >
> > The DETAIL msg looks slightly odd. Shall we make it as:
> > Logical decoding on standby requires "wal_level" >= "logical" or at
> > least one logical slot on the primary server.
>
> Agreed.
>
> When testing the patch further, I found a bug in a race condition in
> case of aborting the activation process. In the attached latest
> version patch, I've fixed the bug and included the test case to the
> new TAP test.
>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Wed, Aug 27, 2025 at 7:45 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Sawada-san,
>
> > > Assuming that logical_decoding written in the WAL is false here, and a logical
> > > replication slot is created just after that. In my experiments below happened:
> > >
> >
> > Let me clarify each step:
> >
> > > 1. startup process updated logical_decoding_enabled to false, at line 8652.
> >
> > I assume that logical_decoding_enabled was enabled before step 1.
>
> Right. Initially logical replication slot exist on both primary and standby.
> More detail; the standby slot was created by the slotsync worker.
>
> > > 2. slotsync worker started to sync. Surprisingly, it created a (second) logical
> > >    slot and started logical decoding with fast_foward mode.
> >
> > I guess that the postmaster launched the slotsync worker before the
> > startup changes the status since logical decoding was enabled as I
> > mentioned above, which seems fine to me.
>
> As you said, the slotsync worker has already been launched when the status is
> changed. I felt logical slot should not be created after the status on the shared
> memory is changed.
>
> > > 3. startup invalidated logical slots due to the wal_level. the slot created at
> > >    step2 was automatically dropped, because it was not sync-readly yet.
> > > 4. startup process shut down the slotsync worker.
> > > 5. start process read the STATUS_CHANGE record again, which has the value
> > "true".
> > >    it requested to restart the sync worker.
> > > 6. restarted sync worker synchronize the slot again...
> > >
> > > For me it works well but it is bit a strange because 1) logical decoding is
> > > started even when effective_wal_level is false,
> >
> > I think it's a race condition between the postmaster and the startup,
> > it could happen even between the backend and the startup; the startup
> > disables logical decoding right after the backend passes
> > CheckLogicalDecodingRequirements() check. I think it's technically
> > okay since all WAL records before the STATUS_CHANGE should have the
> > logical information. Even if it starts to do logical decoding, it
> > would end up decoding the STATUS_CHANGE record and with an error (see
> > xlog_decode()).

My understanding of where the synced slot starts was not right; it
starts from the remote slot's restart_lsn, which could be far ahead of
the STATUS_CHANGE record that the startup process is applying, and at
which logical decoding should be enabled. So the slotsync worker never
tries to decode non-logical WAL records, even if it advances the slot
after the startup process has disabled logical decoding.

> To clarify, are you thinking that it is no need to be fixed, because eventually
> the system becomes the appropriate state, right?

IIUC you're concerned that the slotsync worker could create or advance
a logical slot between the time the startup process changes the logical
decoding status to false and the time it sends the stop signal. TBH I
have no idea how to fix it efficiently. I've considered a simple idea:
the slotsync worker checks IsLogicalDecodingEnabled() before trying to
sync each logical slot. However, that doesn't solve the race condition;
the startup process can disable logical decoding right after the
slotsync worker has passed the check, in which case users would see a
logical slot created after logical decoding is disabled.
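
The window described in this paragraph can be replayed with a small,
single-threaded sketch. Everything in it is hypothetical (SharedState,
worker_check(), worker_create_slot(), startup_disable()); it only
models the ordering of a check followed by an action, not the actual
slotsync or startup code.

```c
#include <stdbool.h>

/* Hypothetical shared state; field names are illustrative only. */
typedef struct SharedState
{
    bool    logical_decoding_enabled;       /* flipped by the "startup" side */
    int     slots_created_while_disabled;   /* counts late slot creations */
} SharedState;

/* Step 1 of the worker: the pre-check considered above. */
bool
worker_check(const SharedState *st)
{
    return st->logical_decoding_enabled;
}

/* Step 2 of the worker: act on the earlier check result. */
void
worker_create_slot(SharedState *st, bool check_passed)
{
    if (!check_passed)
        return;
    /* The flag may have changed since the check; record it if it did. */
    if (!st->logical_decoding_enabled)
        st->slots_created_while_disabled++;
}

/* The "startup" side disabling logical decoding. */
void
startup_disable(SharedState *st)
{
    st->logical_decoding_enabled = false;
}

/* Replay one interleaving; returns how many slots were created too late. */
int
run_interleaving(bool disable_between_check_and_create)
{
    SharedState st = {true, 0};
    bool        ok;

    if (!disable_between_check_and_create)
        startup_disable(&st);   /* disable happens before the check */

    ok = worker_check(&st);

    if (disable_between_check_and_create)
        startup_disable(&st);   /* disable lands inside the window */

    worker_create_slot(&st, ok);
    return st.slots_created_while_disabled;
}
```

When the disable happens before the check, the worker skips the slot.
When it lands between the check and the creation, the slot is still
created, which is why the pre-check alone does not close the race.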

Another race condition that we might need to deal with is this: the
slotsync worker is launched while logical decoding is still enabled,
but if the startup process sends the stop signal before the worker has
stored its pid in SlotSyncCtx->pid, the worker will keep running. I've
added a !IsLogicalDecodingEnabled() check to the slotsync worker's
initialization.
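
A rough model of this second race, again as a single-threaded sketch
with hypothetical names (SyncState, startup_disable_and_signal(),
worker_init()). It assumes the worker re-checks the status after
registering its pid; that ordering is an assumption of the sketch, not
something stated above.

```c
#include <stdbool.h>

#define NO_PID 0

/* Hypothetical shared state for the sketch. */
typedef struct SyncState
{
    bool    logical_decoding_enabled;
    int     worker_pid;         /* NO_PID until the worker registers itself */
    bool    stop_requested;     /* set when a stop signal reaches the worker */
} SyncState;

/* Startup side: disable first, then signal whichever pid is registered. */
void
startup_disable_and_signal(SyncState *st)
{
    st->logical_decoding_enabled = false;
    if (st->worker_pid != NO_PID)
        st->stop_requested = true;  /* signal delivered */
    /* Otherwise nobody is registered yet and the signal is lost. */
}

/*
 * Worker initialization. Returns true if the worker keeps running.
 * recheck_status selects whether the extra status check is performed
 * after the pid has been registered.
 */
bool
worker_init(SyncState *st, int pid, bool recheck_status)
{
    st->worker_pid = pid;
    if (recheck_status && !st->logical_decoding_enabled)
        return false;           /* exit right away */
    return !st->stop_requested;
}

/* Startup signals before the worker registers; does the worker survive? */
bool
worker_survives_early_signal(bool recheck_status)
{
    SyncState st = {true, NO_PID, false};

    startup_disable_and_signal(&st);    /* signal lost: no pid yet */
    return worker_init(&st, 4242, recheck_status);
}
```

Without the re-check the worker survives the lost signal and keeps
running; with it, the worker notices that logical decoding is already
disabled and exits during initialization.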

>
> > > and 2) the synced slot is
> > > dropped once with below message:
> > >
> > > ```
> > > LOG:  terminating process 1474448 to release replication slot "test2"
> > > DETAIL:  Logical decoding on standby requires "wal_level" >= "logical" or at
> > least one logical slot on the primary server.
> > > CONTEXT:  WAL redo at 0/030000B8 for
> > XLOG/LOGICAL_DECODING_STATUS_CHANGE: false
> > > ERROR:  canceling statement due to conflict with recovery
> > > DETAIL:  User was using a logical replication slot that must be invalidated.
> > > ```
> > >
> > > Can we stop the sync worker before updating the status? IIUC this is one of the
> > > solution.
> >
> > I think it would lead to another race condition; the slotsync worker
> > can start again before updating the status.
>
> Hmm, okay.
>
> Another small comment: this data structure is not used in other files, no need to set extern.
>
> ```
> extern LogicalDecodingCtlData *LogicalDecodingCtl;
> ```

Removed.

I've attached the updated patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Thu, Aug 28, 2025 at 4:29 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Wed, Aug 27, 2025 at 12:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Aug 26, 2025 at 2:32 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Tue, Aug 26, 2025 at 12:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > I've attached the updated patch that incorporated the comments and is
> > > > rebased to the current HEAD.
> > > >
> > >
> > > Thanks for the patch, please find a few comments concerning LOG messages:
> >
> > Thank you for reviewing the patch!
> >
> > >
> > > 1)
> > > slotsync worker gives LOG:
> > > LOG:  replication slot synchronization requires logical decoding to be enabled
> > >
> > > By this LOG message, the user might not know how to enable logical
> > > decoding. Shall we add HINT/DETAIL similar to other places:
> > > To enable logical decoding on standby, set "wal_level" >= "logical" or
> > > create at least one logical slot on the primary server.
> >
> > Sounds good.
> >
> > >
> > > 2)
> > > When we try to create a logical slot on standby, it takes some time
> > > until runnign-txns are logged on primary. During that wait-time, if we
> > > drop logical slot on primary disabling logical_deocding on standby,
> > > then slot-creation fails with:
> > >
> > > postgres=# SELECT pg_create_logical_replication_slot('st_slot2',
> > > 'pgoutput', false, false, false);
> > > ERROR:  canceling statement due to conflict with recovery
> > > DETAIL:  User was using a logical replication slot that must be invalidated.
> > >
> > > Do we need to tweak the message a little bit as this new case is is
> > > not the case of invalidation?
> >
> > I think this is the case of invalidation but why do you think it's not?
> >
>
> Sorry, I did not get. Which slot got invalidated? Primary's slot was
> dropped and standby's slot did not even finish creation. So, I am
> confused with the detail-message.

Okay, so we would probably need to distinguish between "creating a
slot" and "using a slot"? Given that this scenario can happen even
today, it might be worth considering improving the error detail
message, but I think we should do that in a separate patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
On Fri, Aug 29, 2025 at 9:59 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Aug 28, 2025 at 4:29 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Wed, Aug 27, 2025 at 12:12 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Tue, Aug 26, 2025 at 2:32 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > On Tue, Aug 26, 2025 at 12:54 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > I've attached the updated patch that incorporated the comments and is
> > > > > rebased to the current HEAD.
> > > > >
> > > >
> > > > Thanks for the patch, please find a few comments concerning LOG messages:
> > >
> > > Thank you for reviewing the patch!
> > >
> > > >
> > > > 1)
> > > > slotsync worker gives LOG:
> > > > LOG:  replication slot synchronization requires logical decoding to be enabled
> > > >
> > > > By this LOG message, the user might not know how to enable logical
> > > > decoding. Shall we add HINT/DETAIL similar to other places:
> > > > To enable logical decoding on standby, set "wal_level" >= "logical" or
> > > > create at least one logical slot on the primary server.
> > >
> > > Sounds good.
> > >
> > > >
> > > > 2)
> > > > When we try to create a logical slot on standby, it takes some time
> > > > until runnign-txns are logged on primary. During that wait-time, if we
> > > > drop logical slot on primary disabling logical_deocding on standby,
> > > > then slot-creation fails with:
> > > >
> > > > postgres=# SELECT pg_create_logical_replication_slot('st_slot2',
> > > > 'pgoutput', false, false, false);
> > > > ERROR:  canceling statement due to conflict with recovery
> > > > DETAIL:  User was using a logical replication slot that must be invalidated.
> > > >
> > > > Do we need to tweak the message a little bit as this new case is is
> > > > not the case of invalidation?
> > >
> > > I think this is the case of invalidation but why do you think it's not?
> > >
> >
> > Sorry, I did not get. Which slot got invalidated? Primary's slot was
> > dropped and standby's slot did not even finish creation. So, I am
> > confused with the detail-message.
>
> Okay, so we would probably need to distinguish between "creating a
> slot" and "using a slot"? Given that this scenario can happen even
> today,  It might be worth considering improving the error detail
> message but I think we should do that in a separate patch.
>

Okay, I initially thought that this message was a problem with the current
patch, but I tested it without the patch. If create-slot is waiting
on the standby and we meanwhile switch wal_level on the primary to < logical and
restart the primary server, the standby's slot creation eventually fails
with the same error:

postgres=# SELECT pg_create_logical_replication_slot('sub22',
'pgoutput', false, false, false);
ERROR:  canceling statement due to conflict with recovery
DETAIL:  User was using a logical replication slot that must be invalidated.

So, yes we can consider fixing the message separately.

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Shlok Kyal
Date:
On Fri, 29 Aug 2025 at 09:38, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Aug 27, 2025 at 7:45 PM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > Dear Sawada-san,
> >
> > > > Assuming that logical_decoding written in the WAL is false here, and a logical
> > > > replication slot is created just after that. In my experiments below happened:
> > > >
> > >
> > > Let me clarify each step:
> > >
> > > > 1. startup process updated logical_decoding_enabled to false, at line 8652.
> > >
> > > I assume that logical_decoding_enabled was enabled before step 1.
> >
> > Right. Initially logical replication slot exist on both primary and standby.
> > More detail; the standby slot was created by the slotsync worker.
> >
> > > > 2. slotsync worker started to sync. Surprisingly, it created a (second) logical
> > > >    slot and started logical decoding with fast_foward mode.
> > >
> > > I guess that the postmaster launched the slotsync worker before the
> > > startup changes the status since logical decoding was enabled as I
> > > mentioned above, which seems fine to me.
> >
> > As you said, the slotsync worker has already been launched when the status is
> > changed. I felt logical slot() should not be created after the status on the shared
> > memory is changed.
> >
> > > > 3. startup invalidated logical slots due to the wal_level. the slot created at
> > > >    step2 was automatically dropped, because it was not sync-readly yet.
> > > > 4. startup process shut down the slotsync worker.
> > > > 5. start process read the STATUS_CHANGE record again, which has the value
> > > "true".
> > > >    it requested to restart the sync worker.
> > > > 6. restarted sync worker synchronize the slot again...
> > > >
> > > > For me it works well but it is bit a strange because 1) logical decoding is
> > > > started even when effective_wal_level is false,
> > >
> > > I think it's a race condition between the postmaster and the startup,
> > > it could happen even between the backend and the startup; the startup
> > > disables logical decoding right after the backend passes
> > > CheckLogicalDecodingRequirements() check. I think it's technically
> > > okay since all WAL records before the STATUS_CHANGE should have the
> > > logical information. Even if it starts to do logical decoding, it
> > > would end up decoding the STATUS_CHANGE record and with an error (see
> > > xlog_decode()).
>
> My understanding of where the synced slot starts to move was not
> right; it starts from the remote slot's restart_lsn, which could be
> far ahead from the STATUS_CHANGE record that the startup process is
> applying but where logical decoding should be enabled. It doesn't
> happen that the slotsync worker tries to decode non-logical WAL
> records even if it advances the slot after the startup disabled
> logical decoding.
>
> > To clarify, are you thinking that it is no need to be fixed, because eventually
> > the system becomes the appropriate state, right?
>
> IIUC you're concerned it's possible that the slotsync worker creates
> or advances a logical slot between the startup changes the logical
> decoding status to false and sends the stop signal. TBH I have no idea
> how efficiently to fix it. I've considered a simple idea that the
> slotsync worker checks IsLogicalDecodingEnabled() before trying to
> sync one logical slot. However, it doesn't solve the race condition;
> the startup process can disable logical decoding right after the
> slotsync passed the check, in which case users would see the logical
> slot is created after logical decoding is disabled.
>
> Another race condition that we might need to deal with is, the
> slotsync worker is launched while logical decoding is still enabled,
> but if the startup sends the stop signal to the slotsync worker before
> the worker sets its pid to SlotSyncCtx->pid, the worker will keep
> running. I've added the check !IsLogicalDecodingEnabled() to the
> slotsync worker's initialization.
>
> >
> > > > and 2) the synced slot is
> > > > dropped once with below message:
> > > >
> > > > ```
> > > > LOG:  terminating process 1474448 to release replication slot "test2"
> > > > DETAIL:  Logical decoding on standby requires "wal_level" >= "logical" or at
> > > least one logical slot on the primary server.
> > > > CONTEXT:  WAL redo at 0/030000B8 for
> > > XLOG/LOGICAL_DECODING_STATUS_CHANGE: false
> > > > ERROR:  canceling statement due to conflict with recovery
> > > > DETAIL:  User was using a logical replication slot that must be invalidated.
> > > > ```
> > > >
> > > > Can we stop the sync worker before updating the status? IIUC this is one of the
> > > > solution.
> > >
> > > I think it would lead to another race condition; the slotsync worker
> > > can start again before updating the status.
> >
> > Hmm, okay.
> >
> > Another small comment: this data structure is not used in other files, no need to set extern.
> >
> > ```
> > extern LogicalDecodingCtlData *LogicalDecodingCtl;
> > ```
>
> Removed.
>
> I've attached the updated patch.
>
Hi Sawada-san,

Thanks for the updated patch.

I have a question. When we create a publication (with wal_level set to
replica), we get a warning:
WARNING:  logical decoding needs to be enabled to publish logical changes
HINT:  Before creating subscriptions, set "wal_level" >= "logical" or
create a logical replication slot when "wal_level" = "replica".

The hint suggests that when wal_level = 'replica', before creating a
subscription, we should create logical slots on the publisher. But
when I tested this scenario, I created a subscription (without having
a prior logical slot on the publisher). The operation was successful,
the effective_wal_level was set appropriately and logical replication
was working fine. I think this happens because the CREATE SUBSCRIPTION
command itself creates a logical slot on the publisher.

Should we update the HINT message here?

Thanks,
Shlok Kyal



RE: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Sawada-san,

> My understanding of where the synced slot starts to move was not
> right; it starts from the remote slot's restart_lsn, which could be
> far ahead from the STATUS_CHANGE record that the startup process is
> applying but where logical decoding should be enabled. It doesn't
> happen that the slotsync worker tries to decode non-logical WAL
> records even if it advances the slot after the startup disabled
> logical decoding.

Let me confirm your point. If a slot is dropped and then another one is
created while the startup process is still replaying, the WAL records would be
ordered as below. Your point is that the restart_lsn of the created slot is
the beginning of (b), so that all records from there can be decoded, right?

```
STATUS_CHANGE true
RUNNING_XACTS            // (a) - generated by the first slot
...
STATUS_CHANGE false        // due to the slot drop
...
STATUS_CHANGE true        // from here all records are decode-safe
RUNNING_XACTS            // (b) - generated by the second slot, restart_lsn can set here
```
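
As a way to double-check the ordering above, here is a small standalone C
sketch. It is a toy model, not PostgreSQL code: the RecKind enum and
find_decode_safe_start() are hypothetical names invented for this
illustration, and it assumes logical decoding is disabled before the first
record. The function returns the index of the first RUNNING_XACTS record
that follows the last STATUS_CHANGE record, provided that last change was
"true"; for the sequence above that is (b).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record kinds, used only for this illustration. */
typedef enum
{
    REC_OTHER,
    REC_STATUS_CHANGE_TRUE,
    REC_STATUS_CHANGE_FALSE,
    REC_RUNNING_XACTS
} RecKind;

/*
 * Scan the records in WAL order. Every STATUS_CHANGE record discards the
 * current candidate, so the result is the first RUNNING_XACTS record seen
 * after the last STATUS_CHANGE, and only if that change enabled decoding.
 * Returns -1 when no such record exists.
 */
int
find_decode_safe_start(const RecKind *recs, size_t nrecs)
{
    int     candidate = -1;
    int     enabled = 0;    /* assumption: decoding starts out disabled */

    assert(recs != NULL || nrecs == 0);

    for (size_t i = 0; i < nrecs; i++)
    {
        switch (recs[i])
        {
            case REC_STATUS_CHANGE_TRUE:
                enabled = 1;
                candidate = -1;
                break;
            case REC_STATUS_CHANGE_FALSE:
                enabled = 0;
                candidate = -1;
                break;
            case REC_RUNNING_XACTS:
                if (enabled && candidate < 0)
                    candidate = (int) i;
                break;
            case REC_OTHER:
                break;
        }
    }
    return candidate;
}
```

Feeding it the seven records listed above (with "..." as REC_OTHER) picks
the last record, i.e. (b); a sequence that ends with STATUS_CHANGE false
yields -1.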

> IIUC you're concerned it's possible that the slotsync worker creates
> or advances a logical slot between the startup changes the logical
> decoding status to false and sends the stop signal.

Right.

> how efficiently to fix it. I've considered a simple idea that the
> slotsync worker checks IsLogicalDecodingEnabled() before trying to
> sync one logical slot. However, it doesn't solve the race condition;
> the startup process can disable logical decoding right after the
> slotsync passed the check, in which case users would see the logical
> slot is created after logical decoding is disabled.

So... even if we add a check in the decoding functions, the startup process can
still disable logical decoding right after the check. Is that also right?

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Fri, Aug 29, 2025 at 5:31 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Sawada-san,
>
> > My understanding of where the synced slot starts to move was not
> > right; it starts from the remote slot's restart_lsn, which could be
> > far ahead from the STATUS_CHANGE record that the startup process is
> > applying but where logical decoding should be enabled. It doesn't
> > happen that the slotsync worker tries to decode non-logical WAL
> > records even if it advances the slot after the startup disabled
> > logical decoding.
>
> Let me confirm your point. If the situation, which the slot is dropped and then
> created while the startup process processing, happens, the WAL records would be
> aligned like below. Your point is that the restart_lsn of the created slot is
> beginning of (b) so that all records can be decoded, right?
>
> ```
> STATUS_CHANGE true
> RUNNING_XACTS                   // (a) - generated by the first slot
> ...
> STATUS_CHANGE false             // due to the slot drop
> ...
> STATUS_CHANGE true              // from here all records are decode-safe
> RUNNING_XACTS                   // (b) - generated by the second slot, restart_lsn can set here
> ```

Yes. If I understand it correctly, even when the startup is processing
the second STATUS_CHANGE record (i.e., disabling logical decoding),
the synced slot uses the corresponding remote slot's restart_lsn,
i.e., (b). I believe that if the standby has not received the
RUNNING_XACTS record (b) yet at that point, the slotsync worker skips syncing
the slot (see the check at the top of synchronize_one_slot()).

>
> > how efficiently to fix it. I've considered a simple idea that the
> > slotsync worker checks IsLogicalDecodingEnabled() before trying to
> > sync one logical slot. However, it doesn't solve the race condition;
> > the startup process can disable logical decoding right after the
> > slotsync passed the check, in which case users would see the logical
> > slot is created after logical decoding is disabled.
>
> So... even if we can add check in decoding functions, the startup process can
> disable the logical decoding after that, is it also right?

I think so. The IsLogicalDecodingEnabled() check only tells whether
a process can start logical decoding; it doesn't cover
already running logical decoding processes. The slot invalidation
mechanism is responsible for that.
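
The interleaving discussed here can be written down as a deterministic toy
model in C. This is not PostgreSQL code: ToyState and all function names
below are hypothetical and exist only for this illustration. It shows why
an entry check alone cannot close the window (the status can flip between
the check and the slot creation) and why a later invalidation pass is what
cleans up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state for the illustration. */
typedef struct
{
    bool    decoding_enabled;   /* stands in for the shared-memory status */
    bool    slot_exists;
    bool    slot_invalidated;
} ToyState;

/* Worker, step 1: the entry check. */
bool
worker_check(const ToyState *s)
{
    assert(s != NULL);
    return s->decoding_enabled;
}

/* Worker, step 2: act on the result of the earlier (possibly stale) check. */
void
worker_create_slot(ToyState *s, bool check_passed)
{
    assert(s != NULL);
    if (check_passed)
        s->slot_exists = true;
}

/* Startup, step 1: flip the status to disabled. */
void
startup_disable_status(ToyState *s)
{
    assert(s != NULL);
    s->decoding_enabled = false;
}

/* Startup, step 2: invalidate whatever slot exists while decoding is off. */
void
startup_invalidate_slots(ToyState *s)
{
    assert(s != NULL);
    if (!s->decoding_enabled && s->slot_exists)
    {
        s->slot_exists = false;
        s->slot_invalidated = true;
    }
}
```

Running the steps in the order check, disable, create, invalidate leaves a
slot visible while decoding is already disabled, and only the invalidation
step removes it.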

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Fri, Aug 29, 2025 at 2:46 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
> Hi Sawada-san,
>
> Thanks for the updated patch.
>
> I have a doubt. When we create publication (when wal_level is set to
> replica) we get a warning:
> WARNING:  logical decoding needs to be enabled to publish logical changes
> HINT:  Before creating subscriptions, set "wal_level" >= "logical" or
> create a logical replication slot when "wal_level" = "replica".
>
> The hint suggests that when wal_level = 'replica', before creating a
> subscription, we should create logical slots on the publisher. But
> when I tested this scenario, I created a subscription (without having
> a prior logical slot on the publisher). The operation was successful,
> the effective_wal_level was set appropriately and logical replication
> was working fine. I think this happens because the CREATE SUBSCRIPTION
> command itself creates a logical slot on the publisher.
>
> Should we update the HINT message here?

Thank you for the comment! I believe the point is whether to hint that
creating a subscription is a third way to enable logical decoding. I'm
concerned that it could be redundant as CREATE SUBSCRIPTION with
create_slot=true actually creates the logical slot as you mentioned,
and we already mentioned it. If we add it to the hint, we would have
to mention it as well when we have commands in the future that
internally create a logical slot. I think that it's prudent to
mention the minimum requirement to enable logical decoding in the
hint.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
Few trivial comments for doc:



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
On Tue, Sep 2, 2025 at 10:24 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> Few trivial comments for doc:

Sorry, the email got sent without comments.

1)
+        It is important to note that when
<varname>wal_level</varname> is set to
+        <literal>replica</literal> the effective WAL level can
automatically change

comma after replica missing.

2)
Do we need to mention somewhere, as a CAUTION, that dropping the last
logical slot may disable logical decoding on the primary, resulting in
slot invalidation on the standby?

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Amit Kapila
Date:
On Fri, Aug 29, 2025 at 9:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached the updated patch.
>

Few comments:
=============
1.
+ * When XLogLogicalInfoActive() is true, guarantee that a subtransaction's
+ * xid can only be seen in the WAL stream if its toplevel xid has been
+ * logged before. If necessary we log an xact_assignment record with fewer
+ * than PGPROC_MAX_CACHED_SUBXIDS. Note that it is fine if didLogXid isn't
+ * set for a transaction even though it appears in a WAL record, we just
+ * might superfluously log something. That can happen when an xid is
+ * included somewhere inside a wal record, but not in XLogRecord->xl_xid,
+ * like in xl_standby_locks.
  */
  if (isSubXact && XLogLogicalInfoActive() &&
  !TopTransactionStateData.didLogXid)

Instead of writing "XLogLogicalInfoActive() is true" in the comment, can we
say "when effective wal_level is logical" and also point, if required, to
the place where the patch explains effective
wal_level? Otherwise, it sounds like we are restating what is apparent
from the code, and it may not be very clear.

2.
- /*
- * Invalidate logical slots if we are in hot standby and the primary
- * does not have a WAL level sufficient for logical decoding. No need
- * to search for potentially conflicting logically slots if standby is
- * running with wal_level lower than logical, because in that case, we
- * would have either disallowed creation of logical slots or
- * invalidated existing ones.
- */
- if (InRecovery && InHotStandby &&
- xlrec.wal_level < WAL_LEVEL_LOGICAL &&
- wal_level >= WAL_LEVEL_LOGICAL)
- InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL,
-    0, InvalidOid,
-    InvalidTransactionId);
-
  LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
  ControlFile->MaxConnections = xlrec.MaxConnections;
  ControlFile->max_worker_processes = xlrec.max_worker_processes;
@@ -8605,6 +8643,50 @@ xlog_redo(XLogReaderState *record)
  {
  /* nothing to do here, just for informational purposes */
  }
+ else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
+ {
+ bool logical_decoding;
+
+ /* Update the status on shared memory */
+ memcpy(&logical_decoding, XLogRecGetData(record), sizeof(bool));
+ UpdateLogicalDecodingStatus(logical_decoding, true);
+
+ if (InRecovery && InHotStandby)
+ {
+ if (!logical_decoding)

Like previously, shouldn't we have a check for standby's wal_level as
well due to the reasons mentioned in the removed comments?

3.
+ errmsg("logical decoding needs to be enabled to publish logical changes"),

This message doesn't sound intuitive. How about "logical decoding
should be allowed to publish logical changes"?

4.
+ else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
...
+ /*
+ * Request to launch or shutdown the slotsync worker depending on
+ * the new logical decoding status.
+ */

If we see a similar part in existing code as a handling of
XLOG_PARAMETER_CHANGE, we don't shutdown or restart slotsync worker,
so why do it as part of this patch? This new behaviour may be better
but shouldn't we try to handle it as a separate HEAD patch? Also, a
few additional comments explaining the rationale behind this would be
good.

5.
Assert(RecoveryInProgress());
  ereport(ERROR,
  (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("logical decoding on standby requires \"wal_level\" >=
\"logical\" on the primary")));
+ errmsg("logical decoding must be enabled on the primary")));

Can't we keep the tone of the existing message as it is? How about
"logical decoding on standby requires \"effective_wal_level\" >=
\"logical\" on the primary"? Also, if we agree with this, we could
have a similar change for other messages in the patch.

6. Can we write some comments as to why we didn't support wal_level to
be changed from minimal to logical? It will be helpful for future
readers/authors to understand what it would require to further extend
this functionality.

--
With Regards,
Amit Kapila.



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Shlok Kyal
Date:
On Tue, 2 Sept 2025 at 17:05, Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 29, 2025 at 9:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached the updated patch.
> >
>
> Few comments:
> =============
> 1.
> + * When XLogLogicalInfoActive() is true, guarantee that a subtransaction's
> + * xid can only be seen in the WAL stream if its toplevel xid has been
> + * logged before. If necessary we log an xact_assignment record with fewer
> + * than PGPROC_MAX_CACHED_SUBXIDS. Note that it is fine if didLogXid isn't
> + * set for a transaction even though it appears in a WAL record, we just
> + * might superfluously log something. That can happen when an xid is
> + * included somewhere inside a wal record, but not in XLogRecord->xl_xid,
> + * like in xl_standby_locks.
>   */
>   if (isSubXact && XLogLogicalInfoActive() &&
>   !TopTransactionStateData.didLogXid)
>
> Instead of writing XLogLogicalInfoActive() is true in comments, can we
> say When effective wal_level is logical and then also point to some
> place if required where the patch has explained about effective
> wal_level? Otherwise, it sounds like we are writing what is apparent
> from code and may not be very clear.
>
> 2.
> - /*
> - * Invalidate logical slots if we are in hot standby and the primary
> - * does not have a WAL level sufficient for logical decoding. No need
> - * to search for potentially conflicting logically slots if standby is
> - * running with wal_level lower than logical, because in that case, we
> - * would have either disallowed creation of logical slots or
> - * invalidated existing ones.
> - */
> - if (InRecovery && InHotStandby &&
> - xlrec.wal_level < WAL_LEVEL_LOGICAL &&
> - wal_level >= WAL_LEVEL_LOGICAL)
> - InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL,
> -    0, InvalidOid,
> -    InvalidTransactionId);
> -
>   LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
>   ControlFile->MaxConnections = xlrec.MaxConnections;
>   ControlFile->max_worker_processes = xlrec.max_worker_processes;
> @@ -8605,6 +8643,50 @@ xlog_redo(XLogReaderState *record)
>   {
>   /* nothing to do here, just for informational purposes */
>   }
> + else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
> + {
> + bool logical_decoding;
> +
> + /* Update the status on shared memory */
> + memcpy(&logical_decoding, XLogRecGetData(record), sizeof(bool));
> + UpdateLogicalDecodingStatus(logical_decoding, true);
> +
> + if (InRecovery && InHotStandby)
> + {
> + if (!logical_decoding)
>
> Like previously, shouldn't we have a check for standby's wal_level as
> well due to the reasons mentioned in the removed comments?
>
> 3.
> + errmsg("logical decoding needs to be enabled to publish logical changes"),
>
> This message doesn't sound intuitive. How about "logical decoding
> should be allowed to publish logical changes"?
>
> 4.
> + else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
> ...
> + /*
> + * Request to launch or shutdown the slotsync worker depending on
> + * the new logical decoding status.
> + */
>
> If we see a similar part in existing code as a handling of
> XLOG_PARAMETER_CHANGE, we don't shutdown or restart slotsync worker,
> so why do it as part of this patch? This new behaviour may be better
> but shouldn't we try to handle it as a separate HEAD patch? Also, a
> few additional comments explaining the rationale behind this would be
> good.
>

I tested the behaviour with HEAD and with the patch, and I confirmed that
the behaviour differs between the two.

Suppose we have a primary and a standby with wal_level = logical, and the
GUC parameters needed to enable the slot sync worker are set accordingly. A slot
sync worker will be running.
Now we change wal_level on the primary to replica and
restart the primary server.

With HEAD, during restart the existing sync_slot_worker will exit with:
2025-09-02 11:49:08.846 IST [3877882] ERROR:  synchronization worker
"" could not connect to the primary server: connection to server at
"localhost" (127.0.0.1), port 5432 failed: Connection refused
    Is the server running on that host and accepting TCP/IP connections?
2025-09-02 11:49:11.380 IST [3877885] FATAL:  streaming replication
receiver "walreceiver" could not connect to the primary server:
connection to server at "localhost" (127.0.0.1), port 5432 failed:
Connection refused
    Is the server running on that host and accepting TCP/IP connections?

and after the restart of the primary server, slot sync worker will
restart and it is able to connect to the primary.

With Patch, during restart the existing sync_slot_worker will exit.
But after the restart of the primary server, slot sync worker cannot
start and we can see following log:
2025-09-02 12:44:51.497 IST [3947520] LOG:  replication slot
synchronization worker is shutting down on receiving SIGINT
2025-09-02 12:44:51.498 IST [3943504] LOG:  replication slot
synchronization requires logical decoding to be enabled
2025-09-02 12:44:51.498 IST [3943504] HINT:  To enable logical
decoding on primary, set "wal_level" >= "logical" or create at least
one logical slot when "wal_level" = "replica".
2025-09-02 12:45:51.537 IST [3943504] LOG:  replication slot
synchronization requires logical decoding to be enabled
2025-09-02 12:45:51.537 IST [3943504] HINT:  To enable logical
decoding on primary, set "wal_level" >= "logical" or create at least
one logical slot when "wal_level" = "replica".

So, with HEAD, after we restart the primary server with 'wal_level =
replica', the slot sync worker can restart and connect to the primary
but with patch it cannot start after restart due to the check in
ValidateSlotSyncParams.

> 5.
> Assert(RecoveryInProgress());
>   ereport(ERROR,
>   (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> - errmsg("logical decoding on standby requires \"wal_level\" >=
> \"logical\" on the primary")));
> + errmsg("logical decoding must be enabled on the primary")));
>
> Can't we keep the tone of the existing message as it is? How about
> "logical decoding on standby requires \"effective_wal_level\" >=
> \"logical\" on the primary"? Also, if we agree with this, we could
> have a similar change for other messages in the patch.
>
> 6. Can we write some comments as to why we didn't support wal_level to
> be changed from minimal to logical? It will be helpful for future
> readers/authors to understand what it would require to further extend
> this functionality.
>

Thanks,
Shlok Kyal



RE: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Sawada-san,

Here are my comments.

01.
```
    checkPoint.logicalDecodingEnabled = IsLogicalDecodingEnabled();
```

Per my analysis, the value is always false here because StartupLogicalDecodingStatus
has not been called yet. Can we use "false" directly?

02.
```
elog(DEBUG1, "waiting for %d transactions to complete", running->xcnt);
```

Here the plural form is always used, even if there is only one running transaction.
How about something like:
```
Number of transactions to wait finishing: %d
```
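
For what it's worth, choosing between a singular and a plural wording could
also look like the standalone sketch below. format_wait_message() is a
hypothetical helper written only for this illustration; it is plain C using
snprintf() and is not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write the wait message into buf, picking a complete
 * singular or plural sentence based on the count. Returns what snprintf()
 * returns, i.e. the length the full message needs.
 */
int
format_wait_message(char *buf, size_t buflen, int xcnt)
{
    assert(buf != NULL && buflen > 0);

    if (xcnt == 1)
        return snprintf(buf, buflen,
                        "waiting for %d transaction to complete", xcnt);

    return snprintf(buf, buflen,
                    "waiting for %d transactions to complete", xcnt);
}
```

Keeping two complete sentences, rather than appending an "s", keeps each
variant readable as a whole message.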

03.
```
        while (RecoveryInProgress())
        {
            pgstat_report_wait_start(WAIT_EVENT_LOGICAL_DECODING_STATUS_CHANGE_DELAY);
            pg_usleep(100000L); /* wait for 100 msec */
            pgstat_report_wait_end();
        }
```

I found a case that gets stuck here: if a backend process is inside the loop while
the startup process waits for a signal to be processed, both of them can get stuck.
The backend waits for the recovery state to become DONE, and the startup process
waits for all processes to consume the signal.
IIUC we must add CHECK_FOR_INTERRUPTS() or ProcessProcSignalBarrier() to the loop.

Actual steps:

0.  constructed a streaming replication system in which only the primary server had
    a logical slot. I.e., the effective_wal_level was logical.
1.  connected to a standby node
2.  attached to the backend process via gdb
3.  added a breakpoint at create_logical_replication_slot
4.  called pg_create_logical_replication_slot() on the backend.
    the backend will stop before ReplicationSlotCreate().
5.  from another terminal, attached to the startup process via gdb
6.  added a breakpoint at UpdateLogicalDecodingStatusEndOfRecovery()
7.  from another terminal, send a promote signal to the standby.
    The startup will stop at UpdateLogicalDecodingStatusEndOfRecovery()
8.  executed steps on the startup process, until delay_status_change was updated
    and LogicalDecodingControlLock was released.
9.  detached from the backend process. It would stop at the loop in 
    start_logical_decoding_status_change().
10. detached from the startup process. It would wait for all processes to handle
    the signal, but the backend never does.
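
The mutual wait in the steps above can be modelled with a small
deterministic C sketch. It is a toy model and not PostgreSQL code:
ToyShared, startup_step() and backend_wait() are hypothetical names, and the
"check_interrupts" flag merely stands in for calling CHECK_FOR_INTERRUPTS()
(or ProcessProcSignalBarrier()) inside the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shared state for the illustration. */
typedef struct
{
    bool    recovery_in_progress;   /* cleared by startup when it can finish */
    bool    signal_pending;         /* set by startup, cleared by the backend */
} ToyShared;

/* Startup can finish recovery only after the backend consumed the signal. */
void
startup_step(ToyShared *sh)
{
    assert(sh != NULL);
    if (!sh->signal_pending)
        sh->recovery_in_progress = false;
}

/*
 * Backend wait loop. Returns true if recovery ended within max_iters
 * iterations, false if the loop would still be spinning.
 */
bool
backend_wait(ToyShared *sh, bool check_interrupts, int max_iters)
{
    assert(sh != NULL);

    for (int i = 0; i < max_iters; i++)
    {
        if (!sh->recovery_in_progress)
            return true;

        /* Stand-in for servicing interrupts inside the loop. */
        if (check_interrupts)
            sh->signal_pending = false;

        /* Give the startup side a chance to make progress. */
        startup_step(sh);
    }
    return !sh->recovery_in_progress;
}
```

Without the interrupt check neither side ever makes progress, however many
iterations are allowed; with it, the backend consumes the signal, the
startup side finishes, and the loop exits.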

Best regards,
Hayato Kuroda
FUJITSU LIMITED


Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Tue, Sep 2, 2025 at 4:35 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Aug 29, 2025 at 9:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached the updated patch.
> >
>
> Few comments:

Thank you for the comments!

> =============
> 1.
> + * When XLogLogicalInfoActive() is true, guarantee that a subtransaction's
> + * xid can only be seen in the WAL stream if its toplevel xid has been
> + * logged before. If necessary we log an xact_assignment record with fewer
> + * than PGPROC_MAX_CACHED_SUBXIDS. Note that it is fine if didLogXid isn't
> + * set for a transaction even though it appears in a WAL record, we just
> + * might superfluously log something. That can happen when an xid is
> + * included somewhere inside a wal record, but not in XLogRecord->xl_xid,
> + * like in xl_standby_locks.
>   */
>   if (isSubXact && XLogLogicalInfoActive() &&
>   !TopTransactionStateData.didLogXid)
>
> Instead of writing XLogLogicalInfoActive() is true in comments, can we
> say When effective wal_level is logical and then also point to some
> place if required where the patch has explained about effective
> wal_level? Otherwise, it sounds like we are writing what is apparent
> from code and may not be very clear.

Agreed.

>
> 2.
> - /*
> - * Invalidate logical slots if we are in hot standby and the primary
> - * does not have a WAL level sufficient for logical decoding. No need
> - * to search for potentially conflicting logically slots if standby is
> - * running with wal_level lower than logical, because in that case, we
> - * would have either disallowed creation of logical slots or
> - * invalidated existing ones.
> - */
> - if (InRecovery && InHotStandby &&
> - xlrec.wal_level < WAL_LEVEL_LOGICAL &&
> - wal_level >= WAL_LEVEL_LOGICAL)
> - InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL,
> -    0, InvalidOid,
> -    InvalidTransactionId);
> -
>   LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
>   ControlFile->MaxConnections = xlrec.MaxConnections;
>   ControlFile->max_worker_processes = xlrec.max_worker_processes;
> @@ -8605,6 +8643,50 @@ xlog_redo(XLogReaderState *record)
>   {
>   /* nothing to do here, just for informational purposes */
>   }
> + else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
> + {
> + bool logical_decoding;
> +
> + /* Update the status on shared memory */
> + memcpy(&logical_decoding, XLogRecGetData(record), sizeof(bool));
> + UpdateLogicalDecodingStatus(logical_decoding, true);
> +
> + if (InRecovery && InHotStandby)
> + {
> + if (!logical_decoding)
>
> Like previously, shouldn't we have a check for standby's wal_level as
> well due to the reasons mentioned in the removed comments?

IIUC we need to replay the STATUS_CHANGE record when wal_level is set
to 'replica' or 'logical'. If we want to add a check for standby's
wal_level, the check would be "wal_level >= WAL_LEVEL_REPLICA" but it
would be redundant as we already checked "InRecovery && InHotStandby".

> 3.
> + errmsg("logical decoding needs to be enabled to publish logical changes"),
>
> This message doesn't sound intuitive. How about "logical decoding
> should be allowed to publish logical changes"?

Agreed.

>
> 4.
> + else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
> ...
> + /*
> + * Request to launch or shutdown the slotsync worker depending on
> + * the new logical decoding status.
> + */
>
> If we see a similar part in existing code as a handling of
> XLOG_PARAMETER_CHANGE, we don't shutdown or restart slotsync worker,
> so why do it as part of this patch? This new behaviour may be better
> but shouldn't we try to handle it as a separate HEAD patch? Also, a
> few additional comments explaining the rationale behind this would be
> good.

Right, it seems out of scope. Removed that part.

> 5.
> Assert(RecoveryInProgress());
>   ereport(ERROR,
>   (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> - errmsg("logical decoding on standby requires \"wal_level\" >=
> \"logical\" on the primary")));
> + errmsg("logical decoding must be enabled on the primary")));
>
> Can't we keep the tone of the existing message as it is? How about
> "logical decoding on standby requires \"effective_wal_level\" >=
> \"logical\" on the primary"? Also, if we agree with this, we could
> have a similar change for other messages in the patch.

Agreed and changed all related places.

> 6. Can we write some comments as to why we didn't support wal_level to
> be changed from minimal to logical? It will be helpful for future
> readers/authors to understand what it would require to further extend
> this functionality.

Good point, I'll add the explanation.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Tue, Sep 2, 2025 at 8:11 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Sawada-san,
>
> Here are my comments.
>
> 01.
> ```
>         checkPoint.logicalDecodingEnabled = IsLogicalDecodingEnabled();
> ```
>
> Per my analysis, the value is always false here because StartupLogicalDecodingStatus
> is not called yet. Can we use "false" directly?

I think that it's better to read the shared flag instead of directly
setting false since LogicalDecodingCtl is already initialized.

>
> 02.
> ```
> elog(DEBUG1, "waiting for %d transactions to complete", running->xcnt);
> ```
>
> Here plural form is always used even if the running transaction is only one.
> How about something like:
> ```
> Number of transactions to wait finishing: %d
> ```

Hmm, I'm not sure it improves the message. I think we don't care much
about plurals in debug messages. And by our convention the main log
message doesn't start with a capital letter.


>
> 03.
> ```
>                 while (RecoveryInProgress())
>                 {
>                         pgstat_report_wait_start(WAIT_EVENT_LOGICAL_DECODING_STATUS_CHANGE_DELAY);
>                         pg_usleep(100000L); /* wait for 100 msec */
>                         pgstat_report_wait_end();
>                 }
> ```
>
> I found a stuck case here: if a backend process is within the loop while the startup
> process waits for a signal to be processed, both of them can get stuck. The backend waits
> for the recovery state to be DONE, and the startup process waits for all processes to
> consume the signal. IIUC we must add CHECK_FOR_INTERRUPTS() or ProcessProcSignalBarrier().
>
> Actual steps:
>
> 0.  constructed a streaming replication system in which only the primary server had
>     a logical slot, i.e., the effective_wal_level was logical.
> 1.  connected to a standby node
> 2.  attached to the backend process via gdb
> 3.  added a breakpoint at create_logical_replication_slot
> 4.  called pg_create_logical_replication_slot() on the backend.
>     the backend will stop before ReplicationSlotCreate().
> 5.  from another terminal, attached to the startup process via gdb
> 6.  added a breakpoint at UpdateLogicalDecodingStatusEndOfRecovery()
> 7.  from another terminal, sent a promote signal to the standby.
>     The startup process will stop at UpdateLogicalDecodingStatusEndOfRecovery()
> 8.  stepped through the startup process until delay_status_change was updated
>     and LogicalDecodingControlLock was released.
> 9.  detached from the backend process. It would stop at the loop in
>     start_logical_decoding_status_change().
> 10. detached from the startup process. It would wait for all processes to handle the
>     signal, but the backend never does.

Good find! I'll fix the problem by adding CHECK_FOR_INTERRUPTS() as
you suggested.
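
To illustrate why the interrupt check breaks the cycle, here is a toy,
single-threaded model of the two waiters. It is only a sketch and not
PostgreSQL code: SimState, backend_wait_iteration(), startup_step() and
simulate_wait() are hypothetical names, and clearing a flag merely
stands in for what CHECK_FOR_INTERRUPTS() would do when a barrier is
pending.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; every name here is hypothetical, not a PostgreSQL API. */
typedef struct
{
	bool		recovery_in_progress;	/* cleared by "startup" when it can finish */
	bool		barrier_pending;	/* set by "startup", cleared when absorbed */
} SimState;

/* One iteration of the backend's wait loop. */
static void
backend_wait_iteration(SimState *st, bool check_interrupts)
{
	/* Stands in for CHECK_FOR_INTERRUPTS() absorbing the pending barrier. */
	if (check_interrupts && st->barrier_pending)
		st->barrier_pending = false;
	/* The 100 msec sleep of the real loop would happen here. */
}

/* The startup process can end recovery only once the barrier is absorbed. */
static void
startup_step(SimState *st)
{
	if (!st->barrier_pending)
		st->recovery_in_progress = false;
}

/*
 * Returns how many iterations the backend waited before recovery ended,
 * or -1 if both sides are still waiting after max_iterations.
 */
int
simulate_wait(bool check_interrupts, int max_iterations)
{
	SimState	st = {.recovery_in_progress = true, .barrier_pending = true};
	int			i;

	assert(max_iterations >= 0);

	for (i = 0; i < max_iterations; i++)
	{
		if (!st.recovery_in_progress)
			return i;
		backend_wait_iteration(&st, check_interrupts);
		startup_step(&st);
	}
	return st.recovery_in_progress ? -1 : i;
}
```

In this model, simulate_wait(false, n) never makes progress for any n,
which mirrors the reported hang, while simulate_wait(true, n) finishes
after the first iteration.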



Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Thu, Sep 4, 2025 at 11:15 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Sep 2, 2025 at 8:11 PM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > Dear Sawada-san,
> >
> > Here are my comments.
> >
> > 01.
> > ```
> >         checkPoint.logicalDecodingEnabled = IsLogicalDecodingEnabled();
> > ```
> >
> > Per my analysis, the value is always false here because StartupLogicalDecodingStatus
> > is not called yet. Can we use "false" directly?
>
> I think that it's better to read the shared flag instead of directly
> setting false since LogicalDecodingCtl is already initialized.
>
> >
> > 02.
> > ```
> > elog(DEBUG1, "waiting for %d transactions to complete", running->xcnt);
> > ```
> >
> > Here plural form is always used even if the running transaction is only one.
> > How about something like:
> > ```
> > Number of transactions to wait finishing: %d
> > ```
>
> Hmm, not sure it improves the message. I think we don't care much
> about plurals in debug messages. And in our convention the main log
> message doesn't start a capital character.
>
>
> >
> > 03.
> > ```
> >                 while (RecoveryInProgress())
> >                 {
> >                         pgstat_report_wait_start(WAIT_EVENT_LOGICAL_DECODING_STATUS_CHANGE_DELAY);
> >                         pg_usleep(100000L); /* wait for 100 msec */
> >                         pgstat_report_wait_end();
> >                 }
> > ```
> >
> > I found a stuck case here: if a backend process is within the loop while the startup
> > process waits for a signal to be processed, both of them can get stuck. The backend waits
> > for the recovery state to be DONE, and the startup process waits for all processes to
> > consume the signal. IIUC we must add CHECK_FOR_INTERRUPTS() or ProcessProcSignalBarrier().
> >
> > Actual steps:
> >
> > 0.  constructed a streaming replication system in which only the primary server had
> >     a logical slot, i.e., the effective_wal_level was logical.
> > 1.  connected to a standby node
> > 2.  attached to the backend process via gdb
> > 3.  added a breakpoint at create_logical_replication_slot
> > 4.  called pg_create_logical_replication_slot() on the backend.
> >     the backend will stop before ReplicationSlotCreate().
> > 5.  from another terminal, attached to the startup process via gdb
> > 6.  added a breakpoint at UpdateLogicalDecodingStatusEndOfRecovery()
> > 7.  from another terminal, sent a promote signal to the standby.
> >     The startup process will stop at UpdateLogicalDecodingStatusEndOfRecovery()
> > 8.  stepped through the startup process until delay_status_change was updated
> >     and LogicalDecodingControlLock was released.
> > 9.  detached from the backend process. It would stop at the loop in
> >     start_logical_decoding_status_change().
> > 10. detached from the startup process. It would wait for all processes to handle the
> >     signal, but the backend never does.
>
> Good find! I'll fix the problem by adding CHECK_FOR_INTERRUPTS() as
> you suggested.

I've attached the updated patch that incorporated all comments I got so far.

FYI, I've been doing extensive long-running stability tests in my
environment for several days. The testing setup involves a
primary-standby replication configuration where logical decoding is
repeatedly enabled and disabled on the primary server. I verify that
there are no activation or deactivation process failures and confirm
that non-logical WAL records are written when logical decoding is
enabled. Additionally, I perform repeated failovers too. While no
issues have been identified so far, I plan to continue testing with
varied workloads and configurations.

One concern about this patch: although the patch's core concept is
straightforward, I'm particularly concerned about the requirement to
write a STATUS_CHANGE WAL record when dropping the last logical slot,
which could occur during process shutdown (specifically via a
before_shmem_exit callback). While the process forgets pending
cancellations and holds interrupts during shutdown callback execution,
the WAL record writing operation could involve waits and disk I/O etc.
Although our testing hasn't revealed any related issues, I'm concerned
this could become problematic in the future. I'd be happy to hear
opinions on this matter and am also open to alternative approaches in
terms of how to disable logical decoding.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Tue, Sep 2, 2025 at 5:12 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
>
>
> I tested the behaviour with HEAD and with Patch. And I confirmed the
> change in behaviour between HEAD and Patch
>
> Suppose we have a primary and a standby with wal_level = logical and
> guc parameters to enable slot sync worker are set accordingly. A slot
> sync worker will be running.
> Now we change the value of wal_level for primary to replica. And
> restart the primary server
>
> With HEAD, during restart the existing sync_slot_worker will exit with:
> 2025-09-02 11:49:08.846 IST [3877882] ERROR:  synchronization worker
> "" could not connect to the primary server: connection to server at
> "localhost" (127.0.0.1), port 5432 failed: Connection refused
> Is the server running on that host and accepting TCP/IP connections?
> 2025-09-02 11:49:11.380 IST [3877885] FATAL:  streaming replication
> receiver "walreceiver" could not connect to the primary server:
> connection to server at "localhost" (127.0.0.1), port 5432 failed:
> Connection refused
> Is the server running on that host and accepting TCP/IP connections?
>
> and after the restart of the primary server, slot sync worker will
> restart and it is able to connect to the primary.
>
> With Patch, during restart the existing sync_slot_worker will exit.
> But after the restart of the primary server, slot sync worker cannot
> start and we can see following log:
> 2025-09-02 12:44:51.497 IST [3947520] LOG:  replication slot
> synchronization worker is shutting down on receiving SIGINT
> 2025-09-02 12:44:51.498 IST [3943504] LOG:  replication slot
> synchronization requires logical decoding to be enabled
> 2025-09-02 12:44:51.498 IST [3943504] HINT:  To enable logical
> decoding on primary, set "wal_level" >= "logical" or create at least
> one logical slot when "wal_level" = "replica".
> 2025-09-02 12:45:51.537 IST [3943504] LOG:  replication slot
> synchronization requires logical decoding to be enabled
> 2025-09-02 12:45:51.537 IST [3943504] HINT:  To enable logical
> decoding on primary, set "wal_level" >= "logical" or create at least
> one logical slot when "wal_level" = "replica".
>
> So, with HEAD, after we restart the primary server with 'wal_level =
> replica', the slot sync worker can restart and connect to the primary
> but with patch it cannot start after restart due to the check in
> ValidateSlotSyncParams.

But the slotsync worker is launched again once logical decoding is
enabled, no? I'm not sure that we want to launch the slotsync worker
even when we know logical decoding is not enabled.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Fri, Sep 5, 2025 at 3:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Sep 4, 2025 at 11:15 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Sep 2, 2025 at 8:11 PM Hayato Kuroda (Fujitsu)
> > <kuroda.hayato@fujitsu.com> wrote:
> > >
> > > Dear Sawada-san,
> > >
> > > Here are my comments.
> > >
> > > 01.
> > > ```
> > >         checkPoint.logicalDecodingEnabled = IsLogicalDecodingEnabled();
> > > ```
> > >
> > > Per my analysis, the value is always false here because StartupLogicalDecodingStatus
> > > is not called yet. Can we use "false" directly?
> >
> > I think that it's better to read the shared flag instead of directly
> > setting false since LogicalDecodingCtl is already initialized.
> >
> > >
> > > 02.
> > > ```
> > > elog(DEBUG1, "waiting for %d transactions to complete", running->xcnt);
> > > ```
> > >
> > > Here plural form is always used even if the running transaction is only one.
> > > How about something like:
> > > ```
> > > Number of transactions to wait finishing: %d
> > > ```
> >
> > Hmm, not sure it improves the message. I think we don't care much
> > about plurals in debug messages. And in our convention the main log
> > message doesn't start a capital character.
> >
> >
> > >
> > > 03.
> > > ```
> > >                 while (RecoveryInProgress())
> > >                 {
> > >                         pgstat_report_wait_start(WAIT_EVENT_LOGICAL_DECODING_STATUS_CHANGE_DELAY);
> > >                         pg_usleep(100000L); /* wait for 100 msec */
> > >                         pgstat_report_wait_end();
> > >                 }
> > > ```
> > >
> > > I found a stuck case here: if a backend process is within the loop while the startup
> > > process waits for a signal to be processed, both of them can get stuck. The backend waits
> > > for the recovery state to be DONE, and the startup process waits for all processes to
> > > consume the signal. IIUC we must add CHECK_FOR_INTERRUPTS() or ProcessProcSignalBarrier().
> > >
> > > Actual steps:
> > >
> > > 0.  constructed a streaming replication system in which only the primary server had
> > >     a logical slot, i.e., the effective_wal_level was logical.
> > > 1.  connected to a standby node
> > > 2.  attached to the backend process via gdb
> > > 3.  added a breakpoint at create_logical_replication_slot
> > > 4.  called pg_create_logical_replication_slot() on the backend.
> > >     the backend will stop before ReplicationSlotCreate().
> > > 5.  from another terminal, attached to the startup process via gdb
> > > 6.  added a breakpoint at UpdateLogicalDecodingStatusEndOfRecovery()
> > > 7.  from another terminal, sent a promote signal to the standby.
> > >     The startup process will stop at UpdateLogicalDecodingStatusEndOfRecovery()
> > > 8.  stepped through the startup process until delay_status_change was updated
> > >     and LogicalDecodingControlLock was released.
> > > 9.  detached from the backend process. It would stop at the loop in
> > >     start_logical_decoding_status_change().
> > > 10. detached from the startup process. It would wait for all processes to handle the
> > >     signal, but the backend never does.
> >
> > Good find! I'll fix the problem by adding CHECK_FOR_INTERRUPTS() as
> > you suggested.
>
> I've attached the updated patch that incorporated all comments I got so far.

Sorry, I've attached the wrong version. Please find the attached correct one.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Amit Kapila
Date:
On Thu, Sep 4, 2025 at 1:24 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Sep 2, 2025 at 4:35 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Aug 29, 2025 at 9:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I've attached the updated patch.
> > >
> >
> > Few comments:
>
> Thank you for the comments!
>
> > =============
> > 1.
> > + * When XLogLogicalInfoActive() is true, guarantee that a subtransaction's
> > + * xid can only be seen in the WAL stream if its toplevel xid has been
> > + * logged before. If necessary we log an xact_assignment record with fewer
> > + * than PGPROC_MAX_CACHED_SUBXIDS. Note that it is fine if didLogXid isn't
> > + * set for a transaction even though it appears in a WAL record, we just
> > + * might superfluously log something. That can happen when an xid is
> > + * included somewhere inside a wal record, but not in XLogRecord->xl_xid,
> > + * like in xl_standby_locks.
> >   */
> >   if (isSubXact && XLogLogicalInfoActive() &&
> >   !TopTransactionStateData.didLogXid)
> >
> > Instead of writing XLogLogicalInfoActive() is true in comments, can we
> > say When effective wal_level is logical and then also point to some
> > place if required where the patch has explained about effective
> > wal_level? Otherwise, it sounds like we are writing what is apparent
> > from code and may not be very clear.
>
> Agreed.
>
> >
> > 2.
> > - /*
> > - * Invalidate logical slots if we are in hot standby and the primary
> > - * does not have a WAL level sufficient for logical decoding. No need
> > - * to search for potentially conflicting logically slots if standby is
> > - * running with wal_level lower than logical, because in that case, we
> > - * would have either disallowed creation of logical slots or
> > - * invalidated existing ones.
> > - */
> > - if (InRecovery && InHotStandby &&
> > - xlrec.wal_level < WAL_LEVEL_LOGICAL &&
> > - wal_level >= WAL_LEVEL_LOGICAL)
> > - InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL,
> > -    0, InvalidOid,
> > -    InvalidTransactionId);
> > -
> >   LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
> >   ControlFile->MaxConnections = xlrec.MaxConnections;
> >   ControlFile->max_worker_processes = xlrec.max_worker_processes;
> > @@ -8605,6 +8643,50 @@ xlog_redo(XLogReaderState *record)
> >   {
> >   /* nothing to do here, just for informational purposes */
> >   }
> > + else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
> > + {
> > + bool logical_decoding;
> > +
> > + /* Update the status on shared memory */
> > + memcpy(&logical_decoding, XLogRecGetData(record), sizeof(bool));
> > + UpdateLogicalDecodingStatus(logical_decoding, true);
> > +
> > + if (InRecovery && InHotStandby)
> > + {
> > + if (!logical_decoding)
> >
> > Like previously, shouldn't we have a check for standby's wal_level as
> > well due to the reasons mentioned in the removed comments?
>
> IIUC we need to replay the STATUS_CHANGE record when wal_level is set
> to 'replica' or 'logical'. If we want to add a check for standby's
> wal_level, the check would be "wal_level >= WAL_LEVEL_REPLICA" but it
> would be redundant as we already checked "InRecovery && InHotStandby".
>

If we want to mimic the current implementation, won't
effective_wal_level be 'logical' even on standby? Otherwise, there
shouldn't be any logical slots which can be invalidated.

--
With Regards,
Amit Kapila.



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Amit Kapila
Date:
On Sat, Sep 6, 2025 at 3:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Sep 2, 2025 at 5:12 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> >
> >
> > I tested the behaviour with HEAD and with Patch. And I confirmed the
> > change in behaviour between HEAD and Patch
> >
> > Suppose we have a primary and a standby with wal_level = logical and
> > guc parameters to enable slot sync worker are set accordingly. A slot
> > sync worker will be running.
> > Now we change the value of wal_level for primary to replica. And
> > restart the primary server
> >
> > With HEAD, during restart the existing sync_slot_worker will exit with:
> > 2025-09-02 11:49:08.846 IST [3877882] ERROR:  synchronization worker
> > "" could not connect to the primary server: connection to server at
> > "localhost" (127.0.0.1), port 5432 failed: Connection refused
> > Is the server running on that host and accepting TCP/IP connections?
> > 2025-09-02 11:49:11.380 IST [3877885] FATAL:  streaming replication
> > receiver "walreceiver" could not connect to the primary server:
> > connection to server at "localhost" (127.0.0.1), port 5432 failed:
> > Connection refused
> > Is the server running on that host and accepting TCP/IP connections?
> >
> > and after the restart of the primary server, slot sync worker will
> > restart and it is able to connect to the primary.
> >
> > With Patch, during restart the existing sync_slot_worker will exit.
> > But after the restart of the primary server, slot sync worker cannot
> > start and we can see following log:
> > 2025-09-02 12:44:51.497 IST [3947520] LOG:  replication slot
> > synchronization worker is shutting down on receiving SIGINT
> > 2025-09-02 12:44:51.498 IST [3943504] LOG:  replication slot
> > synchronization requires logical decoding to be enabled
> > 2025-09-02 12:44:51.498 IST [3943504] HINT:  To enable logical
> > decoding on primary, set "wal_level" >= "logical" or create at least
> > one logical slot when "wal_level" = "replica".
> > 2025-09-02 12:45:51.537 IST [3943504] LOG:  replication slot
> > synchronization requires logical decoding to be enabled
> > 2025-09-02 12:45:51.537 IST [3943504] HINT:  To enable logical
> > decoding on primary, set "wal_level" >= "logical" or create at least
> > one logical slot when "wal_level" = "replica".
> >
> > So, with HEAD, after we restart the primary server with 'wal_level =
> > replica', the slot sync worker can restart and connect to the primary
> > but with patch it cannot start after restart due to the check in
> > ValidateSlotSyncParams.
>
> But the slotsync worker is launched again once logical decoding is
> enabled, no? I'm not sure that we want to launch the slotsync worker
> also when we know logical decoding is not enabled.
>

Why did the logical decoding enabled check fail in the first place?
IIUC, the wal_level on the standby is still 'logical'.

--
With Regards,
Amit Kapila.



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Fri, Sep 5, 2025 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 4, 2025 at 1:24 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Sep 2, 2025 at 4:35 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Aug 29, 2025 at 9:38 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > I've attached the updated patch.
> > > >
> > >
> > > Few comments:
> >
> > Thank you for the comments!
> >
> > > =============
> > > 1.
> > > + * When XLogLogicalInfoActive() is true, guarantee that a subtransaction's
> > > + * xid can only be seen in the WAL stream if its toplevel xid has been
> > > + * logged before. If necessary we log an xact_assignment record with fewer
> > > + * than PGPROC_MAX_CACHED_SUBXIDS. Note that it is fine if didLogXid isn't
> > > + * set for a transaction even though it appears in a WAL record, we just
> > > + * might superfluously log something. That can happen when an xid is
> > > + * included somewhere inside a wal record, but not in XLogRecord->xl_xid,
> > > + * like in xl_standby_locks.
> > >   */
> > >   if (isSubXact && XLogLogicalInfoActive() &&
> > >   !TopTransactionStateData.didLogXid)
> > >
> > > Instead of writing XLogLogicalInfoActive() is true in comments, can we
> > > say When effective wal_level is logical and then also point to some
> > > place if required where the patch has explained about effective
> > > wal_level? Otherwise, it sounds like we are writing what is apparent
> > > from code and may not be very clear.
> >
> > Agreed.
> >
> > >
> > > 2.
> > > - /*
> > > - * Invalidate logical slots if we are in hot standby and the primary
> > > - * does not have a WAL level sufficient for logical decoding. No need
> > > - * to search for potentially conflicting logically slots if standby is
> > > - * running with wal_level lower than logical, because in that case, we
> > > - * would have either disallowed creation of logical slots or
> > > - * invalidated existing ones.
> > > - */
> > > - if (InRecovery && InHotStandby &&
> > > - xlrec.wal_level < WAL_LEVEL_LOGICAL &&
> > > - wal_level >= WAL_LEVEL_LOGICAL)
> > > - InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL,
> > > -    0, InvalidOid,
> > > -    InvalidTransactionId);
> > > -
> > >   LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
> > >   ControlFile->MaxConnections = xlrec.MaxConnections;
> > >   ControlFile->max_worker_processes = xlrec.max_worker_processes;
> > > @@ -8605,6 +8643,50 @@ xlog_redo(XLogReaderState *record)
> > >   {
> > >   /* nothing to do here, just for informational purposes */
> > >   }
> > > + else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
> > > + {
> > > + bool logical_decoding;
> > > +
> > > + /* Update the status on shared memory */
> > > + memcpy(&logical_decoding, XLogRecGetData(record), sizeof(bool));
> > > + UpdateLogicalDecodingStatus(logical_decoding, true);
> > > +
> > > + if (InRecovery && InHotStandby)
> > > + {
> > > + if (!logical_decoding)
> > >
> > > Like previously, shouldn't we have a check for standby's wal_level as
> > > well due to the reasons mentioned in the removed comments?
> >
> > IIUC we need to replay the STATUS_CHANGE record when wal_level is set
> > to 'replica' or 'logical'. If we want to add a check for standby's
> > wal_level, the check would be "wal_level >= WAL_LEVEL_REPLICA" but it
> > would be redundant as we already checked "InRecovery && InHotStandby".
> >
>
> If we want to mimic the current implementation, won't
> effective_wal_level be 'logical' even on standby? Otherwise, there
> shouldn't be any logical slots which can be invalidated.

Yes, effective_wal_level should be logical on the standby in this
case. But when replaying STATUS_CHANGE with logical_decoding=false
(i.e., !logical_decoding), it's obvious that the previous
effective_wal_level was logical, no?
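
For comparison, the two conditions under discussion can be written as
plain predicates. This is only a sketch under assumptions: SkWalLevel
and both function names are hypothetical, a single in_hot_standby
argument stands in for "InRecovery && InHotStandby", and the second
predicate assumes that replaying STATUS_CHANGE with
logical_decoding=false is what triggers slot invalidation, as the
exchange above implies.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the server's wal_level values. */
typedef enum
{
	SK_WAL_MINIMAL,
	SK_WAL_REPLICA,
	SK_WAL_LOGICAL
} SkWalLevel;

/*
 * Shape of the condition removed from the parameter-change replay path:
 * the primary dropped below 'logical' while the standby itself is
 * configured with wal_level >= 'logical'.
 */
bool
old_should_invalidate(bool in_hot_standby,
					  SkWalLevel primary_level,
					  SkWalLevel standby_level)
{
	return in_hot_standby &&
		primary_level < SK_WAL_LOGICAL &&
		standby_level >= SK_WAL_LOGICAL;
}

/*
 * Shape of the condition discussed for STATUS_CHANGE replay: the record
 * itself says that logical decoding has just been disabled, so no
 * separate check of the standby's wal_level is made.
 */
bool
new_should_invalidate(bool in_hot_standby, bool logical_decoding)
{
	return in_hot_standby && !logical_decoding;
}
```

The second predicate relies on the argument made above: a record with
logical_decoding=false can only follow a state in which the effective
level was logical.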

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Fri, Sep 5, 2025 at 9:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Sep 6, 2025 at 3:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Sep 2, 2025 at 5:12 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > >
> > >
> > > I tested the behaviour with HEAD and with Patch. And I confirmed the
> > > change in behaviour between HEAD and Patch
> > >
> > > Suppose we have a primary and a standby with wal_level = logical and
> > > guc parameters to enable slot sync worker are set accordingly. A slot
> > > sync worker will be running.
> > > Now we change the value of wal_level for primary to replica. And
> > > restart the primary server
> > >
> > > With HEAD, during restart the existing sync_slot_worker will exit with:
> > > 2025-09-02 11:49:08.846 IST [3877882] ERROR:  synchronization worker
> > > "" could not connect to the primary server: connection to server at
> > > "localhost" (127.0.0.1), port 5432 failed: Connection refused
> > > Is the server running on that host and accepting TCP/IP connections?
> > > 2025-09-02 11:49:11.380 IST [3877885] FATAL:  streaming replication
> > > receiver "walreceiver" could not connect to the primary server:
> > > connection to server at "localhost" (127.0.0.1), port 5432 failed:
> > > Connection refused
> > > Is the server running on that host and accepting TCP/IP connections?
> > >
> > > and after the restart of the primary server, slot sync worker will
> > > restart and it is able to connect to the primary.
> > >
> > > With Patch, during restart the existing sync_slot_worker will exit.
> > > But after the restart of the primary server, slot sync worker cannot
> > > start and we can see following log:
> > > 2025-09-02 12:44:51.497 IST [3947520] LOG:  replication slot
> > > synchronization worker is shutting down on receiving SIGINT
> > > 2025-09-02 12:44:51.498 IST [3943504] LOG:  replication slot
> > > synchronization requires logical decoding to be enabled
> > > 2025-09-02 12:44:51.498 IST [3943504] HINT:  To enable logical
> > > decoding on primary, set "wal_level" >= "logical" or create at least
> > > one logical slot when "wal_level" = "replica".
> > > 2025-09-02 12:45:51.537 IST [3943504] LOG:  replication slot
> > > synchronization requires logical decoding to be enabled
> > > 2025-09-02 12:45:51.537 IST [3943504] HINT:  To enable logical
> > > decoding on primary, set "wal_level" >= "logical" or create at least
> > > one logical slot when "wal_level" = "replica".
> > >
> > > So, with HEAD, after we restart the primary server with 'wal_level =
> > > replica', the slot sync worker can restart and connect to the primary
> > > but with patch it cannot start after restart due to the check in
> > > ValidateSlotSyncParams.
> >
> > But the slotsync worker is launched again once logical decoding is
> > enabled, no? I'm not sure that we want to launch the slotsync worker
> > also when we know logical decoding is not enabled.
> >
>
> Why in the first place the logical_decoding enabled check has failed
> because IIUC, the wal_level on standby is still 'logical'?

This is because logical decoding on standbys can be used only when the
standby's effective_wal_level is 'logical', which also means the
primary's effective_wal_level is 'logical' too. This behavior is
mostly the same as today; logical decoding on standbys can be used
only when both the primary and the standbys set wal_level to
'logical'. Even if standby's wal_level is set to logical, it doesn't
mean that incoming WAL records are generated on the primary with the
information required by logical decoding.
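
The rule described here can be sketched as two small functions. This is
an illustration under assumptions, not the patch's code: the type and
function names are hypothetical, the primary's rule is taken from the
HINT text quoted above ("wal_level" >= "logical", or at least one
logical slot when "wal_level" = "replica"), and the standby's own
wal_level setting is deliberately left out because it does not decide
the outcome.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the server's wal_level values. */
typedef enum
{
	LEVEL_MINIMAL,
	LEVEL_REPLICA,
	LEVEL_LOGICAL
} SketchWalLevel;

/*
 * Effective level on the primary: 'logical' when configured so, or when
 * wal_level is 'replica' and at least one logical slot exists.
 */
SketchWalLevel
primary_effective_wal_level(SketchWalLevel wal_level, int n_logical_slots)
{
	if (wal_level == LEVEL_LOGICAL)
		return LEVEL_LOGICAL;
	if (wal_level == LEVEL_REPLICA && n_logical_slots > 0)
		return LEVEL_LOGICAL;
	return wal_level;
}

/*
 * On a standby, logical decoding is usable only if the WAL it receives
 * was generated with an effective level of 'logical' on the primary.
 */
bool
standby_logical_decoding_enabled(SketchWalLevel primary_effective)
{
	return primary_effective == LEVEL_LOGICAL;
}
```

In the reported scenario the primary is restarted with wal_level =
replica and no logical slot, so the first function yields 'replica' and
the second returns false, whatever the standby's own setting is.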

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Amit Kapila
Date:
On Mon, Sep 8, 2025 at 11:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Sep 5, 2025 at 9:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, Sep 6, 2025 at 3:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Tue, Sep 2, 2025 at 5:12 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > > >
> > > >
> > > > I tested the behaviour with HEAD and with Patch. And I confirmed the
> > > > change in behaviour between HEAD and Patch
> > > >
> > > > Suppose we have a primary and a standby with wal_level = logical and
> > > > guc parameters to enable slot sync worker are set accordingly. A slot
> > > > sync worker will be running.
> > > > Now we change the value of wal_level for primary to replica. And
> > > > restart the primary server
> > > >
> > > > With HEAD, during restart the existing sync_slot_worker will exit with:
> > > > 2025-09-02 11:49:08.846 IST [3877882] ERROR:  synchronization worker
> > > > "" could not connect to the primary server: connection to server at
> > > > "localhost" (127.0.0.1), port 5432 failed: Connection refused
> > > > Is the server running on that host and accepting TCP/IP connections?
> > > > 2025-09-02 11:49:11.380 IST [3877885] FATAL:  streaming replication
> > > > receiver "walreceiver" could not connect to the primary server:
> > > > connection to server at "localhost" (127.0.0.1), port 5432 failed:
> > > > Connection refused
> > > > Is the server running on that host and accepting TCP/IP connections?
> > > >
> > > > and after the restart of the primary server, slot sync worker will
> > > > restart and it is able to connect to the primary.
> > > >
> > > > With Patch, during restart the existing sync_slot_worker will exit.
> > > > But after the restart of the primary server, slot sync worker cannot
> > > > start and we can see following log:
> > > > 2025-09-02 12:44:51.497 IST [3947520] LOG:  replication slot
> > > > synchronization worker is shutting down on receiving SIGINT
> > > > 2025-09-02 12:44:51.498 IST [3943504] LOG:  replication slot
> > > > synchronization requires logical decoding to be enabled
> > > > 2025-09-02 12:44:51.498 IST [3943504] HINT:  To enable logical
> > > > decoding on primary, set "wal_level" >= "logical" or create at least
> > > > one logical slot when "wal_level" = "replica".
> > > > 2025-09-02 12:45:51.537 IST [3943504] LOG:  replication slot
> > > > synchronization requires logical decoding to be enabled
> > > > 2025-09-02 12:45:51.537 IST [3943504] HINT:  To enable logical
> > > > decoding on primary, set "wal_level" >= "logical" or create at least
> > > > one logical slot when "wal_level" = "replica".
> > > >
> > > > So, with HEAD, after we restart the primary server with 'wal_level =
> > > > replica', the slot sync worker can restart and connect to the primary
> > > > but with patch it cannot start after restart due to the check in
> > > > ValidateSlotSyncParams.
> > >
> > > But the slotsync worker is launched again once logical decoding is
> > > enabled, no? I'm not sure that we want to launch the slotsync worker
> > > also when we know logical decoding is not enabled.
> > >
> >
> > Why in the first place the logical_decoding enabled check has failed
> > because IIUC, the wal_level on standby is still 'logical'?
>
> This is because logical decoding on standbys can be used only when the
> standby's effective_wal_level is 'logical', which also means the
> primary's effective_wal_level is 'logical' too. This behavior is
> mostly the same as today; logical decoding on standbys can be used
> only when both the primary and the standbys set wal_level to
> 'logical'. Even if standby's wal_level is set to logical, it doesn't
> mean that incoming WAL records are generated on the primary with the
> information required by logical decoding.
>

This is true, but IIUC Shlok's report says that we are able to restart
the server before the patch and not after the patch. Am I missing
something? If not, then shouldn't this be fixed separately first?

--
With Regards,
Amit Kapila.



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Amit Kapila
Date:
On Mon, Sep 8, 2025 at 11:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Sep 5, 2025 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Thu, Sep 4, 2025 at 1:24 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > 2.
> > > > - /*
> > > > - * Invalidate logical slots if we are in hot standby and the primary
> > > > - * does not have a WAL level sufficient for logical decoding. No need
> > > > - * to search for potentially conflicting logically slots if standby is
> > > > - * running with wal_level lower than logical, because in that case, we
> > > > - * would have either disallowed creation of logical slots or
> > > > - * invalidated existing ones.
> > > > - */
> > > > - if (InRecovery && InHotStandby &&
> > > > - xlrec.wal_level < WAL_LEVEL_LOGICAL &&
> > > > - wal_level >= WAL_LEVEL_LOGICAL)
> > > > - InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL,
> > > > -    0, InvalidOid,
> > > > -    InvalidTransactionId);
> > > > -
> > > >   LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
> > > >   ControlFile->MaxConnections = xlrec.MaxConnections;
> > > >   ControlFile->max_worker_processes = xlrec.max_worker_processes;
> > > > @@ -8605,6 +8643,50 @@ xlog_redo(XLogReaderState *record)
> > > >   {
> > > >   /* nothing to do here, just for informational purposes */
> > > >   }
> > > > + else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
> > > > + {
> > > > + bool logical_decoding;
> > > > +
> > > > + /* Update the status on shared memory */
> > > > + memcpy(&logical_decoding, XLogRecGetData(record), sizeof(bool));
> > > > + UpdateLogicalDecodingStatus(logical_decoding, true);
> > > > +
> > > > + if (InRecovery && InHotStandby)
> > > > + {
> > > > + if (!logical_decoding)
> > > >
> > > > Like previously, shouldn't we have a check for standby's wal_level as
> > > > well due to the reasons mentioned in the removed comments?
> > >
> > > IIUC we need to replay the STATUS_CHANGE record when wal_level is set
> > > to 'replica' or 'logical'. If we want to add a check for standby's
> > > wal_level, the check would be "wal_level >= WAL_LEVEL_REPLICA" but it
> > > would be redundant as we already checked "InRecovery && InHotStandby".
> > >
> >
> > If we want to mimic the current implementation, won't
> > effective_wal_level be 'logical' even on standby? Otherwise, there
> > shouldn't be any logical slots which can be invalidated.
>
> Yes, effective_wal_level should be logical on the standby in this
> case. But when replaying STATUS_CHANGE with logical_decoding=false
> (i.e., !logical_decoding), it's obvious that the previous
> effective_wal_level was logical, no?
>

Isn't it possible that when we are replaying STATUS_CHANGE with
logical_decoding=false, the standby already has an effective_wal_level
lower than 'logical'? If so, then in that case, we don't need to
attempt invalidating the slots.

--
With Regards,
Amit Kapila.



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Mon, Sep 8, 2025 at 11:22 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Sep 8, 2025 at 11:22 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Sep 5, 2025 at 9:12 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Sat, Sep 6, 2025 at 3:58 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Tue, Sep 2, 2025 at 5:12 AM Shlok Kyal <shlok.kyal.oss@gmail.com> wrote:
> > > > >
> > > > >
> > > > > I tested the behaviour with HEAD and with Patch. And I confirmed the
> > > > > change in behaviour between HEAD and Patch
> > > > >
> > > > > Suppose we have a primary and a standby with wal_level = logical and
> > > > > guc parameters to enable slot sync worker are set accordingly. A slot
> > > > > sync worker will be running.
> > > > > Now we change the value of wal_level for primary to replica. And
> > > > > restart the primary server
> > > > >
> > > > > With HEAD, during restart the existing sync_slot_worker will exit with:
> > > > > 2025-09-02 11:49:08.846 IST [3877882] ERROR:  synchronization worker
> > > > > "" could not connect to the primary server: connection to server at
> > > > > "localhost" (127.0.0.1), port 5432 failed: Connection refused
> > > > > Is the server running on that host and accepting TCP/IP connections?
> > > > > 2025-09-02 11:49:11.380 IST [3877885] FATAL:  streaming replication
> > > > > receiver "walreceiver" could not connect to the primary server:
> > > > > connection to server at "localhost" (127.0.0.1), port 5432 failed:
> > > > > Connection refused
> > > > > Is the server running on that host and accepting TCP/IP connections?
> > > > >
> > > > > and after the restart of the primary server, slot sync worker will
> > > > > restart and it is able to connect to the primary.
> > > > >
> > > > > With Patch, during restart the existing sync_slot_worker will exit.
> > > > > But after the restart of the primary server, slot sync worker cannot
> > > > > start and we can see following log:
> > > > > 2025-09-02 12:44:51.497 IST [3947520] LOG:  replication slot
> > > > > synchronization worker is shutting down on receiving SIGINT
> > > > > 2025-09-02 12:44:51.498 IST [3943504] LOG:  replication slot
> > > > > synchronization requires logical decoding to be enabled
> > > > > 2025-09-02 12:44:51.498 IST [3943504] HINT:  To enable logical
> > > > > decoding on primary, set "wal_level" >= "logical" or create at least
> > > > > one logical slot when "wal_level" = "replica".
> > > > > 2025-09-02 12:45:51.537 IST [3943504] LOG:  replication slot
> > > > > synchronization requires logical decoding to be enabled
> > > > > 2025-09-02 12:45:51.537 IST [3943504] HINT:  To enable logical
> > > > > decoding on primary, set "wal_level" >= "logical" or create at least
> > > > > one logical slot when "wal_level" = "replica".
> > > > >
> > > > > So, with HEAD, after we restart the primary server with 'wal_level =
> > > > > replica', the slot sync worker can restart and connect to the primary
> > > > > but with patch it cannot start after restart due to the check in
> > > > > ValidateSlotSyncParams.
> > > >
> > > > But the slotsync worker is launched again once logical decoding is
> > > > enabled, no? I'm not sure that we want to launch the slotsync worker
> > > > also when we know logical decoding is not enabled.
> > > >
> > >
> > > Why in the first place the logical_decoding enabled check has failed
> > > because IIUC, the wal_level on standby is still 'logical'?
> >
> > This is because logical decoding on standbys can be used only when the
> > standby's effective_wal_level is 'logical', which also means the
> > primary's effective_wal_level is 'logical' too. This behavior is
> > mostly the same as today; logical decoding on standbys can be used
> > only when both the primary and the standbys set wal_level to
> > 'logical'. Even if standby's wal_level is set to logical, it doesn't
> > mean that incoming WAL records are generated on the primary with the
> > information required by logical decoding.
> >
>
> This is true but IIUC Shlok's report says that we are able to restart
> server before patch and not after patch. Am, I missing something? If
> not, then shouldn't this be fixed separately first?

I've reread his report. IIUC what happened in his test scenario was this:
while he was restarting the primary server (to make
wal_level='replica' take effect), the slotsync worker exited due to a
connection error. Then after the primary started up, with the patch,
the slotsync worker was not launched again, whereas it was launched
again without the patch. This is because with the patch, the standby
disables the logical decoding when replaying the STATUS_CHANGE record.
If the primary enables logical decoding again, the STATUS_CHANGE
record with logical_decoding=true is replicated to the standby and it
launches the slotsync worker again. That is, the slotsync worker
launches based on the standby's effective_wal_level. On the other
hand, before the patch, the slotsync worker is launched solely based
on the standby's wal_level. Therefore, it launches but doesn't do
anything in this case (as the primary should not have any logical
slot). I thought it makes sense that we don't launch the slotsync
worker when effective_wal_level is 'replica', but is your suggestion
that the slotsync worker needs to be launched only when the standby's
wal_level is logical regardless of effective_wal_level?
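
The two launch conditions described above can be sketched as a small
standalone C model. This is only an illustration, not PostgreSQL source:
StandbyModel, primary_decoding_enabled and the three functions are
hypothetical names, and the rule that the standby's effective level is
'logical' only while the primary reports logical decoding as enabled is
an assumed simplification of the behavior discussed in this thread.

```c
#include <stdbool.h>

/* Simplified stand-in for PostgreSQL's WAL level ordering. */
typedef enum
{
	WAL_LEVEL_MINIMAL,
	WAL_LEVEL_REPLICA,
	WAL_LEVEL_LOGICAL
} WalLevel;

/* Hypothetical model of the standby state relevant to the slotsync worker. */
typedef struct StandbyModel
{
	WalLevel	wal_level;					/* standby's own configured GUC */
	bool		primary_decoding_enabled;	/* last status replayed from primary */
} StandbyModel;

/*
 * Assumed simplification: on a hot standby the effective level is 'logical'
 * only while the primary reports logical decoding as enabled.
 */
static WalLevel
standby_effective_wal_level(const StandbyModel *s)
{
	return s->primary_decoding_enabled ? WAL_LEVEL_LOGICAL : WAL_LEVEL_REPLICA;
}

/* HEAD, as described above: only the standby's own wal_level is consulted. */
static bool
slotsync_can_launch_head(const StandbyModel *s)
{
	return s->wal_level >= WAL_LEVEL_LOGICAL;
}

/* With the patch, as described above: the effective level decides. */
static bool
slotsync_can_launch_patched(const StandbyModel *s)
{
	return standby_effective_wal_level(s) >= WAL_LEVEL_LOGICAL;
}
```

In Shlok's scenario (standby wal_level = logical, primary restarted with
wal_level = replica), the HEAD condition still holds while the patched
condition does not, until the primary enables logical decoding again.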

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Mon, Sep 8, 2025 at 11:34 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Sep 8, 2025 at 11:14 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Sep 5, 2025 at 8:50 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Thu, Sep 4, 2025 at 1:24 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > > 2.
> > > > > - /*
> > > > > - * Invalidate logical slots if we are in hot standby and the primary
> > > > > - * does not have a WAL level sufficient for logical decoding. No need
> > > > > - * to search for potentially conflicting logically slots if standby is
> > > > > - * running with wal_level lower than logical, because in that case, we
> > > > > - * would have either disallowed creation of logical slots or
> > > > > - * invalidated existing ones.
> > > > > - */
> > > > > - if (InRecovery && InHotStandby &&
> > > > > - xlrec.wal_level < WAL_LEVEL_LOGICAL &&
> > > > > - wal_level >= WAL_LEVEL_LOGICAL)
> > > > > - InvalidateObsoleteReplicationSlots(RS_INVAL_WAL_LEVEL,
> > > > > -    0, InvalidOid,
> > > > > -    InvalidTransactionId);
> > > > > -
> > > > >   LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
> > > > >   ControlFile->MaxConnections = xlrec.MaxConnections;
> > > > >   ControlFile->max_worker_processes = xlrec.max_worker_processes;
> > > > > @@ -8605,6 +8643,50 @@ xlog_redo(XLogReaderState *record)
> > > > >   {
> > > > >   /* nothing to do here, just for informational purposes */
> > > > >   }
> > > > > + else if (info == XLOG_LOGICAL_DECODING_STATUS_CHANGE)
> > > > > + {
> > > > > + bool logical_decoding;
> > > > > +
> > > > > + /* Update the status on shared memory */
> > > > > + memcpy(&logical_decoding, XLogRecGetData(record), sizeof(bool));
> > > > > + UpdateLogicalDecodingStatus(logical_decoding, true);
> > > > > +
> > > > > + if (InRecovery && InHotStandby)
> > > > > + {
> > > > > + if (!logical_decoding)
> > > > >
> > > > > Like previously, shouldn't we have a check for standby's wal_level as
> > > > > well due to the reasons mentioned in the removed comments?
> > > >
> > > > IIUC we need to replay the STATUS_CHANGE record when wal_level is set
> > > > to 'replica' or 'logical'. If we want to add a check for standby's
> > > > wal_level, the check would be "wal_level >= WAL_LEVEL_REPLICA" but it
> > > > would be redundant as we already checked "InRecovery && InHotStandby".
> > > >
> > >
> > > If we want to mimic the current implementation, won't
> > > effective_wal_level be 'logical' even on standby? Otherwise, there
> > > shouldn't be any logical slots which can be invalidated.
> >
> > Yes, effective_wal_level should be logical on the standby in this
> > case. But when replaying STATUS_CHANGE with logical_decoding=false
> > (i.e., !logical_decoding), it's obvious that the previous
> > effective_wal_level was logical, no?
> >
>
> Isn't it possible that when we are replaying STATUS_CHANGE with
> logical_decoding=false, the standby already has effective_wal_level
> lesser than 'logical'? If so, then in that case, we don't need to
> attempt invalidating the slots.

Since we write the STATUS_CHANGE with logical_decoding=false only when
logical decoding is enabled, I've not seen the case. But it's harmless
to check if logical decoding was enabled before trying to invalidate
logical slots when replaying STATUS_CHANGE with
logical_decoding=false. So I'll add it.
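
A minimal sketch of that extra check, written as a standalone C model
rather than the actual xlog_redo() change. ReplayModel,
invalidation_passes and replay_status_change() are hypothetical names
used only for illustration; the counter stands in for the call that
invalidates logical slots.

```c
#include <stdbool.h>

/* Hypothetical model of the standby-side state touched during replay. */
typedef struct ReplayModel
{
	bool	logical_decoding_enabled;	/* status before/after the record */
	int		invalidation_passes;		/* how often slot invalidation ran */
} ReplayModel;

/*
 * Model of replaying a STATUS_CHANGE record. Invalidation of logical slots
 * is attempted only when decoding is being turned off and it was actually
 * on beforehand.
 */
static void
replay_status_change(ReplayModel *m, bool logical_decoding, bool in_hot_standby)
{
	bool	was_enabled = m->logical_decoding_enabled;

	/* Always record the new status first. */
	m->logical_decoding_enabled = logical_decoding;

	if (in_hot_standby && !logical_decoding && was_enabled)
		m->invalidation_passes++;	/* stand-in for invalidating the slots */
}
```

With this shape, replaying a second logical_decoding=false record while
decoding is already disabled does not trigger another invalidation pass.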

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Amit Kapila
Date:
On Wed, Sep 10, 2025 at 12:15 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've reread his report. IIUC what happened in his test scenario was;
> while he was restarting the primary server (to make
> wal_level='replica' effect), the slotsync worker exited due to a
> connection error. Then after the primary started up, with the patch,
> the slotsync worker was not launched again, whereas it was launched
> again without the patch. This is because with the patch, the standby
> disables the logical decoding when replaying the STATUS_CHANGE record.
> If the primary enables logical decoding again, the STATUS_CHANGE
> record with logical_decoding=true is replicated to the standby and it
> launches the slotsync worker again. That is, the slotsync worker
> launches based on the standby's effective_wal_level. On the other
> hand, before the patch, the slotsync worker is launched solely based
> on the standby's wal_level. Therefore, it launches but doesn't do
> anything in this case (as the primary should not have any logical
> slot). I thought it makes sense that we don't launch the slotsync
> worker when effective_wal_level is 'replica', but is your suggestion
> that the slotsync worker needs to be launched only when the standby's
> wal_level is logical regardless of effective_wal_level?
>

No, the patch's behavior is good. I was wondering whether we could change
it in HEAD as well, so that the patch's behavior matches HEAD.
But I think that may not be straightforward, because the standby may
not have that information readily available. Anyway, if there is no
simple way to change HEAD's behavior, we can leave it as it is.

--
With Regards,
Amit Kapila.



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Amit Kapila
Date:
On Sat, Sep 6, 2025 at 3:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> I've attached the updated patch that incorporated all comments I got so far.
>

*
+ /*
+ * While all processes are using the new status, there could be some
+ * transactions that might have started with the old status. So wait
+ * for the running transactions to complete so that logical decoding
+ * doesn't include transactions that wrote WAL with insufficient
+ * information.
+ */
+ running = GetRunningTransactionData();
+ LWLockRelease(ProcArrayLock);
+ LWLockRelease(XidGenLock);
+
+ elog(DEBUG1, "waiting for %d transactions to complete", running->xcnt);
+
+ for (int i = 0; i < running->xcnt; i++)
+ {
+ TransactionId xid = running->xids[i];
+
+ if (TransactionIdIsCurrentTransactionId(xid))
+ continue;
+
+ XactLockTableWait(xid, NULL, NULL, XLTW_None);
+ }

When building a snapshot during the start of logical decoding, we
anyway wait for running transactions to finish via the snapbuild
machinery. So, why do we need it here? And if it is needed, can we
update the comments to explain why it is required in spite of the
snapbuild machinery doing a similar thing?

* Is it a good idea to enable/disable decoding for temporary logical
slots? Temporary slots are released during ERROR or at session
end; is that a good time to do the disable processing, which even
requires WAL writing?

--
With Regards,
Amit Kapila.



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Wed, Sep 10, 2025 at 11:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Sat, Sep 6, 2025 at 3:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > I've attached the updated patch that incorporated all comments I got so far.
> >
>
> *
> + /*
> + * While all processes are using the new status, there could be some
> + * transactions that might have started with the old status. So wait
> + * for the running transactions to complete so that logical decoding
> + * doesn't include transactions that wrote WAL with insufficient
> + * information.
> + */
> + running = GetRunningTransactionData();
> + LWLockRelease(ProcArrayLock);
> + LWLockRelease(XidGenLock);
> +
> + elog(DEBUG1, "waiting for %d transactions to complete", running->xcnt);
> +
> + for (int i = 0; i < running->xcnt; i++)
> + {
> + TransactionId xid = running->xids[i];
> +
> + if (TransactionIdIsCurrentTransactionId(xid))
> + continue;
> +
> + XactLockTableWait(xid, NULL, NULL, XLTW_None);
> + }
>
> When building a snapshot during the start of logical decoding, we
> anyway wait for running transactions to finish via the snapbuild
> machinery. So, why do we need it here? And if it is needed, can we
> update the comments to explain why it is required in spite of
> snapbuild machinery doing similar thing?

Fair point. I don't see any reason we need to wait here. Will remove this step.

> * Is it a good idea to enable/disable decoding for temporary logical
> slots? The temporary slots are released during ERROR or at session
> end, is that a good time to do the disable processing that even
> requires WAL writing.

I think the same is true for slots in the RS_EPHEMERAL state. Since it
could confuse users if the automatic effective_wal_level change were
supported only for non-temporary slots, I personally would prefer not to
set temporary slots aside. I agree that process shutdown might not be a
good time to do the disable processing: in addition to writing a WAL
record, it also requires waiting for concurrent state changes while
holding off all interrupts, which could easily lead to deadlocks. It
might be worth considering doing the disable processing lazily. For
example, another process (like the checkpointer) could periodically
check the logical decoding status and disable it if necessary.
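
A rough sketch of what such lazy disabling could look like, as a
standalone C model and not part of the patch. It assumes wal_level =
'replica' throughout. DecodingModel, disable_requested and all three
functions are hypothetical names; status_records merely counts where a
STATUS_CHANGE WAL record would be written.

```c
#include <stdbool.h>

/* Hypothetical model of the shared logical decoding state. */
typedef struct DecodingModel
{
	int		n_logical_slots;
	bool	logical_decoding_enabled;
	bool	disable_requested;	/* set cheaply, acted on later */
	int		status_records;		/* stand-in for STATUS_CHANGE WAL records */
} DecodingModel;

/* Creating the first logical slot enables logical decoding. */
static void
create_logical_slot(DecodingModel *m)
{
	m->n_logical_slots++;
	if (!m->logical_decoding_enabled)
	{
		m->logical_decoding_enabled = true;
		m->status_records++;
	}
}

/* Error or session-end path: no WAL write, no waiting; only leave a request. */
static void
release_slot_in_cleanup(DecodingModel *m)
{
	if (m->n_logical_slots > 0)
		m->n_logical_slots--;
	if (m->n_logical_slots == 0)
		m->disable_requested = true;
}

/* Called periodically by a background process; returns true if it disabled. */
static bool
maybe_disable_lazily(DecodingModel *m)
{
	if (!m->disable_requested)
		return false;
	m->disable_requested = false;

	/* Re-check: a slot may have been created since the request was made. */
	if (m->n_logical_slots > 0 || !m->logical_decoding_enabled)
		return false;

	m->logical_decoding_enabled = false;
	m->status_records++;
	return true;
}
```

The cleanup path only sets a flag, so the WAL write and any waiting
happen in the periodic check, outside ERROR or shutdown processing.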

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Thu, Sep 11, 2025 at 10:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Sep 10, 2025 at 11:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, Sep 6, 2025 at 3:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I've attached the updated patch that incorporated all comments I got so far.
> > >
> >
> > *
> > + /*
> > + * While all processes are using the new status, there could be some
> > + * transactions that might have started with the old status. So wait
> > + * for the running transactions to complete so that logical decoding
> > + * doesn't include transactions that wrote WAL with insufficient
> > + * information.
> > + */
> > + running = GetRunningTransactionData();
> > + LWLockRelease(ProcArrayLock);
> > + LWLockRelease(XidGenLock);
> > +
> > + elog(DEBUG1, "waiting for %d transactions to complete", running->xcnt);
> > +
> > + for (int i = 0; i < running->xcnt; i++)
> > + {
> > + TransactionId xid = running->xids[i];
> > +
> > + if (TransactionIdIsCurrentTransactionId(xid))
> > + continue;
> > +
> > + XactLockTableWait(xid, NULL, NULL, XLTW_None);
> > + }
> >
> > When building a snapshot during the start of logical decoding, we
> > anyway wait for running transactions to finish via the snapbuild
> > machinery. So, why do we need it here? And if it is needed, can we
> > update the comments to explain why it is required in spite of
> > snapbuild machinery doing similar thing?
>
> Fair point. I don't see any reason we need to wait here. Will remove this step.
>
> > * Is it a good idea to enable/disable decoding for temporary logical
> > slots? The temporary slots are released during ERROR or at session
> > end, is that a good time to do the disable processing that even
> > requires WAL writing.
>
> I think the same is true for slots with RS_EPEMERAL state.

Just to be clear, I meant a case where one logical slot already exists,
another logical slot is newly created in the RS_EPHEMERAL state, and the
existing slot is dropped before the new slot is removed due to an
error. In this case, the ephemeral slot is the last logical
replication slot to be dropped.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Amit Kapila
Date:
On Thu, Sep 11, 2025 at 11:16 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Sep 10, 2025 at 11:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Sat, Sep 6, 2025 at 3:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > I've attached the updated patch that incorporated all comments I got so far.
> > >
> >
> > *
> > + /*
> > + * While all processes are using the new status, there could be some
> > + * transactions that might have started with the old status. So wait
> > + * for the running transactions to complete so that logical decoding
> > + * doesn't include transactions that wrote WAL with insufficient
> > + * information.
> > + */
> > + running = GetRunningTransactionData();
> > + LWLockRelease(ProcArrayLock);
> > + LWLockRelease(XidGenLock);
> > +
> > + elog(DEBUG1, "waiting for %d transactions to complete", running->xcnt);
> > +
> > + for (int i = 0; i < running->xcnt; i++)
> > + {
> > + TransactionId xid = running->xids[i];
> > +
> > + if (TransactionIdIsCurrentTransactionId(xid))
> > + continue;
> > +
> > + XactLockTableWait(xid, NULL, NULL, XLTW_None);
> > + }
> >
> > When building a snapshot during the start of logical decoding, we
> > anyway wait for running transactions to finish via the snapbuild
> > machinery. So, why do we need it here? And if it is needed, can we
> > update the comments to explain why it is required in spite of
> > snapbuild machinery doing similar thing?
>
> Fair point. I don't see any reason we need to wait here. Will remove this step.
>

We can add a comment there explaining why we don't wait for
in-progress transactions. This will also be important if we miss
anything and later need to handle it similarly.

One related point that needs discussion: after this change, it is
possible that only part of a transaction contains the additional
logical_wal_info. I couldn't think of a problem due to this, but
users of pg_waldump or other WAL reading utilities could
question it. One possibility is to always start including
logical_wal_info only from the next new transaction, but I'm not sure
that is required. It would be good if other people involved in the
discussion or otherwise could share their opinion on this point.

> > * Is it a good idea to enable/disable decoding for temporary logical
> > slots? The temporary slots are released during ERROR or at session
> > end, is that a good time to do the disable processing that even
> > requires WAL writing.
>
> I think the same is true for slots with RS_EPEMERAL state. Since it
> could confuse users if automatic effective_wal_level change is
> supported only for non-temporary slots, I personally would like not to
> push aside temporary slots. I agree that it might not be a good time
> to disable processing during process shutdown time; in addition to
> requiring WAL record, it also requires waits for concurrent state
> change processings while it holds all interrupts, which could easily
> involve dead-locks.
>

Yes, all such processing during ERROR and shutdown sounds scary and a
source for problems.

> It might be worth considering doing the disable
> process in a lazy way. For example, other processes (like
> checkpointer) periodically checks the logical decoding status and
> disables it if necessary.
>

Yeah, doing it lazily sounds reasonable to me. We need to do it lazily
only for ERROR cases; during a normal drop_slot, it may be okay.
But OTOH, while dropping the slot as part of a subscription drop, it
could be risky: if for any reason the disabling took longer, the
subscription drop would look like a hang or, in the worst case, the
connection could time out.

For the shutdown sequence, can't we think of resetting
effective_wal_level after a restart?

--
With Regards,
Amit Kapila.



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Ashutosh Bapat
Date:
On Fri, Sep 12, 2025 at 9:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>
> One thing related to this which needs a discussion is after this
> change, it is possible that part of the transaction contains
> additional logical_wal_info. I couldn't think of a problem due to this
> but users using pg_waldump or other WAL reading utilities could
> question this. One possibility is that we always start including
> logical_wal_info for the next new transaction but not sure if that is
> required. It would be good if other people involved in the discussion
> or otherwise could share their opinion on this point.
>

AFAIR, logical info is a separate section in a WAL record, and there
is no marker which says "WAL will contain logical info henceforth".
So the utilities should be checking for the existence of such info
before reading it. So I guess it should be ok. Some extra sensitive
utilities may expect that once a WAL record has logical info, all the
succeeding WAL records will have it. They may find it troublesome that
WAL records with and without logical info are interleaved. Generally,
I would prefer that presence/absence of logical info changes at
transaction boundaries, but we will still have interleaving WAL
records. So I doubt how much that matters.
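
To make the per-record check concrete, here is a minimal sketch of a WAL
reading utility's view, as a standalone C model. It does not use the
real WAL record format: WalRecordModel and REC_HAS_LOGICAL_INFO are
hypothetical, and the single flag bit stands in for whatever marks the
presence of logical info in a record.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit: the record carries logical decoding info. */
#define REC_HAS_LOGICAL_INFO 0x01u

/* Hypothetical, heavily simplified WAL record. */
typedef struct WalRecordModel
{
	unsigned int	xid;
	unsigned int	flags;
} WalRecordModel;

/* A reader checks each record individually instead of assuming a mode. */
static bool
record_has_logical_info(const WalRecordModel *rec)
{
	return (rec->flags & REC_HAS_LOGICAL_INFO) != 0;
}

/*
 * Returns true if the given transaction has records both with and without
 * logical info, i.e. the status changed in the middle of the transaction.
 */
static bool
xact_has_mixed_records(const WalRecordModel *recs, size_t nrecs,
					   unsigned int xid)
{
	bool	seen_with = false;
	bool	seen_without = false;

	for (size_t i = 0; i < nrecs; i++)
	{
		if (recs[i].xid != xid)
			continue;
		if (record_has_logical_info(&recs[i]))
			seen_with = true;
		else
			seen_without = true;
	}
	return seen_with && seen_without;
}
```

A utility written this way tolerates interleaved records; only one that
assumes a stream-wide mode would be affected by a mid-transaction change.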

Sorry for jumping late in the discussion. I have a few comments,
mostly superficial ones. I am yet to take a deeper look at the
synchronization logic.

<sect2 id="logicaldecoding-replication-slots">
@@ -328,8 +362,7 @@ postgres=# select * from
pg_logical_slot_get_changes('regression_slot', NULL, NU
that could be needed by the logical decoding on the standby (as it does
not know about the <literal>catalog_xmin</literal> on the standby).
Existing logical slots on standby also get invalidated if
- <varname>wal_level</varname> on the primary is reduced to less than
- <literal>logical</literal>.
+ logical decoding becomes disabled on the primary.

s/becomes disabled/is disabled/ or /gets disabled/. Given that logical
decoding can be disabled in multiple ways, it's better to add a
reference here to a section which explains what disabling logical
decoding means.

<listitem>
<para>
<literal>wal_level_insufficient</literal> means that the
- primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
- perform logical decoding. It is set only for logical slots.
+ logical decoding is disabled on primary due to insufficient
+ <xref linkend="guc-wal-level"/> or no logical slots. It is set only
+ for logical slots.

It may not be apparent to the users that insufficient wal_level means
'minimal' here. It will be better if we just mention logical decoding
is disabled on primary and refer to a section which explains what
disabling logical decoding means.

*
* Skip this if we're taking a full-page image of the new page, as we
* don't include the new tuple in the WAL record in that case. Also
- * disable if wal_level='logical', as logical decoding needs to be able to
- * read the new tuple in whole from the WAL record alone.
+ * disable if logical decoding is enabled, as logical decoding needs to be
+ * able to read the new tuple in whole from the WAL record alone.
*/

Not the fault of this patch, but I find the comment to be slightly out of
sync with the code. The code actually checks whether the logical
decoding is enabled and whether the relation requires WAL to be logged
for logical decoding. The difference is subtle, but it did cause me a
bit of confusion when I read the code. Please consider rephrasing the
comment while you are modifying it.
if (oldbuf == newbuf && !need_tuple_data &&
!XLogCheckBufferNeedsBackup(newbuf))
@@ -9057,8 +9057,8 @@ log_heap_update(Relation reln, Buffer oldbuf,
/*
* Perform XLogInsert of an XLOG_HEAP2_NEW_CID record
*
- * This is only used in wal_level >= WAL_LEVEL_LOGICAL, and only for catalog
- * tuples.
+ * This is only used when effective WAL level is logical, and only for

Given that the earlier comment used GUC name, it's better to use the
GUC name "effective_wal_level" here.


case XLOG_PARAMETER_CHANGE:
+
+ /*
+ * Even if wal_level on the primary got decreased to 'replica' it
+ * doesn't necessarily mean to disable the logical decoding as
+ * long as we have at least one logical slot. So we don't check
+ * the logical decoding availability here but do in
+ * XLOG_LOGICAL_DECODING_STATUS_CHANGE case.
+ */

The earlier code checked for wal_level < WAL_LEVEL_LOGICAL, which
includes the case when wal_level is 'minimal'. I don't see the new
code handling the 'minimal' case here. Am I missing something? Do we need
a comment here which specifically mentions the "minimal" case?

grammar "Even if wal_level on the primary was lowered to 'replica', as
long as there is at least one logical slot, the logical decoding
remains enabled. ... " also ... do so in ... .

+ * The module maintains separate controls of two aspects: writing information
+ * required by logical decoding to WAL records and utilizing logical decoding
+ * itself, controlled by LogicalDecodingCtl->xlog_logical_info and
+ * ->logical_decoding_enabled fields respectively. The activation process of
+ * logical decoding involves several steps, beginning with maintaining logical
+ * decoding in a disabled state while incrementing the effective WAL level to
+ * its 'logical' equivalent. This change is reflected in the read-only
+ * effective_wal_level GUC parameter. The process includes necessary
+ * synchronization to ensure all processes adapt to the new effective WAL
+ * level before logical decoding is fully enabled. Deactivation follows a
+ * similarly careful, multi-step process in the reverse order.
+ *

I was expecting that the step-by-step process would be described in a
README. But given that we have this detailed comment here, it may be
good to have the step-by-step process described here itself.

I see some comments use "effective_wal_level is logical" and some
mention "logical decoding is enabled" depending upon the context. I
think it's important to differentiate between these two, given that
it's possible to find the system in a state where
effective_wal_level is 'logical' but logical decoding is disabled. But
as a first-time reader, this did cause me some confusion. The above
paragraph is a good place to make that distinction clear. The
description of the step-by-step process will make things clearer.
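
To illustrate the distinction, here is a minimal, self-contained C sketch of the ordering that the quoted comment describes. It is not the patch's code: the struct and function names are hypothetical stand-ins, and the synchronization step is only a placeholder that checks the intermediate state. It assumes deactivation simply reverses the activation order, as the comment states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the patch's LogicalDecodingCtl shared state. */
typedef struct DecodingCtlSketch
{
    bool xlog_logical_info;        /* WAL records carry logical info */
    bool logical_decoding_enabled; /* logical decoding itself may be used */
} DecodingCtlSketch;

/* Decoding must never be enabled while WAL lacks logical info. */
bool
state_is_valid_sketch(const DecodingCtlSketch *ctl)
{
    return !ctl->logical_decoding_enabled || ctl->xlog_logical_info;
}

/*
 * Placeholder for the synchronization step. A real server would wait here
 * until every process has adopted the new state; this sketch only checks
 * that the intermediate state is a legal one.
 */
void
wait_for_all_processes_sketch(const DecodingCtlSketch *ctl)
{
    assert(state_is_valid_sketch(ctl));
}

void
activate_sketch(DecodingCtlSketch *ctl)
{
    /* Step 1: raise the effective WAL level; decoding stays disabled. */
    ctl->xlog_logical_info = true;
    /* Step 2: let all processes adapt to the new effective WAL level. */
    wait_for_all_processes_sketch(ctl);
    /* Step 3: only now allow logical decoding. */
    ctl->logical_decoding_enabled = true;
}

void
deactivate_sketch(DecodingCtlSketch *ctl)
{
    /* Reverse order: stop decoding first, then stop writing logical info. */
    ctl->logical_decoding_enabled = false;
    wait_for_all_processes_sketch(ctl);
    ctl->xlog_logical_info = false;
}
```

Between steps 1 and 3 the sketch is in exactly the state mentioned above: the effective WAL level is 'logical' while logical decoding is still disabled.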

+ /* cannot change while ReplicationSlotCtlLock is held */
+ if (!s->in_use)
+ continue;
+
+ /* NB: intentionally counting invalidated slots */

Explain the reason in the comment.

+
+# Cleanup all existing slots and start the concurrency test.
+$primary->safe_psql('postgres',
+ qq[select pg_drop_replication_slot('test_slot')]);

We should add effective wal level test. At a later point we have a
test for this but it would be good to make sure that the effective wal
level is replica at this stage in the test as well.

test_wal_level($primary, "replica|replica", "effective_wal_level reset
to 'replica' after dropping all logical slots");

+
+# Wait for the logical slot 'test_slot' has been created.

Should be "Wait for the logical slot 'test_slot' to be created" or
"Check that the logical slot 'test_slot' has been created".

+# Check if the standby's effective_wal_level should be 'logical' in spite
+# of wal_level being 'replica'.
+test_wal_level($standby1, "replica|logical",
+ "effective_wal_level='logical' on standby");

Do we have a test to verify that a logical replication slot can not be
created on a standby whose primary does *not* have effective_wal_level
'logical'?

+
+# Create a logical slot on the standby, which should be succeeded

grammar: ..., which should succeed OR better "Creating a logical slot
on standby should succeed".

+# as the primary enables it.

as the primary has logical decoding enabled.

+# Check if the logical decoding is not enabled on the standby4.
+test_wal_level($standby4, "logical|replica",
+ "standby's effective_wal_level got decreased to 'replica'");
+$standby4->safe_psql('postgres',
+ qq[select pg_drop_replication_slot('standby4_slot')]);
+

Instead of dropping the slot, if we create another slot on the
primary, what happens to the invalidated replication slot? Does it
remain invalidated? Have we covered this scenario in tests?

+
+done_testing();

What happens if we create a logical slot when wal_level is 'minimal'?
Do we have a test for that?

--
Best Wishes,
Ashutosh Bapat



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Thu, Sep 11, 2025 at 9:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Thu, Sep 11, 2025 at 11:16 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Sep 10, 2025 at 11:32 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Sat, Sep 6, 2025 at 3:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > I've attached the updated patch that incorporated all comments I got so far.
> > > >
> > >
> > > *
> > > + /*
> > > + * While all processes are using the new status, there could be some
> > > + * transactions that might have started with the old status. So wait
> > > + * for the running transactions to complete so that logical decoding
> > > + * doesn't include transactions that wrote WAL with insufficient
> > > + * information.
> > > + */
> > > + running = GetRunningTransactionData();
> > > + LWLockRelease(ProcArrayLock);
> > > + LWLockRelease(XidGenLock);
> > > +
> > > + elog(DEBUG1, "waiting for %d transactions to complete", running->xcnt);
> > > +
> > > + for (int i = 0; i < running->xcnt; i++)
> > > + {
> > > + TransactionId xid = running->xids[i];
> > > +
> > > + if (TransactionIdIsCurrentTransactionId(xid))
> > > + continue;
> > > +
> > > + XactLockTableWait(xid, NULL, NULL, XLTW_None);
> > > + }
> > >
> > > When building a snapshot during the start of logical decoding, we
> > > anyway wait for running transactions to finish via the snapbuild
> > > machinery. So, why do we need it here? And if it is needed, can we
> > > update the comments to explain why it is required in spite of
> > > snapbuild machinery doing similar thing?
> >
> > Fair point. I don't see any reason we need to wait here. Will remove this step.
> >
>
> We can add a comment there explaining why we don't wait for
> in-progress transactions. This will also be important if we miss
> anything and later need to handle it similarly.

Yes, I'll add comments.

>
> One thing related to this which needs a discussion is after this
> change, it is possible that part of the transaction contains
> additional logical_wal_info. I couldn't think of a problem due to this
> but users using pg_waldump or other WAL reading utilities could
> question this. One possibility is that we always start including
> logical_wal_info for the next new transaction but not sure if that is
> required. It would be good if other people involved in the discussion
> or otherwise could share their opinion on this point.

I believe it's safe to write logical information to WAL records even
when not strictly required, and it won't be a problem in practice.
FYI a similar thing is true for full page writes; full page writes
could be included in WAL records even when not strictly required (see
UpdateFullPageWrites() for details).
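
As a rough illustration of why interleaving should be harmless for a careful reader, the following C sketch checks each record on its own instead of assuming that all records of a transaction look alike. The record layout and the flag name are hypothetical and are not PostgreSQL's actual WAL format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-record flag; not PostgreSQL's real record layout. */
#define REC_HAS_LOGICAL_INFO_SKETCH 0x01u

typedef struct WalRecordSketch
{
    unsigned int flags; /* per-record flag bits */
    int          xid;   /* transaction that wrote the record */
} WalRecordSketch;

/* Each record is inspected independently of its neighbours. */
bool
record_has_logical_info_sketch(const WalRecordSketch *rec)
{
    return (rec->flags & REC_HAS_LOGICAL_INFO_SKETCH) != 0;
}

/* Count records carrying logical info, tolerating a mix within one xid. */
size_t
count_logical_records_sketch(const WalRecordSketch *recs, size_t n)
{
    size_t count = 0;

    assert(recs != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
    {
        if (record_has_logical_info_sketch(&recs[i]))
            count++;
    }
    return count;
}
```

A transaction whose first record lacks the flag and whose later records carry it is handled the same way as any other sequence of records.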

>
> > > * Is it a good idea to enable/disable decoding for temporary logical
> > > slots? The temporary slots are released during ERROR or at session
> > > end, is that a good time to do the disable processing that even
> > > requires WAL writing.
> >
> > I think the same is true for slots with RS_EPHEMERAL state. Since it
> > could confuse users if automatic effective_wal_level change is
> > supported only for non-temporary slots, I personally would like not to
> > push aside temporary slots. I agree that it might not be a good time
> > to disable processing during process shutdown time; in addition to
> > requiring WAL record, it also requires waits for concurrent state
> > change processings while it holds all interrupts, which could easily
> > involve dead-locks.
> >
>
> Yes, all such processing during ERROR and shutdown sounds scary and a
> source for problems.
>
> > It might be worth considering doing the disable
> > process in a lazy way. For example, other processes (like
> > checkpointer) periodically checks the logical decoding status and
> > disables it if necessary.
> >
>
> Yeah, doing lazily sounds reasonable to me. We need to do lazily only
> for ERROR cases, otherwise, during a normal drop_slot, it may be okay.
> But OTOH, while dropping the slot as a part of subscription drop, it
> could be risky because if, for any reason, the disabling took more
> time, the subscription drop operation would look like a hang or, in the
> worst case, the connection could time out.

True. I thought that disabling logical decoding in a synchronous way
is more preferable for users since it's guaranteed effective_wal_level
gets decreased to 'replica' when drop-slot completes. However, one
hypothesis is that users would not be interested in whether
effective_wal_level is 'replica' or 'logical' but in being able to
create logical slots even when wal_level is set to 'replica'. That is,
if we use the lazy disabling approach for all cases, users would have
to wait for effective_wal_level to be decreased to 'replica' if they
want to check. But if users don't check that often in practice, the
lazy approach would be a better way.

> For the shutdown sequence, can't we think of resetting effective_wal
> after a restart?

Does it mean that effective_wal_level keeps 'logical' until the next
server starts?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Amit Kapila
Date:
On Fri, Sep 12, 2025 at 11:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Sep 11, 2025 at 9:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
>
> > For the shutdown sequence, can't we think of resetting effective_wal
> > after a restart?
>
> Does it mean that effective_wal_level keeps 'logical' until the next
> server starts?
>

Yes, IIUC, effective_wal_level is anyway a derived value based on
current wal_level and presence of logical slots. So, what will be the
impact if it is not accurate at shutdown?
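
A minimal C sketch of that derivation, under stated assumptions: the enum and function names are hypothetical, 'minimal' is never raised, and the standby case (where the value follows the primary) is ignored.

```c
#include <assert.h>

/* Hypothetical mirror of the three wal_level settings. */
typedef enum WalLevelSketch
{
    WAL_LEVEL_MINIMAL_SKETCH,
    WAL_LEVEL_REPLICA_SKETCH,
    WAL_LEVEL_LOGICAL_SKETCH
} WalLevelSketch;

/*
 * Derive the effective level from the configured wal_level and the number
 * of existing logical slots. Assumption: only 'replica' is raised to
 * 'logical'; 'minimal' and 'logical' are returned unchanged.
 */
WalLevelSketch
derive_effective_wal_level_sketch(WalLevelSketch wal_level, int n_logical_slots)
{
    assert(n_logical_slots >= 0);

    if (wal_level == WAL_LEVEL_REPLICA_SKETCH && n_logical_slots > 0)
        return WAL_LEVEL_LOGICAL_SKETCH;

    return wal_level;
}
```

With lazy disabling, the reported value could lag behind this derivation for a while after the last slot goes away, which is the inaccuracy in question.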

--
With Regards,
Amit Kapila.



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Sun, Sep 14, 2025 at 7:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Sep 12, 2025 at 11:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Sep 11, 2025 at 9:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> >
> > > For the shutdown sequence, can't we think of resetting effective_wal
> > > after a restart?
> >
> > Does it mean that effective_wal_level keeps 'logical' until the next
> > server starts?
> >
>
> Yes, IIUC, effective_wal_level is anyway a derived value based on
> current wal_level and presence of logical slots. So, what will be the
> impact if it is not accurate at shutdown?

I think there won't be an impact at shutdown time. I would rather be
concerned that such behavior could confuse users. I think it would not
be a rare situation where users enable and disable logical decoding by
creating and dropping a temporary slot. If we keep effective_wal_level
'logical' in this case, users would want to somehow disable logical
decoding as it could have a negative performance impact. There would
be two ways for users to change it to 'replica': restart the server or
create and drop a logical slot again. On the other hand, for users who
dropped a non-temporary logical slot without an error or dropped the
non-last temporary slot, logical decoding is disabled without other
manual interventions. It could be pretty hard to assess the situation,
resulting in having users always checking effective_wal_level after
dropping a logical slot and doing extra steps to make the
effective_wal_level 'replica'.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Amit Kapila
Date:
On Mon, Sep 15, 2025 at 10:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Sun, Sep 14, 2025 at 7:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Fri, Sep 12, 2025 at 11:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Sep 11, 2025 at 9:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > >
> > > > For the shutdown sequence, can't we think of resetting effective_wal
> > > > after a restart?
> > >
> > > Does it mean that effective_wal_level keeps 'logical' until the next
> > > server starts?
> > >
> >
> > Yes, IIUC, effective_wal_level is anyway a derived value based on
> > current wal_level and presence of logical slots. So, what will be the
> > impact if it is not accurate at shutdown?
>
> I think there won't be an impact at shutdown time. I would rather be
> concerned that such behavior could confuse users. I think it would not
> be a rare situation where users enable and disable logical decoding by
> creating and dropping a temporary slot. If we keep effective_wal_level
> 'logical' in this case, users would want to somehow disable logical
> decoding as it could have a negative performance impact.
>

When a user is dropping a temporary slot, we should disable the
decoding. The lazy behaviour should be for ERROR or session_exit
cases.

> There would
> be two ways for users to change it to 'replica': restart the server or
> create and drop a logical slot again.
>

If we do the lazy work during the checkpoint then they can perform the
checkpoint command.

> On the other hand, for users who
> dropped a non-temporary logical slot without an error or dropped the
> non-last temporary slot, logical decoding is disabled without other
> manual interventions. It could be pretty hard to assess the situation,
> resulting in having users always checking effective_wal_level after
> dropping a logical slot and doing extra steps to make the
> effective_wal_level 'replica'.
>

When the last slot is dropped, anyway, users won't be able to perform
any decoding. Do you mean that they want to know whether logical_wal
is still being recorded? If so, then checking effective_wal_level
would be the way.

--
With Regards,
Amit Kapila.



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Tue, Sep 16, 2025 at 1:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Sep 15, 2025 at 10:15 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Sun, Sep 14, 2025 at 7:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > On Fri, Sep 12, 2025 at 11:18 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > > On Thu, Sep 11, 2025 at 9:08 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > >
> > > > > For the shutdown sequence, can't we think of resetting effective_wal
> > > > > after a restart?
> > > >
> > > > Does it mean that effective_wal_level keeps 'logical' until the next
> > > > server starts?
> > > >
> > >
> > > Yes, IIUC, effective_wal_level is anyway a derived value based on
> > > current wal_level and presence of logical slots. So, what will be the
> > > impact if it is not accurate at shutdown?
> >
> > I think there won't be an impact at shutdown time. I would rather be
> > concerned that such behavior could confuse users. I think it would not
> > be a rare situation where users enable and disable logical decoding by
> > creating and dropping a temporary slot. If we keep effective_wal_level
> > 'logical' in this case, users would want to somehow disable logical
> > decoding as it could have a negative performance impact.
> >
>
> When a user is dropping a temporary slot, we should disable the
> decoding. The lazy behaviour should be for ERROR or session_exit
> cases.

I think it might be worth discussing whether to use lazy behavior in
all cases. There are several advantages:

- It mitigates the risk of connection timeouts during a logical slot
drop or a subscription drop.
- In scenarios involving frequent creation and deletion of logical
slots (such as during initial data synchronization), it could
potentially avoid the issue of a frequent switch on and off.

On the other hand, drawbacks are:

- users would have to wait for effective_wal_level to get decreased to
'replica' somehow.
- makes the checkpointer more busy in addition to its checkpointing job.
- it could take a longer time to disable logical decoding if the
checkpointer is busy with a checkpointing job.
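
To make the trade-off concrete, here is a minimal C sketch of the lazy approach. All names are hypothetical: dropping the last slot only records a request, and a background process (checkpointer or otherwise) acts on it later, re-checking the slot count first.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical shared state for the lazy-disable idea. */
typedef struct LazyDisableSketch
{
    int  n_logical_slots;
    bool logical_decoding_enabled;
    bool disable_requested;
} LazyDisableSketch;

void
create_logical_slot_sketch(LazyDisableSketch *st)
{
    st->n_logical_slots++;
    st->logical_decoding_enabled = true;
}

/* Dropping the last slot only leaves a request; the caller never waits. */
void
drop_logical_slot_sketch(LazyDisableSketch *st)
{
    assert(st->n_logical_slots > 0);
    st->n_logical_slots--;
    if (st->n_logical_slots == 0)
        st->disable_requested = true;
}

/*
 * Called periodically by a background process. Returns true if logical
 * decoding was disabled by this call. A request that became stale because
 * a slot was created in the meantime is simply discarded.
 */
bool
process_disable_request_sketch(LazyDisableSketch *st)
{
    if (!st->disable_requested)
        return false;

    st->disable_requested = false;
    if (st->n_logical_slots > 0)
        return false;

    st->logical_decoding_enabled = false;
    return true;
}
```

A drop followed quickly by a create never toggles the status, which corresponds to the second advantage above. The cost is that the status stays enabled until the background process gets around to the request.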

What do you think?

>
> > There would
> > be two ways for users to change it to 'replica': restart the server or
> > create and drop a logical slot again.
> >
>
> If we do the lazy work during the checkpoint then they can perform the
> checkpoint command.

Right.

>
> > On the other hand, for users who
> > dropped a non-temporary logical slot without an error or dropped the
> > non-last temporary slot, logical decoding is disabled without other
> > manual interventions. It could be pretty hard to assess the situation,
> > resulting in having users always checking effective_wal_level after
> > dropping a logical slot and doing extra steps to make the
> > effective_wal_level 'replica'.
> >
>
> When the last slot is dropped, anyway, users won't be able to perform
> any decoding. Do you mean that they want to know whether logical_wal
> is still being recorded? If so, then checking effective_wal_level
> would be the way.

I think the situation that users would want to avoid is that the
logical decoding is enabled (therefore writing logical_wal) even when
they don't want to use logical decoding because it means the system is
paying unnecessary costs in terms of writing logical_wal. It would not
be a problem if we can ensure that logical decoding is eventually
disabled in a reasonably short time in any case using lazy behavior.
On the other hand, I think it would not be a good user experience if
it's required for users to restart the server or do other manual
interventions in some specific scenarios in order to disable logical
decoding.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Amit Kapila
Date:
On Tue, Sep 16, 2025 at 11:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Sep 16, 2025 at 1:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > When a user is dropping a temporary slot, we should disable the
> > decoding. The lazy behaviour should be for ERROR or session_exit
> > cases.
>
> I think it might be worth discussing whether to use lazy behavior in
> all cases.
>

Agreed.

> There are several advantages:
>
> - It mitigates the risk of connection timeouts during a logical slot
> drop or a subscription drop.
> - In scenarios involving frequent creation and deletion of logical
> slots (such as during initial data synchronization), it could
> potentially avoid the issue of a frequent switch on and off.
>
> On the other hand, drawbacks are:
>
> - users would have to wait for effective_wal_level to get decreased to
> 'replica' somehow.
> - makes the checkpointer more busy in addition to its checkpointing job.
> - it could take a longer time to disable logical decoding if the
> checkpointer is busy with a checkpointing job.
>

This last drawback could hurt the performance of systems for a
longer time when that was really not required. It should be okay to
use lazy behavior in all cases when we can do that in a predictable
time. The other background process to consider doing lazy processing
is the launcher whose role is to launch apply workers for subscription
and maintain a conflict_slot (if required). Now, because disabling
logical_info could also take longer time in worst cases, the
launcher's own tasks can become unpredictable. Also, if tomorrow, we
decide to support dynamically changing wal_level from minimal to some
upper level, the launcher won't be the appropriate process.

The other idea could be to have a new auxiliary process to disable
logical_info lazily. It is arguable if we just have a separate process
for this purpose but we have previously discussed some other tasks for
such a process like removal of old_serialized_snapshots and
old_logical_rewrite_map files. See [1]. If we agree to have a
separate process for this purpose then disabling logical_info in all
cases sounds okay to me.

[1] - https://www.postgresql.org/message-id/20230217234344.GA3357392%40nathanxps13

--
With Regards,
Amit Kapila.



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Wed, Sep 17, 2025 at 4:19 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Tue, Sep 16, 2025 at 11:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Tue, Sep 16, 2025 at 1:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > > When a user is dropping a temporary slot, we should disable the
> > > decoding. The lazy behaviour should be for ERROR or session_exit
> > > cases.
> >
> > I think it might be worth discussing whether to use lazy behavior in
> > all cases.
> >
>
> Agreed.
>
> > There are several advantages:
> >
> > - It mitigates the risk of connection timeouts during a logical slot
> > drop or a subscription drop.
> > - In scenarios involving frequent creation and deletion of logical
> > slots (such as during initial data synchronization), it could
> > potentially avoid the issue of a frequent switch on and off.
> >
> > On the other hand, drawbacks are:
> >
> > - users would have to wait for effective_wal_level to get decreased to
> > 'replica' somehow.
> > - makes the checkpointer more busy in addition to its checkpointing job.
> > - it could take a longer time to disable logical decoding if the
> > checkpointer is busy with a checkpointing job.
> >
>
> This last point in drawback could hurt performance of systems for a
> longer time when that was really not required. It should be okay to
> use lazy behavior in all cases when we can do that in a predictable
> time.

Agreed.

If we use the lazy behavior in ERROR or session_exit cases, we would
have these drawbacks anyway. But assuming it won't happen frequently
in practice, we can live with that.

> The other background process to consider doing lazy processing
> is the launcher whose role is to launch apply workers for subscription
> and maintain a conflict_slot (if required). Now, because disabling
> logical_info could also take longer time in worst cases, the
> launcher's own tasks can become unpredictable. Also, if tomorrow, we
> decide to support dynamically changing wal_level from minimal to some
> upper level, the launcher won't be the appropriate process.

Right. Also, we don't launch the launcher process when
max_logical_replication_workers == 0. It should be >0 on the
subscriber but might not be on the publisher.

>
> The other idea could be to have a new auxiliary process to disable
> logical_info lazily. It is arguable if we just have a separate process
> for this purpose but we have previously discussed some other tasks for
> such a process like removal of old_serialized_snapshots and
> old_logical_rewrite_map files. See [1]. If we agree to have a
> separate process for this purpose then disabling logical_info in all
> cases sounds okay to me.

Yeah, the custodian worker would be one solution. But please refer to
subsequent discussions[1][2]; there might not be other tasks to
delegate to the custodian worker than this logical decoding
deactivation, and it might not be optimal to have a single worker that
is responsible for all custodian works. Actually we've discussed a
similar idea on this thread and I drafted a patch[3] that utilizes
bgworkers to do internal tasks in the background in a
one-task-per-one-worker manner.

It requires more discussion anyway if we want to go with this
direction. I think we can start with using lazy behavior in ERROR or
session_exit cases (assuming it won't happen frequently in practice),
and consider using lazy behavior in other cases if it's really
preferable.

Regards,

[1] https://www.postgresql.org/message-id/1058306.1680467858%40sss.pgh.pa.us
[2] https://www.postgresql.org/message-id/20230402184226.kkjplqvqu6utvzbt%40awork3.anarazel.de
[3] https://www.postgresql.org/message-id/CAD21AoCPc%2BpEgb0pJeiS2CU39ad8VW-10Ze7Uii%3D1RRjfgQ0uw%40mail.gmail.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Amit Kapila
Date:
On Wed, Sep 17, 2025 at 10:24 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Wed, Sep 17, 2025 at 4:19 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> > On Tue, Sep 16, 2025 at 11:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Tue, Sep 16, 2025 at 1:30 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > > When a user is dropping a temporary slot, we should disable the
> > > > decoding. The lazy behaviour should be for ERROR or session_exit
> > > > cases.
> > >
> > > I think it might be worth discussing whether to use lazy behavior in
> > > all cases.
> > >
> >
> > Agreed.
> >
> > > There are several advantages:
> > >
> > > - It mitigates the risk of connection timeouts during a logical slot
> > > drop or a subscription drop.
> > > - In scenarios involving frequent creation and deletion of logical
> > > slots (such as during initial data synchronization), it could
> > > potentially avoid the issue of a frequent switch on and off.
> > >
> > > On the other hand, drawbacks are:
> > >
> > > - users would have to wait for effective_wal_level to get decreased to
> > > 'replica' somehow.
> > > - makes the checkpointer more busy in addition to its checkpointing job.
> > > - it could take a longer time to disable logical decoding if the
> > > checkpointer is busy with a checkpointing job.
> > >
> >
> > This last point in drawback could hurt performance of systems for a
> > longer time when that was really not required. It should be okay to
> > use lazy behavior in all cases when we can do that in a predictable
> > time.
>
> Agreed.
>
> If we use the lazy behavior in ERROR or session_exit cases, we would
> have these drawbacks anyway. But assuming it won't happen frequently
> in practice, we can live with that.
>
> > The other background process to consider doing lazy processing
> > is the launcher whose role is to launch apply workers for subscription
> > and maintain a conflict_slot (if required). Now, because disabling
> > logical_info could also take longer time in worst cases, the
> > launcher's own tasks can become unpredictable. Also, if tomorrow, we
> > decide to support dynamically changing wal_level from minimal to some
> > upper level, the launcher won't be the appropriate process.
>
> Right. Also, we don't launch the launcher process when
> max_logical_replication_workers == 0. It should be >0 on the
> subscriber but might not be on the publisher.
>
> >
> > The other idea could be to have a new auxiliary process to disable
> > logical_info lazily. It is arguable if we just have a separate process
> > for this purpose but we have previously discussed some other tasks for
> > such a process like removal of old_serialized_snapshots and
> > old_logical_rewrite_map files. See [1]. If we agree to have a
> > separate process for this purpose then disabling logical_info in all
> > cases sounds okay to me.
>
> Yeah, the custodian worker would be one solution. But please refer to
> subsequent discussions[1][2];
>

I think Tom's idea of spawning the worker on an as-needed basis has some
use here: for example, during drop_slot, we can launch the worker to
complete this task and then exit, to ameliorate the risk of
connection_timeout for drop subscription cases. However, we can consider
such ideas as iterative improvements as well.

> there might not be other tasks to
> delegate to the custodian worker than this logical decoding
> deactivation, and it might not be optimal to have a single worker that
> is responsible for all custodian works. Actually we've discussed a
> similar idea on this thread and I drafted a patch[3] that utilizes
> bgworkers to do internal tasks in the background in a
> one-task-per-one-worker manner.
>
> It requires more discussion anyway if we want to go with this
> direction. I think we can start with using lazy behavior in ERROR or
> session_exit cases (assuming it won't happen frequently in practice),
> and consider using lazy behavior in other cases if it's really
> preferable.
>

Fair enough. So, let's proceed with this plan (use lazy behavior in
ERROR and session_exit cases) and see how it works. BTW, we also need
to consider ERROR cases when the slot is dropped but we failed to
disable the logical_info due to any random ERROR.

--
With Regards,
Amit Kapila.



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Ashutosh Bapat
Date:
On Fri, Sep 12, 2025 at 2:26 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Fri, Sep 12, 2025 at 9:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > One thing related to this which needs a discussion is after this
> > change, it is possible that part of the transaction contains
> > additional logical_wal_info. I couldn't think of a problem due to this
> > but users using pg_waldump or other WAL reading utilities could
> > question this. One possibility is that we always start including
> > logical_wal_info for the next new transaction but not sure if that is
> > required. It would be good if other people involved in the discussion
> > or otherwise could share their opinion on this point.
> >
>
> AFAIR, logical info is a separate section in a WAL record, and there
> is no marker which says "WAL will contain logical info henceforth".
> So the utilities should be checking for the existence of such info
> before reading it. So I guess it should be ok. Some extra sensitive
> utilities may expect that once a WAL record has logical info, all the
> succeeding WAL records will have it. They may find it troublesome that
> WAL records with and without logical info are interleaved. Generally,
> I would prefer that presence/absence of logical info changes at
> transaction boundaries, but we will still have interleaving WAL
> records. So I doubt how much that matters.
>
> Sorry for jumping late in the discussion. I have a few comments,
> mostly superficial ones. I am yet to take a deeper look at the
> synchronization logic.

I started looking at the synchronization logic but stumbled at

@@ -5100,6 +5139,7 @@ BootStrapXLOG(uint32 data_checksum_version)
checkPoint.ThisTimeLineID = BootstrapTimeLineID;
checkPoint.PrevTimeLineID = BootstrapTimeLineID;
checkPoint.fullPageWrites = fullPageWrites;
+ checkPoint.logicalDecodingEnabled = IsLogicalDecodingEnabled();

At the time of bootstrapping, logical decoding is solely dependent on
the boot_val of wal_level as there will not be any logical slots.
Above code however does not make this clear. If we were to change the
boot value of wal_level to logical this leads to a misleading
CHECKPOINT_SHUTDOWN record being added at the time of bootstrap like
below.
rmgr: XLOG len (rec/tot): 122/ 122, tx: 0, lsn: 0/01000028, prev
0/00000000, desc: CHECKPOINT_SHUTDOWN redo 0/01000028; tli 1; prev tli
1; fpw true; wal_level logical; logical decoding false; xid 0:3; oid
10000; multi 1; offset 0; oldest xid 3 in DB 1; oldest multi 1 in DB
1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0;
shutdown

This soon gets corrected by the following WAL record
rmgr: XLOG len (rec/tot): 27/ 27, tx: 0, lsn: 0/010000A8, prev
0/01000028, desc: LOGICAL_DECODING_STATUS_CHANGE true

So beyond misleading a code reader or someone who is reading the WAL,
this does not have any functional impact. But maybe we should consider
making this a bit more clear by setting
checkPoint.logicalDecodingEnabled based on wal_level in
BootStrapXLOG(). Whether we change the code or not, I think we should
add a comment to explain this code.


--
Best Wishes,
Ashutosh Bapat



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Fri, Sep 12, 2025 at 1:56 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Fri, Sep 12, 2025 at 9:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > One thing related to this which needs a discussion is after this
> > change, it is possible that part of the transaction contains
> > additional logical_wal_info. I couldn't think of a problem due to this
> > but users using pg_waldump or other WAL reading utilities could
> > question this. One possibility is that we always start including
> > logical_wal_info for the next new transaction but not sure if that is
> > required. It would be good if other people involved in the discussion
> > or otherwise could share their opinion on this point.
> >
>
> AFAIR, logical info is a separate section in a WAL record, and there
> is no marker which says "WAL will contain logical info henceforth".
> So the utilities should be checking for the existence of such info
> before reading it. So I guess it should be ok. Some extra sensitive
> utilities may expect that once a WAL record has logical info, all the
> succeeding WAL records will have it. They may find it troublesome that
> WAL records with and without logical info are interleaved. Generally,
> I would prefer that presence/absence of logical info changes at
> transaction boundaries, but we will still have interleaving WAL
> records. So I doubt how much that matters.
>
> Sorry for jumping late in the discussion. I have a few comments,
> mostly superficial ones. I am yet to take a deeper look at the
> synchronization logic.

Thank you for reviewing the patch!

>
> <sect2 id="logicaldecoding-replication-slots">
> @@ -328,8 +362,7 @@ postgres=# select * from
> pg_logical_slot_get_changes('regression_slot', NULL, NU
> that could be needed by the logical decoding on the standby (as it does
> not know about the <literal>catalog_xmin</literal> on the standby).
> Existing logical slots on standby also get invalidated if
> - <varname>wal_level</varname> on the primary is reduced to less than
> - <literal>logical</literal>.
> + logical decoding becomes disabled on the primary.
>
> s/becomes disabled/is disabled/ or /gets disabled/. Given that logical
> decoding can be disabled in multiple ways, it's better to add a
> reference here to a section which explains what disabling logical
> decoding means.
>
> <listitem>
> <para>
> <literal>wal_level_insufficient</literal> means that the
> - primary doesn't have a <xref linkend="guc-wal-level"/> sufficient to
> - perform logical decoding. It is set only for logical slots.
> + logical decoding is disabled on primary due to insufficient
> + <xref linkend="guc-wal-level"/> or no logical slots. It is set only
> + for logical slots.
>
> It may not be apparent to the users that insufficient wal_level means
> 'minimal' here. It will be better if we just mention logical decoding
> is disabled on primary and refer to a section which explains what
> disabling logical decoding means.

Agreed with the above points.

>
> *
> * Skip this if we're taking a full-page image of the new page, as we
> * don't include the new tuple in the WAL record in that case. Also
> - * disable if wal_level='logical', as logical decoding needs to be able to
> - * read the new tuple in whole from the WAL record alone.
> + * disable if logical decoding is enabled, as logical decoding needs to be
> + * able to read the new tuple in whole from the WAL record alone.
> */
>
> Not the fault of this patch, but I find the comment to be slightly out of
> sync with the code. The code actually checks whether the logical
> decoding is enabled and whether the relation requires WAL to be logged
> for logical decoding. The difference is subtle, but it did cause me a
> bit of confusion when I read the code. Please consider rephrasing the
> comment while you are modifying it.

Okay, I'll try to rephrase it.

> if (oldbuf == newbuf && !need_tuple_data &&
> !XLogCheckBufferNeedsBackup(newbuf))
> @@ -9057,8 +9057,8 @@ log_heap_update(Relation reln, Buffer oldbuf,
> /*
> * Perform XLogInsert of an XLOG_HEAP2_NEW_CID record
> *
> - * This is only used in wal_level >= WAL_LEVEL_LOGICAL, and only for catalog
> - * tuples.
> + * This is only used when effective WAL level is logical, and only for
>
> Given that the earlier comment used GUC name, it's better to use the
> GUC name "effective_wal_level" here.

Will fix.

>
>
> case XLOG_PARAMETER_CHANGE:
> +
> + /*
> + * Even if wal_level on the primary got decreased to 'replica' it
> + * doesn't necessarily mean to disable the logical decoding as
> + * long as we have at least one logical slot. So we don't check
> + * the logical decoding availability here but do in
> + * XLOG_LOGICAL_DECODING_STATUS_CHANGE case.
> + */
>
> The earlier code checked for wal_level < WAL_LEVEL_LOGICAL, which
> includes the case when wal_level is 'minimal'. I don't see the new
> code handling 'minimal' case here. Am I missing something? Do we need
> a comment here which specifically mentions "minimal" case

Good point. I think it's not a problem as long as we write
STATUS_CHANGE record with logical_decoding=false before
PARAMETER_CHANGE record, but it would be more robust to handle the
case where wal_level gets decreased to 'minimal' or 'replica' there.

> grammar "Even if wal_level on the primary was lowered to 'replica', as
> long as there is at least one logical slot, the logical decoding
> remains enabled. ... " also ... do so in ... .

Will fix.

>
> + * The module maintains separate controls of two aspects: writing information
> + * required by logical decoding to WAL records and utilizing logical decoding
> + * itself, controlled by LogicalDecodingCtl->xlog_logical_info and
> + * ->logical_decoding_enabled fields respectively. The activation process of
> + * logical decoding involves several steps, beginning with maintaining logical
> + * decoding in a disabled state while incrementing the effective WAL level to
> + * its 'logical' equivalent. This change is reflected in the read-only
> + * effective_wal_level GUC parameter. The process includes necessary
> + * synchronization to ensure all processes adapt to the new effective WAL
> + * level before logical decoding is fully enabled. Deactivation follows a
> + * similarly careful, multi-step process in the reverse order.
> + *
>
> I was expecting that the step-by-step process would be described in a
> README. But given that we have this detailed comment here, it may be
> good to have the step-by-step process described here itself.
>
> I see some comments use "effective_wal_level is logical" and some
> mention "logical decoding is enabled" depending upon the context. I
> think it's important to differentiate between these two given that
> it's possible to find the system in a state where
> effective_wal_level is 'logical' but logical decoding is disabled. But
> as a first reader, this did cause me some confusion. Above paragraph
> is a good place to make that distinction clear. The description of
> step-by-step process will make things clearer.

Agreed. Will update the comments while considering these differences.

>
> + /* cannot change while ReplicationSlotCtlLock is held */
> + if (!s->in_use)
> + continue;
> +
> + /* NB: intentionally counting invalidated slots */
>
> Explain the reason in the comment.

Will add some comments to the function header comment.

>
> +
> +# Cleanup all existing slots and start the concurrency test.
> +$primary->safe_psql('postgres',
> + qq[select pg_drop_replication_slot('test_slot')]);
>
> We should add effective wal level test. At a later point we have a
> test for this but it would be good to make sure that the effective wal
> level is replica at this stage in the test as well.
>
> test_wal_level($primary, "replica|replica", "effective_wal_level reset
> to 'replica' after dropping all logical slots");
>
> +
> +# Wait for the logical slot 'test_slot' has been created.
>
> Should be "Wait for the logical slot 'test_slot' to be created" or
> "Check that the logical slot 'test_slot' has been created".

Will fix.

>
> +# Check if the standby's effective_wal_level should be 'logical' in spite
> +# of wal_level being 'replica'.
> +test_wal_level($standby1, "replica|logical",
> + "effective_wal_level='logical' on standby");
>
> Do we have a test to verify that a logical replication slot can not be
> created on a standby whose primary does *not* have effective_wal_level
> 'logical'?
>

I don't think so; I will add that test.

> +
> +# Create a logical slot on the standby, which should be succeeded
>
> grammar: ..., which should succeed OR better "Creating a logical slot
> on standby should succeed".
>
> +# as the primary enables it.
>
> as the primary has logical decoding enabled.

Will fix.

>
> +# Check if the logical decoding is not enabled on the standby4.
> +test_wal_level($standby4, "logical|replica",
> + "standby's effective_wal_level got decreased to 'replica'");
> +$standby4->safe_psql('postgres',
> + qq[select pg_drop_replication_slot('standby4_slot')]);
> +
>
> Instead of dropping the slot, if we create another slot on the
> primary, what happens to the invalidated replication slot? Does it
> remain invalidated? Have we covered this scenario in tests?

I believe that the slot remains invalidated. Adding such a test case
sounds good to me.

> +
> +done_testing();
>
> What happens if we create a logical slot when wal_level is 'minimal'?
> Do we have a test for that?

I'll consider more tests involving wal_level='minimal'.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Fri, Sep 19, 2025 at 7:45 AM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Fri, Sep 12, 2025 at 2:26 PM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Fri, Sep 12, 2025 at 9:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > >
> > >
> > > One thing related to this which needs a discussion is after this
> > > change, it is possible that part of the transaction contains
> > > additional logical_wal_info. I couldn't think of a problem due to this
> > > but users using pg_waldump or other WAL reading utilities could
> > > question this. One possibility is that we always start including
> > > logical_wal_info for the next new transaction but not sure if that is
> > > required. It would be good if other people involved in the discussion
> > > or otherwise could share their opinion on this point.
> > >
> >
> > AFAIR, logical info is a separate section in a WAL record, and there
> > is not marker which says "WAL will contain logical info henceforth".
> > So the utilities should be checking for the existence of such info
> > before reading it. So I guess it should be ok. Some extra sensitive
> > utilities may expect that once a WAL record has logical info, all the
> > succeeding WAL records will have it. They may find it troublesome that
> > WAL records with and without logical info are interleaved. Generally,
> > I would prefer that presence/absence of logical info changes at
> > transaction boundaries, but we will still have interleaving WAL
> > records. So I doubt how much that matters.
> >
> > Sorry for jumping late in the discussion. I have a few comments,
> > mostly superficial ones. I am yet to take a deeper look at the
> > synchronization logic.
>
> I started looking at the synchronization logic but stumbled at
>
> @@ -5100,6 +5139,7 @@ BootStrapXLOG(uint32 data_checksum_version)
> checkPoint.ThisTimeLineID = BootstrapTimeLineID;
> checkPoint.PrevTimeLineID = BootstrapTimeLineID;
> checkPoint.fullPageWrites = fullPageWrites;
> + checkPoint.logicalDecodingEnabled = IsLogicalDecodingEnabled();
>
> At the time of bootstrapping, logical decoding is solely dependent on
> the boot_val of wal_level as there will not be any logical slots.
> Above code however does not make this clear. If we were to change the
> boot value of wal_level to logical this leads to a misleading
> CHECKPOINT_SHUTDOWN record being added at the time of bootstrap like
> below.
> rmgr: XLOG len (rec/tot): 122/ 122, tx: 0, lsn: 0/01000028, prev
> 0/00000000, desc: CHECKPOINT_SHUTDOWN redo 0/01000028; tli 1; prev tli
> 1; fpw true; wal_level logical; logical decoding false; xid 0:3; oid
> 10000; multi 1; offset 0; oldest xid 3 in DB 1; oldest multi 1 in DB
> 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0;
> shutdown
>
> This soon gets corrected by the following WAL record
> rmgr: XLOG len (rec/tot): 27/ 27, tx: 0, lsn: 0/010000A8, prev
> 0/01000028, desc: LOGICAL_DECODING_STATUS_CHANGE true
>
> So beyond misleading a code reader or someone who is reading the WAL,
> this does not have any functional impact. But maybe we should consider
> making this a bit more clear by setting
> checkPoint.logicalDecodingEnabled based on wal_level in
> BootStrapXLOG(). Whether we change the code or not, I think we should
> add a comment to explain this code.

I agree that calling IsLogicalDecodingEnabled() in BootStrapXLOG()
could be quite confusing. I think we can directly set false there and
add some comments for those who try to change the default wal_level
value.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Ashutosh Bapat
Date:
On Fri, Sep 19, 2025 at 10:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Fri, Sep 19, 2025 at 7:45 AM Ashutosh Bapat
> <ashutosh.bapat.oss@gmail.com> wrote:
> >
> > On Fri, Sep 12, 2025 at 2:26 PM Ashutosh Bapat
> > <ashutosh.bapat.oss@gmail.com> wrote:
> > >
> > > On Fri, Sep 12, 2025 at 9:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > >
> > > >
> > > > One thing related to this which needs a discussion is after this
> > > > change, it is possible that part of the transaction contains
> > > > additional logical_wal_info. I couldn't think of a problem due to this
> > > > but users using pg_waldump or other WAL reading utilities could
> > > > question this. One possibility is that we always start including
> > > > logical_wal_info for the next new transaction but not sure if that is
> > > > required. It would be good if other people involved in the discussion
> > > > or otherwise could share their opinion on this point.
> > > >
> > >
> > > AFAIR, logical info is a separate section in a WAL record, and there
> > > is not marker which says "WAL will contain logical info henceforth".
> > > So the utilities should be checking for the existence of such info
> > > before reading it. So I guess it should be ok. Some extra sensitive
> > > utilities may expect that once a WAL record has logical info, all the
> > > succeeding WAL records will have it. They may find it troublesome that
> > > WAL records with and without logical info are interleaved. Generally,
> > > I would prefer that presence/absence of logical info changes at
> > > transaction boundaries, but we will still have interleaving WAL
> > > records. So I doubt how much that matters.
> > >
> > > Sorry for jumping late in the discussion. I have a few comments,
> > > mostly superficial ones. I am yet to take a deeper look at the
> > > synchronization logic.
> >
> > I started looking at the synchronization logic but stumbled at
> >
> > @@ -5100,6 +5139,7 @@ BootStrapXLOG(uint32 data_checksum_version)
> > checkPoint.ThisTimeLineID = BootstrapTimeLineID;
> > checkPoint.PrevTimeLineID = BootstrapTimeLineID;
> > checkPoint.fullPageWrites = fullPageWrites;
> > + checkPoint.logicalDecodingEnabled = IsLogicalDecodingEnabled();
> >
> > At the time of bootstrapping, logical decoding is solely dependent on
> > the boot_val of wal_level as there will not be any logical slots.
> > Above code however does not make this clear. If we were to change the
> > boot value of wal_level to logical this leads to a misleading
> > CHECKPOINT_SHUTDOWN record being added at the time of bootstrap like
> > below.
> > rmgr: XLOG len (rec/tot): 122/ 122, tx: 0, lsn: 0/01000028, prev
> > 0/00000000, desc: CHECKPOINT_SHUTDOWN redo 0/01000028; tli 1; prev tli
> > 1; fpw true; wal_level logical; logical decoding false; xid 0:3; oid
> > 10000; multi 1; offset 0; oldest xid 3 in DB 1; oldest multi 1 in DB
> > 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0;
> > shutdown
> >
> > This soon gets corrected by the following WAL record
> > rmgr: XLOG len (rec/tot): 27/ 27, tx: 0, lsn: 0/010000A8, prev
> > 0/01000028, desc: LOGICAL_DECODING_STATUS_CHANGE true
> >
> > So beyond misleading a code reader or someone who is reading the WAL,
> > this does not have any functional impact. But maybe we should consider
> > making this a bit more clear by setting
> > checkPoint.logicalDecodingEnabled based on wal_level in
> > BootStrapXLOG(). Whether we change the code or not, I think we should
> > add a comment to explain this code.
>
> I agree that calling IsLogicalDecodingEnabled() in BootStrapXLOG()
> could be quite confusing. I think we can directly set false there and
> add some comments for those who try to change the default wal_level
> value.

Or just set the value based on the wal_level.
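
A minimal sketch of that alternative, for illustration only. It assumes a stand-in WalLevel enum whose ordering mirrors PostgreSQL's (minimal < replica < logical); the helper name bootstrap_logical_decoding_enabled is hypothetical and is not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for PostgreSQL's WalLevel enum; only the ordering matters here. */
typedef enum WalLevel
{
    WAL_LEVEL_MINIMAL = 0,
    WAL_LEVEL_REPLICA,
    WAL_LEVEL_LOGICAL
} WalLevel;

/*
 * Hypothetical helper: at bootstrap there are no logical slots yet, so the
 * logical decoding status recorded in the first checkpoint can depend only
 * on wal_level.
 */
static bool
bootstrap_logical_decoding_enabled(WalLevel wal_level)
{
    return wal_level >= WAL_LEVEL_LOGICAL;
}
```

With the current boot value of 'replica' this yields false, which matches what setting false directly would record; it would only start to differ if the boot value were changed to 'logical'.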

--
Best Wishes,
Ashutosh Bapat



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Sun, Sep 21, 2025 at 8:40 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
>
> On Fri, Sep 19, 2025 at 10:49 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Fri, Sep 19, 2025 at 7:45 AM Ashutosh Bapat
> > <ashutosh.bapat.oss@gmail.com> wrote:
> > >
> > > On Fri, Sep 12, 2025 at 2:26 PM Ashutosh Bapat
> > > <ashutosh.bapat.oss@gmail.com> wrote:
> > > >
> > > > On Fri, Sep 12, 2025 at 9:38 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > >
> > > > >
> > > > > One thing related to this which needs a discussion is after this
> > > > > change, it is possible that part of the transaction contains
> > > > > additional logical_wal_info. I couldn't think of a problem due to this
> > > > > but users using pg_waldump or other WAL reading utilities could
> > > > > question this. One possibility is that we always start including
> > > > > logical_wal_info for the next new transaction but not sure if that is
> > > > > required. It would be good if other people involved in the discussion
> > > > > or otherwise could share their opinion on this point.
> > > > >
> > > >
> > > > AFAIR, logical info is a separate section in a WAL record, and there
> > > > is not marker which says "WAL will contain logical info henceforth".
> > > > So the utilities should be checking for the existence of such info
> > > > before reading it. So I guess it should be ok. Some extra sensitive
> > > > utilities may expect that once a WAL record has logical info, all the
> > > > succeeding WAL records will have it. They may find it troublesome that
> > > > WAL records with and without logical info are interleaved. Generally,
> > > > I would prefer that presence/absence of logical info changes at
> > > > transaction boundaries, but we will still have interleaving WAL
> > > > records. So I doubt how much that matters.
> > > >
> > > > Sorry for jumping late in the discussion. I have a few comments,
> > > > mostly superficial ones. I am yet to take a deeper look at the
> > > > synchronization logic.
> > >
> > > I started looking at the synchronization logic but stumbled at
> > >
> > > @@ -5100,6 +5139,7 @@ BootStrapXLOG(uint32 data_checksum_version)
> > > checkPoint.ThisTimeLineID = BootstrapTimeLineID;
> > > checkPoint.PrevTimeLineID = BootstrapTimeLineID;
> > > checkPoint.fullPageWrites = fullPageWrites;
> > > + checkPoint.logicalDecodingEnabled = IsLogicalDecodingEnabled();
> > >
> > > At the time of bootstrapping, logical decoding is solely dependent on
> > > the boot_val of wal_level as there will not be any logical slots.
> > > Above code however does not make this clear. If we were to change the
> > > boot value of wal_level to logical this leads to a misleading
> > > CHECKPOINT_SHUTDOWN record being added at the time of bootstrap like
> > > below.
> > > rmgr: XLOG len (rec/tot): 122/ 122, tx: 0, lsn: 0/01000028, prev
> > > 0/00000000, desc: CHECKPOINT_SHUTDOWN redo 0/01000028; tli 1; prev tli
> > > 1; fpw true; wal_level logical; logical decoding false; xid 0:3; oid
> > > 10000; multi 1; offset 0; oldest xid 3 in DB 1; oldest multi 1 in DB
> > > 1; oldest/newest commit timestamp xid: 0/0; oldest running xid 0;
> > > shutdown
> > >
> > > This soon gets corrected by the following WAL record
> > > rmgr: XLOG len (rec/tot): 27/ 27, tx: 0, lsn: 0/010000A8, prev
> > > 0/01000028, desc: LOGICAL_DECODING_STATUS_CHANGE true
> > >
> > > So beyond misleading a code reader or someone who is reading the WAL,
> > > this does not have any functional impact. But maybe we should consider
> > > making this a bit more clear by setting
> > > checkPoint.logicalDecodingEnabled based on wal_level in
> > > BootStrapXLOG(). Whether we change the code or not, I think we should
> > > add a comment to explain this code.
> >
> > I agree that calling IsLogicalDecodingEnabled() in BootStrapXLOG()
> > could be quite confusing. I think we can directly set false there and
> > add some comments for those who try to change the default wal_level
> > value.
>
> Or just set the value based on the wal_level.

Agreed.

I've attached the updated patch. It incorporates all the comments I have
received so far and implements lazy disabling of logical decoding. The
lazy path is used only when a process tries to disable logical decoding
during process exit.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
On Tue, Sep 23, 2025 at 3:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
>
> I've attached the updated patch. It incorporates all comments I got so
> far and implements to lazily disable logical decoding. It's used only
> when the process tries to disable logical decoding during process
> exit.
>

I am resuming the review now. I agree with the discussion of lazily
disabling logical decoding on ERROR or process-exit for temp-slot.

A few initial comments:

1)
I see that on standby too, during proc-exit, we set 'pending_disable'.
But it never resets it, as DisableLogicalDecodingIfNecessary is no-op
on standby. Thus the checkpointer keeps attempting to reset it
every time. Do we even need to set it on standby?

Logfile has repeated: 'start completing pending logical decoding
disable request'

2)
+ ereport(LOG,
+ (errmsg("skip disabling logical decoding as during process exit")));

'as' not needed.

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Thu, Sep 25, 2025 at 4:57 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Sep 23, 2025 at 3:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> >
> > I've attached the updated patch. It incorporates all comments I got so
> > far and implements to lazily disable logical decoding. It's used only
> > when the process tries to disable logical decoding during process
> > exit.
> >
>
> I am resuming the review now. I agree with the discussion of lazily
> disabling logical decoding on ERROR or process-exit for temp-slot.
>
> Few  initial comments:

Thank you for the comments!

>
> 1)
> I see that on standby too, during proc-exit, we set 'pending_disable'.
> But it never resets it, as DisableLogicalDecodingIfNecessary is no-op
> on standby. And thus the checkpoint keeps on attempting to reset it
> everytime. Do we even need to set it on standby?
>
> Logfile has repeated: 'start completing pending logical decoding
> disable request'

Ugh, I missed that part. I think that standbys should not delegate the
deactivation to the checkpointer unless the deactivation is actually
required.
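
A rough sketch of the intended behavior, under loud assumptions: the struct below is a simplified stand-in for the shared state, request_disable_at_proc_exit is a hypothetical name, and the sketch takes the simplest reading, in which a process in recovery never sets the flag. The actual patch may still need to set it in some cases (for example around promotion).

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the shared state discussed in this thread. */
typedef struct LogicalDecodingState
{
    bool in_recovery;               /* stand-in for RecoveryInProgress() */
    bool logical_decoding_enabled;
    bool pending_disable;
    int  n_logical_slots;
} LogicalDecodingState;

/*
 * Hypothetical exit-time hook: delegate the disable to the checkpointer only
 * when the checkpointer could actually act on the request.
 */
static void
request_disable_at_proc_exit(LogicalDecodingState *st)
{
    /* Assumption: on a standby nothing would ever clear the request. */
    if (st->in_recovery)
        return;

    /* Nothing to do if already disabled or if logical slots remain. */
    if (!st->logical_decoding_enabled || st->n_logical_slots > 0)
        return;

    st->pending_disable = true;
}
```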

> 2)
> + ereport(LOG,
> + (errmsg("skip disabling logical decoding as during process exit")));
>
> 'as' not needed.

I've fixed the above two points and attached the new version patch.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
On Fri, Sep 26, 2025 at 12:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Sep 25, 2025 at 4:57 AM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Tue, Sep 23, 2025 at 3:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > >
> > > I've attached the updated patch. It incorporates all comments I got so
> > > far and implements to lazily disable logical decoding. It's used only
> > > when the process tries to disable logical decoding during process
> > > exit.
> > >
> >
> > I am resuming the review now. I agree with the discussion of lazily
> > disabling logical decoding on ERROR or process-exit for temp-slot.
> >
> > Few  initial comments:
>
> Thank you for the comments!
>
> >
> > 1)
> > I see that on standby too, during proc-exit, we set 'pending_disable'.
> > But it never resets it, as DisableLogicalDecodingIfNecessary is no-op
> > on standby. And thus the checkpoint keeps on attempting to reset it
> > everytime. Do we even need to set it on standby?
> >
> > Logfile has repeated: 'start completing pending logical decoding
> > disable request'
>
> Ugh, I missed that part. I think that standbys should not delegate the
> deactivation to the checkpointer uless the deactivation is actually
> required.
>
> > 2)
> > + ereport(LOG,
> > + (errmsg("skip disabling logical decoding as during process exit")));
> >
> > 'as' not needed.
>
> I've fixed the above two points and attached the new version patch.
>

Thanks.

1)
Currently, in the existing implementation, if a promotion is in
progress (delay_status_change = true) and, during that time, a process
exits (causing a temporary slot to be released), then on the standby,
we may end up setting pending_disable. As a result, the checkpointer
will have to wait for the transition to complete before it can proceed
with disabling logical decoding (if needed).

a)
This means the checkpoint may be delayed further, depending on how
long it takes for all processes to respond to ProcSignalBarrier().

b)
Additionally, consider the case where the promotion fails midway
(after UpdateLogicalDecodingStatusEndOfRecovery). If the checkpointer
still sees RecoveryInProgress and delay_status_change as true, could
it end up waiting indefinitely for the transition to complete? In my
testing, when promotion fails and the startup process exits, it
usually causes the rest of the processes, including the checkpointer,
to terminate as well. So, it seems that a dangling pending_disable
state may not actually occur on standby in practice.

I believe scenario (b) can't really happen, but I still wanted to
check with you.

I am not sure whether (a) is a real concern. What's your take on it?

2)
As per discussion in [1], there was a proposal to implement lazily
disabling decoding both in ERROR and proc-exit scenarios. But I see it
only implemented in proc-exit scenario. Are we planning to do it for
ERROR as well?

[1]: https://www.postgresql.org/message-id/CAA4eK1JVNbb-OT1PO%3DiOFG1KA__Q83n8cLZoDjF2yA1rZyvCnA%40mail.gmail.com

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Ashutosh Bapat
Date:
On Fri, Sep 26, 2025 at 11:13 AM shveta malik <shveta.malik@gmail.com> wrote:
>

Replying to this message to avoid forking the thread.

Some more comments:
+/*
+ * This function does several kinds of preparation works required to start
+ * the process of logical decoding status change. If the status change is
+ * required, it ensures we can change logical decoding status, setting
+ * LogicalDecodingCtl->status_change_inprogress on, and returns true.
+ * Otherwise, if it's not required or not allowed (e.g., during recovery
+ * or wal_level = 'logical'), it returns false.
+ */
+static bool
+start_logical_decoding_status_change(bool new_status)
+{
+ Assert(!RecoveryInProgress());

The prologue mentions recovery as a case when we can't change status.
Why is there an Assert here, then?

+
+ /* Prepare and start the activation process if it's disabled */
+ if (!start_logical_decoding_status_change(true))

The prologue of start_logical_decoding_status_change() mentions that
the function returns false if status change is not allowed. If the
function returns false because the status change is not allowed, we
are simply returning from EnsureLogicalDecodingEnabled(). The caller
will assume that logical decoding is enabled. Doesn't that look like a
missing case? Maybe the code is correct, but the comments don't
explain the situation well.

If a new replication slot is getting created while the last remaining
slot is getting dropped, looks like we rely on
LogicalDecodingControlLock to serialize the operations. Is that
correct? I don't see an injection point in this code path, so I
imagine there's no direct test for this case.

+void
+DisableLogicalDecodingIfNecessary(bool complete_pending)
+{
+ bool need_wait = false;
+
+ /*
+ * Both complete_pending and proc_exit_inprogress must not be true at the
+ * same time.
+ */

That's evident from the assertion. I feel the comment should explain the reason.

+
+ /* With 'minimal' WAL level, logical decoding is always disabled */
+ if (wal_level == WAL_LEVEL_MINIMAL)
+ return;

Assume a server which has a standby running with wal_level =
'replica'. If we shut it down and restart it with wal_level =
'minimal', that won't be allowed. Hence we don't need to write a
XLOG_LOGICAL_DECODING_STATUS_CHANGE record during recovery in that
case. Do you think this should be documented somewhere? I spent a
couple of hours trying this out before I realized why it is safe to
return from here. Maybe add a test case for it?

ReplicationSlotCleanup(bool synced_only)
{
LWLockRelease(ReplicationSlotControlLock);
+
... snip ...
+ if (dropped_logical && nlogicalslots == 0)
+ DisableLogicalDecodingIfNecessary(false);

Some magic going on here. If dropped_logical is true, at least one of
the loops dropped a logical slot. If nlogicalslots is zero, it means
that the last loop did not find any logical slots. When both the
conditions are true, it means that the function dropped the last
logical slot. Hence it calls DisableLogicalDecodingIfNecessary(). Is
that correct? I think we need some comments explaining this. Also
there's a tiny window between dropping the last logical slot and
calling DisableLogicalDecodingIfNecessary() when another logical slot
could be created. But DisableLogicalDecodingIfNecessary() has checks
to avoid disabling logical decoding in that case. I don't think all of
that is clear from the code changes. Maybe that's because of the way
this function is written in the first place. For example, why do we
need multiple loops here? Why not just one loop that drops all the
slots that need to be dropped? It's not as if the same backend is
going to create new temporary slots while it is cleaning up existing
ones.

Same goes for ReplicationSlotsDropDBSlots(). But there it is at least
possible that concurrent backends may have created a slot, so a loop
there is justified.
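
To make the condition discussed above concrete, here is a single-loop sketch. It is illustrative only: Slot is a simplified stand-in rather than PostgreSQL's ReplicationSlot, the function name is hypothetical, and locking and concurrent slot creation are deliberately not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a replication slot; not PostgreSQL's struct. */
typedef struct Slot
{
    bool in_use;
    bool is_logical;
    bool is_temporary;
} Slot;

/*
 * Drop every temporary slot in one pass and report whether that removed the
 * last logical slot, i.e. the case in which the caller would go on to call
 * DisableLogicalDecodingIfNecessary().
 */
static bool
cleanup_dropped_last_logical_slot(Slot *slots, size_t nslots)
{
    bool dropped_logical = false;
    int  nlogicalslots = 0;

    for (size_t i = 0; i < nslots; i++)
    {
        if (!slots[i].in_use)
            continue;

        if (slots[i].is_temporary)
        {
            if (slots[i].is_logical)
                dropped_logical = true;
            slots[i].in_use = false;    /* "drop" the slot */
        }
        else if (slots[i].is_logical)
            nlogicalslots++;            /* a surviving logical slot */
    }

    return dropped_logical && nlogicalslots == 0;
}
```

The function returns true only when at least one logical slot was dropped and no logical slot survives, which is the combination the patch tests with dropped_logical and nlogicalslots.
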
+ /* Initialize logical info WAL logging state */
+ InitializeProcessXLogLogicalInfo();
+
/*
* Initialize replication slots after pgstat. The exit hook might need to
* drop ephemeral slots, which in turn triggers stats reporting.

Do we need XlogLogicalInfo to be set before initializing replication
slots? Do we need to update the comment here?
--
Best Wishes,
Ashutosh Bapat



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
A few comments:

1)
+ /*
+ * Logical decoding is normally disabled after dropping the last logical
+ * slot, but if it happens during process exit time for example when
+ * releasing a temporary logical slot on an error, the process sets this
+ * flag to true, delegating the checkpointer to disable logical decoding
+ * asynchronously.
+ */
+ bool pending_disable;

I do not see it happening while releasing a temporary logical slot on
an error (without process-exit). Also it happens on clean process-exit
(without hitting any ERROR). We should make the comment more clear.


2)
+ /*
+ * This flag is set to true by the startup process during recovery, to
+ * delay any logical decoding status change attempts until the recovery
+ * actually completes. The flag is set to true only during the recovery by
+ * the startup process. See comments in
+ * start_logical_decoding_status_change() for details.
+ */
+ bool delay_status_change;

The second line in the comment looks repetitive.

3)
+ if (!found)
+ {
+ LogicalDecodingCtl->xlog_logical_info = false;
+ LogicalDecodingCtl->logical_decoding_enabled = false;
+ LogicalDecodingCtl->status_change_inprogress = false;
+ LogicalDecodingCtl->pending_disable = false;
+ LogicalDecodingCtl->delay_status_change = false;
+ ConditionVariableInit(&LogicalDecodingCtl->cv);
+ }

Shall we do MemSet to 0 and then 'ConditionVariableInit' instead of
initializing all the fields to false?

4)
+ * Otherwise, if it's not required or not allowed (e.g., during recovery
+ * or wal_level = 'logical'), it returns false.
+  */
+static bool
+start_logical_decoding_status_change(bool new_status)
+{
+ Assert(!RecoveryInProgress());

We moved the 'recovery' and 'wal_level' checks outside but I think we
missed updating the comments here.

5)
+ /*
+ * When attempting to disable logical decoding, if there is at least one
+ * logical slots we cannot disable it.
+ */

A small correction:
/*
 * When attempting to disable logical decoding, if there is at least one
 * logical slot, we cannot disable it.
 */

6)
+ * and slot creation. To ensure enabling logical decoding the caller

comma missing:
* and slot creation. To ensure enabling logical decoding, the caller

7)
+ if (RecoveryInProgress())
+ {
+ /*
+ * Check if we need to wait for the recovery completion. See the
+ * comments in check_wait_for_recovery_completion() for the reason why
+ * we check it here.
+ */
+ if (!check_wait_for_recovery_completion())
+ return;
+
+ wait_for_recovery_completion();
+ }

It would be helpful to also add that logical decoding changes are not
supported on a standby. Therefore, this function will be a no-op in
that scenario (provided there is no wait needed for recovery
completion).

I think this particular comment somehow got lost in all these iterations.

8)
+void
+DisableLogicalDecodingIfNecessary(bool complete_pending)

'complete_pending' looks a little odd to me. Shall we have 'finish_disable'?

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Thu, Sep 25, 2025 at 10:43 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Fri, Sep 26, 2025 at 12:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Sep 25, 2025 at 4:57 AM shveta malik <shveta.malik@gmail.com> wrote:
> > >
> > > On Tue, Sep 23, 2025 at 3:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > >
> > > >
> > > > I've attached the updated patch. It incorporates all comments I got so
> > > > far and implements to lazily disable logical decoding. It's used only
> > > > when the process tries to disable logical decoding during process
> > > > exit.
> > > >
> > >
> > > I am resuming the review now. I agree with the discussion of lazily
> > > disabling logical decoding on ERROR or process-exit for temp-slot.
> > >
> > > Few  initial comments:
> >
> > Thank you for the comments!
> >
> > >
> > > 1)
> > > I see that on standby too, during proc-exit, we set 'pending_disable'.
> > > But it never resets it, as DisableLogicalDecodingIfNecessary is a no-op
> > > on standby. And thus the checkpoint keeps on attempting to reset it
> > > every time. Do we even need to set it on standby?
> > >
> > > Logfile has repeated: 'start completing pending logical decoding
> > > disable request'
> >
> > Ugh, I missed that part. I think that standbys should not delegate the
> > deactivation to the checkpointer unless the deactivation is actually
> > required.
> >
> > > 2)
> > > + ereport(LOG,
> > > + (errmsg("skip disabling logical decoding as during process exit")));
> > >
> > > 'as' not needed.
> >
> > I've fixed the above two points and attached the new version patch.
> >
>
> Thanks.
>
> 1)
> Currently, in the existing implementation, if a promotion is in
> progress (delay_status_change = true) and, during that time, a process
> exits (causing a temporary slot to be released), then on the standby,
> we may end up setting pending_disable. As a result, the checkpointer
> will have to wait for the transition to complete before it can proceed
> with disabling logical decoding (if needed).
>
> a)
> This means the checkpoint may be delayed further, depending on how
> long it takes for all processes to respond to ProcSignalBarrier().
>
> b)
> Additionally, consider the case where the promotion fails midway
> (after UpdateLogicalDecodingStatusEndOfRecovery). If the checkpointer
> still sees RecoveryInProgress and delay_status_change as true, could
> it end up waiting indefinitely for the transition to complete? In my
> testing, when promotion fails and the startup process exits, it
> usually causes the rest of the processes, including the checkpointer,
> to terminate as well. So, it seems that a dangling pending_disable
> state may not actually occur on standby in practice.
>
> I believe scenario (b) can't really happen, but I still wanted to
> check with you.

I think so. We don't allow the system to continue starting up if the
startup process exits with an error.

> I am not sure if (a) is a real concern — what’s your take on it?

Since the startup process sets delay_status_change=true after creating a
checkpoint in PerformRecoveryXLogAction(), I thought it would not be a
problem in practice even if the checkpointer ends up waiting for
recovery to complete. On the other hand, once a process has delegated
the deactivation to the checkpointer, it would also be okay for the
checkpointer not to disable logical decoding on its first attempt. One
required change would be that, if the checkpointer also skips the
deactivation when delay_status_change=true, the startup process would
need to wake up the checkpointer after completing recovery. Otherwise,
the checkpointer might not disable logical decoding until the next
checkpoint. I wanted to avoid adding this part, but I'm open to other
opinions.

> 2)
> As per discussion in [1], there was a proposal to implement lazily
> disabling decoding both in ERROR and proc-exit scenarios. But I see it
> only implemented in proc-exit scenario. Are we planning to do it for
> ERROR as well?

After more thought, I realized I had missed the fact that we actually
write an ABORT record during process shutdown. ShutdownPostgres(),
which calls AbortOutOfAnyTransaction(), is the last of the
before_shmem_exit callbacks. So it's probably okay to write a
STATUS_CHANGE record to disable logical decoding even during process
shutdown.

As for the race condition at the end of recovery between the startup
process and processes updating the logical decoding status, we use the
delay_status_change flag so that any logical decoding status change
initiated in that window (i.e., between the point where the startup
process sets delay_status_change and the point where recovery completes)
has to wait for the startup process to complete all end-of-recovery
actions. An alternative idea would be to allow processes to write
STATUS_CHANGE records in that window even during recovery, by using
LocalSetXLogInsertAllowed().

If we implement these ideas, we can simplify the patch considerably, as
we would no longer need the lazy behavior or the wait for recovery to
complete. I've attached a PoC patch that can be applied on top of the
v15 patch.

Feedback is very welcome.


Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments

Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
shveta malik
Date:
On Tue, Sep 30, 2025 at 7:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Thu, Sep 25, 2025 at 10:43 PM shveta malik <shveta.malik@gmail.com> wrote:
> >
> > On Fri, Sep 26, 2025 at 12:46 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Thu, Sep 25, 2025 at 4:57 AM shveta malik <shveta.malik@gmail.com> wrote:
> > > >
> > > > On Tue, Sep 23, 2025 at 3:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > > > >
> > > > >
> > > > > I've attached the updated patch. It incorporates all comments I got so
> > > > > far and implements to lazily disable logical decoding. It's used only
> > > > > when the process tries to disable logical decoding during process
> > > > > exit.
> > > > >
> > > >
> > > > I am resuming the review now. I agree with the discussion of lazily
> > > > disabling logical decoding on ERROR or process-exit for temp-slot.
> > > >
> > > > Few  initial comments:
> > >
> > > Thank you for the comments!
> > >
> > > >
> > > > 1)
> > > > I see that on standby too, during proc-exit, we set 'pending_disable'.
> > > > But it never resets it, as DisableLogicalDecodingIfNecessary is a no-op
> > > > on standby. And thus the checkpoint keeps on attempting to reset it
> > > > every time. Do we even need to set it on standby?
> > > >
> > > > Logfile has repeated: 'start completing pending logical decoding
> > > > disable request'
> > >
> > > Ugh, I missed that part. I think that standbys should not delegate the
> > > deactivation to the checkpointer unless the deactivation is actually
> > > required.
> > >
> > > > 2)
> > > > + ereport(LOG,
> > > > + (errmsg("skip disabling logical decoding as during process exit")));
> > > >
> > > > 'as' not needed.
> > >
> > > I've fixed the above two points and attached the new version patch.
> > >
> >
> > Thanks.
> >
> > 1)
> > Currently, in the existing implementation, if a promotion is in
> > progress (delay_status_change = true) and, during that time, a process
> > exits (causing a temporary slot to be released), then on the standby,
> > we may end up setting pending_disable. As a result, the checkpointer
> > will have to wait for the transition to complete before it can proceed
> > with disabling logical decoding (if needed).
> >
> > a)
> > This means the checkpoint may be delayed further, depending on how
> > long it takes for all processes to respond to ProcSignalBarrier().
> >
> > b)
> > Additionally, consider the case where the promotion fails midway
> > (after UpdateLogicalDecodingStatusEndOfRecovery). If the checkpointer
> > still sees RecoveryInProgress and delay_status_change as true, could
> > it end up waiting indefinitely for the transition to complete? In my
> > testing, when promotion fails and the startup process exits, it
> > usually causes the rest of the processes, including the checkpointer,
> > to terminate as well. So, it seems that a dangling pending_disable
> > state may not actually occur on standby in practice.
> >
> > I believe scenario (b) can't really happen, but I still wanted to
> > check with you.
>
> I think so. We don't allow the system to continue starting up if the
> startup process exits with an error.

Okay.

>
> > I am not sure if (a) is a real concern — what’s your take on it?
>
> Since the startup sets delay_status_change=true after creating a
> checkpoint by PerformRecoveryXLogAction(), I thought that it would not
> be a problem in practice even if the checkpointer ends up waiting for
> the recovery to complete. On the other hand, once the process
> delegated the deactivation to the checkpointer, it would also be okay
> not to disable logical decoding at its first attempt. One required
> change would be that if the checkpointer also skips the deactivation
> when delay_status_change=true, the startup would need to wake up the
> checkpointer after completing the recovery. Otherwise, the
> checkpointer might not disable logical decoding until the next
> checkpoint time. I wanted to avoid adding this part but I'm open to
> other opinions.
>

I see your point. But I’ll skip further discussion on this for now and
want to focus on your second point first, because if that can be done,
this won’t be necessary.

> > 2)
> > As per discussion in [1], there was a proposal to implement lazily
> > disabling decoding both in ERROR and proc-exit scenarios. But I see it
> > only implemented in proc-exit scenario. Are we planning to do it for
> > ERROR as well?
>
> After more thoughts, I realized that I missed the fact that we
> actually wrote an ABORT record during the process shutdown.
> ShutdownPostgres() that calls AbortOutOfAnyTransaction() is the last
> callback in before_shmem_exit callbacks. So it's probably okay to
> write STATUS_CHANGE record to disable logical decoding even during
> process shutdown.

Yes, that’s correct, we write as many Abort records as there are open
transactions. And thus IMO, writing the Logical-Decoding status-change
record, which occurs at most once, should be fine. During the disable
process, we emit a PROCSIGNAL_BARRIER but don’t wait for responses
from others, so this should also be acceptable. But let’s see what
others have to say on this.

> As for the race condition at the end of recovery between the startup
> process and processes updating the logical decoding status, we use
> delay_status_change flag so that any logical decoding status change
> initiated in the particular window (i.e., between the startup sets
> delay_status_change and the recovery completes) has to wait for the
> startup to complete all end-of-recovery actions. An alternative idea
> would be that we allow processes to write STATUS_CHANGE records in the
> particular window even during recovery, by using
> LocalSetXLogInsertAllowed().
>

For everyone’s reference, I’m attaching the link to the race condition
we discussed earlier: [1].

To me, allowing status changes during that short window seems better
and simpler than the previous approach of delaying them. But I do have
one concern: Could the standby end up with an incorrect logical
decoding status if, during the promotion (when allow_status_change is
true), a slot is dropped causing the status to be disabled on the
standby, but the promotion doesn’t complete? In that case, upon
restart, since the standby remains in standby mode, it might pick up
the changed status via checkPoint.logicalDecodingEnabled, resulting in
logical decoding being disabled instead of enabled as it is on the
primary.

Is this a possibility? I haven’t had the chance to simulate and verify
this scenario yet.

> If we implement these ideas, we can simplify the patch quite well as
> we no longer need the lazy behavior nor wait for the recovery to
> complete.

Agree, if we can make these ideas work, the patch will be much simpler.

[1]:
https://www.postgresql.org/message-id/OSCPR01MB14966C5E31CA0ACAD07AF8B5FF524A%40OSCPR01MB14966.jpnprd01.prod.outlook.com

thanks
Shveta



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Tue, Sep 30, 2025 at 1:58 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Sep 30, 2025 at 7:28 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Thu, Sep 25, 2025 at 10:43 PM shveta malik <shveta.malik@gmail.com> wrote:
> > > 2)
> > > As per discussion in [1], there was a proposal to implement lazily
> > > disabling decoding both in ERROR and proc-exit scenarios. But I see it
> > > only implemented in proc-exit scenario. Are we planning to do it for
> > > ERROR as well?
> >
> > After more thoughts, I realized that I missed the fact that we
> > actually wrote an ABORT record during the process shutdown.
> > ShutdownPostgres() that calls AbortOutOfAnyTransaction() is the last
> > callback in before_shmem_exit callbacks. So it's probably okay to
> > write STATUS_CHANGE record to disable logical decoding even during
> > process shutdown.
>
> Yes, that’s correct, we write as many Abort records as there are open
> transactions. And thus IMO, writing the Logical-Decoding status-change
> record, which occurs at most once, should be fine. During the disable
> process, we emit a PROCSIGNAL_BARRIER but don’t wait for responses
> from others, so this should also be acceptable. But let’s see what
> others have to say on this.

Thank you for the comment. Agreed.

>
> > As for the race condition at the end of recovery between the startup
> > process and processes updating the logical decoding status, we use
> > delay_status_change flag so that any logical decoding status change
> > initiated in the particular window (i.e., between the startup sets
> > delay_status_change and the recovery completes) has to wait for the
> > startup to complete all end-of-recovery actions. An alternative idea
> > would be that we allow processes to write STATUS_CHANGE records in the
> > particular window even during recovery, by using
> > LocalSetXLogInsertAllowed().
> >
>
> For everyone’s reference, I’m attaching the link to the race condition
> we discussed earlier: [1].
>
> To me, allowing status changes during that short window seems better
> and simpler than the previous approach of delaying them. But I do have
> one concern: Could the standby end up with an incorrect logical
> decoding status if, during the promotion (when allow_status_change is
> true), a slot is dropped causing the status to be disabled on the
> standby, but the promotion doesn’t complete? In that case, upon
> restart, since the standby remains in standby mode, it might pick up
> the changed status via checkPoint.logicalDecodingEnabled, resulting in
> logical decoding being disabled instead of enabled as it is on the
> primary.
>
> Is this a possibility? I haven’t had the chance to simulate and verify
> this scenario yet.

I'll research more failure cases, but I believe the case you mentioned
is safe. If the startup process fails before completing all
end-of-recovery actions during promotion, it raises a FATAL error,
leading to a server shutdown. Also, by the time it calls
UpdateLogicalDecodingStatusEndOfRecovery(), recovery is technically
finished: it has already assigned a new timeline ID, removed the signal
file, and updated the minimum recovery point in the control file.
Therefore, after the server restarts, it doesn't enter standby mode but
works as the primary server with logical decoding disabled, which is
the correct state.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



RE: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Sawada-san,

> If we implement these ideas, we can simplify the patch quite well as
> we no longer need the lazy behavior nor wait for the recovery to
> complete. I've attached a PoC patch that can be applied on top of the
> v15 patch.

In 0002, I found an assertion failure. Steps:

0. There is a streaming replication system and only primary has a logical slot.
1. Attached to a startup process and set a break at UpdateLogicalDecodingStatusEndOfRecovery.
2. Sent a promote signal to the standby and ensured the startup stopped.
3. Established new connection to the standby
4. Attached to the backend process and set a break at create_logical_replication_slot.
5. Tried to create a new slot on the standby and ensured the backend stopped
6. Moved the startup process till WaitForProcSignalBarrier().
7. Moved the backend process till WaitForProcSignalBarrier(). Both processes could go ahead.
8. Moved the backend till ReplicationSlotReserveWal() and restart_lsn was set.
9. Detached from the startup process. Recovery state became "DONE".
10. Detached from the backend. It would crash at xlog_decode().

Some data was obtained by the gdb, see [1].

The direct cause is that the slot's restart_lsn points to a position before the
STATUS_CHANGE(false) record. Per my analysis, ReplicationSlotReserveWal() uses
GetXLogReplayRecPtr(NULL) as the initial decode point, which is the last record
the standby received from the primary. However, the standby can generate an
additional record, STATUS_CHANGE(false) in this case. After the recovery, the
decoder reads the STATUS_CHANGE record, which breaks our assumption.

Per my understanding, this cannot happen with 0001 because EnsureLogicalDecodingEnabled()
waits till RecoveryInProgress() becomes false.

How should we fix the issue? One approach is to remove the Assert() and ereport(ERROR)
instead, but even in that case the slot may not be able to establish a consistent snapshot.
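
The failure described above can be pictured with a small toy model. It is only an illustrative sketch, not PostgreSQL code: the record types, DemoRecord, and demo_first_unexpected_record() are hypothetical names, and the model assumes (based on the assertion condition shown in the backtrace) that the decoder expects to see a STATUS_CHANGE record only while recovery is in progress.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record kinds for the toy WAL stream. */
typedef enum
{
    DEMO_REC_OTHER,
    DEMO_REC_END_OF_RECOVERY,
    DEMO_REC_STATUS_CHANGE
} DemoRecType;

typedef struct
{
    uint64_t    lsn;
    DemoRecType type;
} DemoRecord;

/*
 * Scan the toy WAL from restart_lsn onward, as a decoder would.
 * Returns the LSN of the first STATUS_CHANGE record read while
 * recovery is no longer in progress (the situation that trips the
 * assertion in this model), or 0 if no such record is read.
 */
static uint64_t
demo_first_unexpected_record(const DemoRecord *wal, size_t nrecords,
                             uint64_t restart_lsn, bool in_recovery)
{
    for (size_t i = 0; i < nrecords; i++)
    {
        if (wal[i].lsn < restart_lsn)
            continue;           /* before the slot's start point */
        if (wal[i].type == DEMO_REC_STATUS_CHANGE && !in_recovery)
            return wal[i].lsn;
    }
    return 0;
}
```

With the LSNs from the pg_waldump output in [1], a slot whose restart_lsn is 0/030000F0 reads the record at 0/03000128 after promotion, whereas a slot starting after that record does not.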

[1]
```
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6,
    no_tid=no_tid@entry=0) at pthread_kill.c:44
#1  0x00007f432e08bf43 in __pthread_kill_internal (signo=6, threadid=<optimized out>)
    at pthread_kill.c:78
#2  0x00007f432e03eb46 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007f432e028833 in __GI_abort () at abort.c:79
#4  0x0000000000b96227 in ExceptionalCondition (conditionName=0xdb295d "RecoveryInProgress()",
    fileName=0xdb2928 "../postgres/src/backend/replication/logical/decode.c", lineNumber=174)
    at ../postgres/src/backend/utils/error/assert.c:65
#5  0x000000000090f986 in xlog_decode (ctx=0x2b28430, buf=0x7ffd4a2ebf10)
    at ../postgres/src/backend/replication/logical/decode.c:174
#6  0x000000000090f77f in LogicalDecodingProcessRecord (ctx=0x2b28430, record=0x2b287c8)
    at ../postgres/src/backend/replication/logical/decode.c:116
#7  0x000000000091590b in DecodingContextFindStartpoint (ctx=0x2b28430)
    at ../postgres/src/backend/replication/logical/logical.c:644
#8  0x00000000008fd9ed in create_logical_replication_slot (name=0x2a3f6c8 "slot_sync",
    plugin=0x2a3f768 "test_decoding", temporary=false, two_phase=false, failover=false,
    restart_lsn=0, find_startpoint=true) at ../postgres/src/backend/replication/slotfuncs.c:166
#9  0x00000000008fdb02 in pg_create_logical_replication_slot (fcinfo=0x2b20bd8)
    at ../postgres/src/backend/replication/slotfuncs.c:196
...
(gdb) f 7
#7  0x000000000091590b in DecodingContextFindStartpoint (ctx=0x2b28430)
    at ../postgres/src/backend/replication/logical/logical.c:644
644                     LogicalDecodingProcessRecord(ctx, ctx->reader);
(gdb) printf "%X\n", slot->data.restart_lsn
30000F0
(gdb) q
$ pg_waldump data_sta/pg_wal/000000020000000000000003  | grep "30000F0"
pg_waldump: error: error in WAL record at 0/03000318: invalid record length at 0/03000398: expected at least 24, got 0
rmgr: XLOG        len (rec/tot):     50/    50, tx:          0, lsn: 0/030000F0, prev 0/030000B8, desc: END_OF_RECOVERY tli 2; prev tli 1; time 2025-10-01 18:54:53.971277 JST; wal_level replica
rmgr: XLOG        len (rec/tot):     27/    27, tx:          0, lsn: 0/03000128, prev 0/030000F0, desc: LOGICAL_DECODING_STATUS_CHANGE false
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED




Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Wed, Oct 1, 2025 at 7:01 AM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Sawada-san,
>
> > If we implement these ideas, we can simplify the patch quite well as
> > we no longer need the lazy behavior nor wait for the recovery to
> > complete. I've attached a PoC patch that can be applied on top of the
> > v15 patch.
>
> In 0002, I found an assertion failure. Steps:
>
> 0. There is a streaming replication system and only primary has a logical slot.
> 1. Attached to a startup process and set a break at UpdateLogicalDecodingStatusEndOfRecovery.
> 2. Sent a promote signal to the standby and ensured the startup stopped.
> 3. Established new connection to the standby
> 4. Attached to the backend process and set a break at create_logical_replication_slot.
> 5. Tried to create a new slot on the standby and ensured the backend stopped
> 6. Moved the startup process till WaitForProcSignalBarrier().
> 7. Moved the backend process till WaitForProcSignalBarrier(). Both processes could go ahead.
> 8. Moved the backend till ReplicationSlotReserveWal() and restart_lsn was set.
> 9. Detached from the startup process. Recovery state became "DONE".
> 10. Detached from the backend. It would crash at xlog_decode().
>
> Some data was obtained by the gdb, see [1].

Thank you for testing the patch! Good catch.

> Direct cause is that restart_lsn of the slot points the value before STATUS_CHANGE(false).
> Per my analysis, ReplicationSlotReserveWal() uses GetXLogReplayRecPtr(NULL) as the
> initial decode point, which is the last record the standby receives from the primary.
> However, the standby can generate additional record, STATUS_CHANGE (false) in
> this case. After the recovery, the decoder would read the STATUS_CHANGE record,
> but it breaks our assumption.
>
> Per my understanding, this cannot happen with 0001 because EnsureLogicalDecodingEnabled()
> waits till RecoveryInProgress() becomes false.
>
> How should we fix the issue? One approach is to remove the Assert() and ereport(ERROR),
> but even in the case the slot may not be able to establish the consistent snapshot.

I think the problem stems from the fact that the patch sets
allow_status_change to true before completing all end-of-recovery
actions for the logical decoding status update. I think it should be set
at the end of UpdateLogicalDecodingStatusEndOfRecovery(), i.e., after
WaitForProcSignalBarrier(). That way, logical decoding always starts
after the point where the startup process updated the logical decoding
status.
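
The proposed ordering can be sketched with a toy step machine. This is only an illustration under assumptions: DemoStage, DemoStartupState, and the helper functions are hypothetical names, and each step loosely corresponds to one action of the startup process (writing the status change record, waiting for the barrier, then setting allow_status_change).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stages of the end-of-recovery status update. */
typedef enum
{
    DEMO_STAGE_START,
    DEMO_STAGE_RECORD_WRITTEN,  /* status change record written */
    DEMO_STAGE_BARRIER_DONE,    /* all processes absorbed the barrier */
    DEMO_STAGE_CHANGES_ALLOWED  /* allow_status_change set to true */
} DemoStage;

typedef struct
{
    DemoStage   stage;
    bool        allow_status_change;
} DemoStartupState;

/* Advance the startup process by exactly one step. */
static void
demo_startup_step(DemoStartupState *s)
{
    switch (s->stage)
    {
        case DEMO_STAGE_START:
            s->stage = DEMO_STAGE_RECORD_WRITTEN;
            break;
        case DEMO_STAGE_RECORD_WRITTEN:
            s->stage = DEMO_STAGE_BARRIER_DONE;
            break;
        case DEMO_STAGE_BARRIER_DONE:
            /* Only now may other processes change the status. */
            s->allow_status_change = true;
            s->stage = DEMO_STAGE_CHANGES_ALLOWED;
            break;
        case DEMO_STAGE_CHANGES_ALLOWED:
            break;
    }
}

/* A backend may start a status change only once the flag is set. */
static bool
demo_backend_can_change_status(const DemoStartupState *s)
{
    return s->allow_status_change;
}
```

In this model a backend that runs between any two earlier steps is refused, so it cannot reserve a start point that precedes the startup process's own status update.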

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Amit Kapila
Date:
On Wed, Oct 1, 2025 at 4:31 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Sawada-san,
>
> > If we implement these ideas, we can simplify the patch quite well as
> > we no longer need the lazy behavior nor wait for the recovery to
> > complete. I've attached a PoC patch that can be applied on top of the
> > v15 patch.
>
> In 0002, I found an assertion failure. Steps:
>
> 0. There is a streaming replication system and only primary has a logical slot.
> 1. Attached to a startup process and set a break at UpdateLogicalDecodingStatusEndOfRecovery.
> 2. Sent a promote signal to the standby and ensured the startup stopped.
> 3. Established new connection to the standby
> 4. Attached to the backend process and set a break at create_logical_replication_slot.
> 5. Tried to create a new slot on the standby and ensured the backend stopped
> 6. Moved the startup process till WaitForProcSignalBarrier().
> 7. Moved the backend process till WaitForProcSignalBarrier(). Both processes could go ahead.
> 8. Moved the backend till ReplicationSlotReserveWal() and restart_lsn was set.
> 9. Detached from the startup process. Recovery state became "DONE".
> 10. Detached from the backend. It would crash at xlog_decode().
>
> Some data was obtained by the gdb, see [1].
>
> Direct cause is that restart_lsn of the slot points the value before STATUS_CHANGE(false).
> Per my analysis, ReplicationSlotReserveWal() uses GetXLogReplayRecPtr(NULL) as the
> initial decode point, which is the last record the standby receives from the primary.
> However, the standby can generate additional record, STATUS_CHANGE (false) in
> this case. After the recovery, the decoder would read the STATUS_CHANGE record,
> but it breaks our assumption.
>
> Per my understanding, this cannot happen with 0001 because EnsureLogicalDecodingEnabled()
> waits till RecoveryInProgress() becomes false.
>
> How should we fix the issue? One approach is to remove the Assert() and ereport(ERROR),
> but even in the case the slot may not be able to establish the consistent snapshot.
>

The other point to consider is that during promotion, after
UpdateLogicalDecodingStatusEndOfRecovery(), there are multiple things
that seem necessary to perform before backends are allowed to
write. For example, refer to the comment: "If any of the critical GUCs
have changed, log them before we allow backends to write WAL." I
think the key thing is that before we set the state DB_IN_PRODUCTION in
ControlFile and mark SharedRecoveryState as RECOVERY_STATE_DONE,
backends shouldn't be allowed to write WAL. If we want to make an
exception for writing WAL during slot creation before
RECOVERY_STATE_DONE is set, we should analyze and explain in comments
why it is okay to make this exception.
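
The kind of exception being discussed can be pictured with a simplified model of a per-process override on the "is WAL insertion allowed" check. This is a sketch under assumptions, not the server's actual code: DemoBackend and both functions are hypothetical names that only loosely mirror the idea behind LocalSetXLogInsertAllowed().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-process state for the toy model. */
typedef struct
{
    bool    recovery_in_progress;
    int     local_insert_allowed;   /* -1: not decided, 0: no, 1: yes */
} DemoBackend;

/*
 * WAL insertion is allowed if the process-local override says so;
 * otherwise it is allowed only once recovery has finished.
 */
static bool
demo_xlog_insert_allowed(DemoBackend *b)
{
    if (b->local_insert_allowed >= 0)
        return b->local_insert_allowed != 0;
    if (b->recovery_in_progress)
        return false;
    b->local_insert_allowed = 1;
    return true;
}

/*
 * Force the override on and return the previous value so the caller
 * can restore it after writing its record.
 */
static int
demo_local_set_insert_allowed(DemoBackend *b)
{
    int     old = b->local_insert_allowed;

    b->local_insert_allowed = 1;
    return old;
}
```

The model makes the concern concrete: while the override is on, the process can write WAL even though recovery has not been marked as done, so any such window needs a documented justification.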

--
With Regards,
Amit Kapila.



RE: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
"Hayato Kuroda (Fujitsu)"
Date:
Dear Sawada-san,

> 
> I think that the problem stems from the fact that the patch sets
> allow_status_change to true before completing all end-of-recovery
> actions for logical decoding status update. I think it should be done
> at the end of UpdateLogicalDecodingStatusEndOfRecovery(), i.e., after
> WaitForProcSignalBarrier(). That way, the logical decoding can always
> start from after the point where the startup updated the logical
> decoding status.

Assuming a fix like [1], it seems to work well. In the workload I shared,
the backend process cannot consume the ProcSignal barrier emitted by the startup
process, so the status change in shared memory is not allowed. The backend would
fail to create the replication slot and effective_wal_level would be replica.

[1]:
```
--- a/src/backend/replication/logical/logicalctl.c
+++ b/src/backend/replication/logical/logicalctl.c
@@ -532,13 +532,11 @@ UpdateLogicalDecodingStatusEndOfRecovery(void)
         * processes to write XLOG_LOGICAL_DECODING_STATUS_CHANGE records prior to
         * completing all end-of-recovery actions.
         */
-       LogicalDecodingCtl->allow_status_change = true;
-
-       LWLockRelease(LogicalDecodingControlLock);
 
        if (need_wal)
                CreateLogicalDecodingStatusChangeRecord(new_status);
 
+       LWLockRelease(LogicalDecodingControlLock);
        /*
         * Ensure all running processes have the updated status. We don't need to
         * wait for running transactions to finish as we don't accept any writes
@@ -550,5 +548,9 @@ UpdateLogicalDecodingStatusEndOfRecovery(void)
                WaitForProcSignalBarrier(
                        EmitProcSignalBarrier(PROCSIGNAL_BARRIER_UPDATE_XLOG_LOGICAL_INFO));
 
+       LWLockAcquire(LogicalDecodingControlLock, LW_EXCLUSIVE);
+       LogicalDecodingCtl->allow_status_change = true;
+       LWLockRelease(LogicalDecodingControlLock);
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED


RE: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
"Hayato Kuroda (Fujitsu)"
Date:
With the fix I proposed [1], I found that effective_wal_level can remain logical
after the promotion, even after the logical slots are dropped [2].
Steps:

0. Set up a streaming replication system, and both nodes had a replication slot
1. Attached to the startup and added a break at UpdateLogicalDecodingStatusEndOfRecovery
2. Sent a promotion request to the standby. Startup would stop
3. Established a connection to the standby.
4. Attached to the backend and added a break at ReplicationSlotDrop
5. Tried to drop the replication slot on the standby. Backend would stop
6. Moved the startup till WaitForProcSignalBarrier(). Note that allow_status_change
   was still off.
7. Detached from the backend process.
8. Detached from the startup process.

This can happen because UpdateLogicalDecodingStatusEndOfRecovery() decided to
keep wal_level logical, and the subsequent DisableLogicalDecodingIfNecessary()
call cannot disable it. allow_status_change would need to be true for this case.

I considered not releasing the lock while waiting for the ProcSignal barrier, but
then other processes could not read or update xlog_logical_info.
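
The interleaving in the steps above can be reproduced with a toy model. This is only a sketch: DemoState and the three functions are hypothetical names, and the model assumes that a slot drop disables logical decoding only when no logical slots remain and allow_status_change is already true.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical shared state for the toy model. */
typedef struct
{
    bool    logical_decoding_enabled;
    bool    allow_status_change;
    int     n_logical_slots;
} DemoState;

/* Startup: keep logical decoding enabled if any logical slot exists. */
static void
demo_startup_decide(DemoState *s)
{
    s->logical_decoding_enabled = (s->n_logical_slots > 0);
}

/* Startup: last end-of-recovery step, status changes are now allowed. */
static void
demo_startup_finish(DemoState *s)
{
    s->allow_status_change = true;
}

/* Backend: drop one slot, then try to disable logical decoding. */
static void
demo_backend_drop_slot(DemoState *s)
{
    if (s->n_logical_slots > 0)
        s->n_logical_slots--;

    /* The disable attempt is a no-op while changes are not allowed. */
    if (s->n_logical_slots == 0 && s->allow_status_change)
        s->logical_decoding_enabled = false;
}
```

Running decide, drop, finish in that order leaves logical decoding enabled with zero slots, matching the output in [2]; running decide, finish, drop disables it as expected.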

[1]:
https://www.postgresql.org/message-id/OSCPR01MB14966B8F6F728F3FB4B05BFDBF5E7A%40OSCPR01MB14966.jpnprd01.prod.outlook.com
[2]
```
postgres=# SELECT pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f
(1 row)

postgres=# SHOW effective_wal_level ;
 effective_wal_level
---------------------
 logical
(1 row)

postgres=# SELECT count(*) FROM pg_replication_slots ;
 count
-------
     0
(1 row)
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED




Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From
Masahiko Sawada
Date:
On Wed, Oct 1, 2025 at 10:48 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Wed, Oct 1, 2025 at 4:31 PM Hayato Kuroda (Fujitsu)
> <kuroda.hayato@fujitsu.com> wrote:
> >
> > Dear Sawada-san,
> >
> > > If we implement these ideas, we can simplify the patch quite well as
> > > we no longer need the lazy behavior nor wait for the recovery to
> > > complete. I've attached a PoC patch that can be applied on top of the
> > > v15 patch.
> >
> > In 0002, I found an assertion failure. Steps:
> >
> > 0. There is a streaming replication setup in which only the primary has a logical slot.
> > 1. Attached to the startup process and set a breakpoint at UpdateLogicalDecodingStatusEndOfRecovery.
> > 2. Sent a promote signal to the standby and confirmed that the startup process stopped.
> > 3. Established a new connection to the standby.
> > 4. Attached to the backend process and set a breakpoint at create_logical_replication_slot.
> > 5. Tried to create a new slot on the standby and confirmed that the backend stopped.
> > 6. Advanced the startup process to WaitForProcSignalBarrier().
> > 7. Advanced the backend process to WaitForProcSignalBarrier(). Both processes could proceed.
> > 8. Advanced the backend to ReplicationSlotReserveWal(), where restart_lsn was set.
> > 9. Detached from the startup process. The recovery state became "DONE".
> > 10. Detached from the backend. It then crashed at xlog_decode().
> >
> > Some data obtained with gdb is shown in [1].
> >
> > The direct cause is that the slot's restart_lsn points to a position before the STATUS_CHANGE(false) record.
> > Per my analysis, ReplicationSlotReserveWal() uses GetXLogReplayRecPtr(NULL) as the
> > initial decoding point, which is the last record the standby received from the primary.
> > However, the standby can generate an additional record, STATUS_CHANGE(false) in
> > this case. After recovery, the decoder reads that STATUS_CHANGE record,
> > which breaks our assumption.
> >
> > Per my understanding, this cannot happen with 0001 because EnsureLogicalDecodingEnabled()
> > waits till RecoveryInProgress() becomes false.
> >
> > How should we fix the issue? One approach is to remove the Assert() and ereport(ERROR),
> > but even in that case the slot may not be able to establish a consistent snapshot.
> >
>
> The other point to consider is that during promotion, after
> UpdateLogicalDecodingStatusEndOfRecovery(), there are multiple things
> that seem to be necessary to perform before backends are allowed to
> write. For example, refer to the comment: "If any of the critical GUCs
> have changed, log them before we allow backends to write WAL." I
> think the key thing is that before we set the state DB_IN_PRODUCTION in
> ControlFile and mark SharedRecoveryState as RECOVERY_STATE_DONE,
> backends shouldn't be allowed to write WAL. If we want to make an
> exception for writing WAL during slot creation before
> RECOVERY_STATE_DONE is set, we should analyze and explain in comments
> why it is okay to make this exception.

Agreed.

As the discussion is becoming more complex, let me summarize our
discussion about the delay_status_change flag and lazy behavior.

The delay_status_change flag was created to handle a specific timing
issue: there's a brief window where backend processes can
enable/disable logical decoding but cannot write the STATUS_CHANGE
record. This occurs because after the startup process updates the
logical decoding status (in
UpdateLogicalDecodingStatusEndOfRecovery()), backend processes cannot
write WAL records until the startup sets SharedRecoveryState to
RECOVERY_STATE_DONE. The idea is to delay any logical decoding status
changes during this window until WAL writing is permitted system-wide.
An alternative idea being discussed is to allow an exception for
STATUS_CHANGE records, letting them be written even during this
window. While this alternative is simpler and technically feasible, it
could be risky as it breaks the general rule that 'backends cannot
write WAL records until recovery completes.'
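
To make the window concrete, here is a minimal toy model in Python. It is only a sketch and not the patch's code: the class ToyServer, its fields, and the idea of queueing requests in a list are all hypothetical stand-ins, and the real patch may make the backend wait rather than queue. It shows a status change that arrives inside the window being deferred and then applied, together with its STATUS_CHANGE record, once recovery completes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyServer:
    """Toy model of the window between the end-of-recovery status update
    and RECOVERY_STATE_DONE. All names here are hypothetical."""
    recovery_done: bool = False           # True stands in for RECOVERY_STATE_DONE
    delay_status_change: bool = True      # set once the startup process has updated the status
    logical_decoding_enabled: bool = False
    pending: list = field(default_factory=list)   # deferred enable/disable requests
    wal: list = field(default_factory=list)       # toy WAL stream

    def request_status_change(self, enable: bool) -> bool:
        """Backend side: return True if applied now, False if deferred."""
        if self.delay_status_change:
            self.pending.append(enable)
            return False
        self._apply(enable)
        return True

    def _apply(self, enable: bool) -> None:
        # General rule from the discussion: no WAL from backends before recovery ends.
        if not self.recovery_done:
            raise RuntimeError("backends cannot write WAL before recovery completes")
        self.logical_decoding_enabled = enable
        self.wal.append(("STATUS_CHANGE", enable))

    def finish_recovery(self) -> None:
        """Startup side: close the window, then apply whatever was deferred."""
        self.recovery_done = True
        self.delay_status_change = False
        for enable in self.pending:
            self._apply(enable)
        self.pending.clear()


server = ToyServer()
applied_now = server.request_status_change(True)   # arrives inside the window
server.finish_recovery()                           # deferred request is applied here
```

The exception-based alternative would correspond to dropping the recovery_done check in _apply() for STATUS_CHANGE records only, which is exactly the rule-breaking that makes it risky.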

When a process exits or raises an ERROR, it needs to clean
up its temporary and ephemeral slots, which might disable logical
decoding. This deactivation may involve waiting, either for
concurrent activation/deactivation to finish or because of the
delay_status_change flag (if implemented). However, waiting during user-level
cleanup (in before_shmem_exit callbacks) isn't ideal since the process
blocks all interrupts there. To address this, we introduced the lazy behavior,
which delegates the deactivation to the checkpointer, allowing
it to disable logical decoding asynchronously. This way, the
deactivation during user-level cleanup only needs to disable logical
decoding in shared memory and send signals.
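
As a rough illustration of the lazy behavior, the following Python sketch splits deactivation into a non-waiting part done at backend exit and a deferred part done by the checkpointer. Everything in it is an assumption made for illustration: SharedState, pending_disable, backend_exit_cleanup() and checkpointer_iteration() are hypothetical names, and which steps the checkpointer actually performs (here, writing the STATUS_CHANGE record) is my simplification rather than something taken from the patch.

```python
from dataclasses import dataclass, field


@dataclass
class SharedState:
    """Hypothetical stand-in for the shared-memory logical decoding state."""
    logical_decoding_enabled: bool = True
    pending_disable: bool = False                 # "checkpointer, please finish the job"
    wal: list = field(default_factory=list)       # toy WAL stream


def backend_exit_cleanup(shared: SharedState, remaining_logical_slots: int) -> None:
    """Runs with interrupts blocked, so it must never wait."""
    if remaining_logical_slots == 0 and shared.logical_decoding_enabled:
        shared.logical_decoding_enabled = False   # flip the flag in shared memory only
        shared.pending_disable = True             # stands in for signaling the checkpointer


def checkpointer_iteration(shared: SharedState) -> bool:
    """Runs in the checkpointer, where waiting is acceptable.
    Returns True if it completed a pending deactivation."""
    if not shared.pending_disable:
        return False
    # A real implementation could wait here for concurrent status changes.
    shared.wal.append(("STATUS_CHANGE", False))
    shared.pending_disable = False
    return True


shared = SharedState()
backend_exit_cleanup(shared, remaining_logical_slots=0)   # last slot was dropped at exit
finished = checkpointer_iteration(shared)                 # later, asynchronously
```

The point of the split is that the exiting backend does only constant-time work, while anything that may block is moved to a process that can service interrupts.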

We've discussed that if we don't adopt the delay_status_change flag, we
don't need the lazy behavior either. However, I find that
we still need the lazy behavior to handle waits during concurrent status
changes. Moreover, since we need the lazy behavior anyway, the benefit of
implementing the exception-based approach seems limited.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com