Discussion: Unexpected Standby Shutdown on sync_replication_slots change
Hello,
I'm not sure if it's a bug, but I've encountered unexpected behavior when dynamically changing the sync_replication_slots
parameter on a PostgreSQL 17 standby server. Instead of logging an error and continuing to run, the standby instance shuts down with a FATAL
error, which is not the anticipated behavior for a dynamic parameter change, especially when the documentation doesn't indicate such an outcome.
Steps to Reproduce

1. Set up physical replication between two PostgreSQL 17.5 instances.
2. Ensure wal_level on the primary (and consequently on the standby) is set to replica.
3. Start both the primary and standby instances, confirming replication is active.
4. On the standby instance, dynamically change the sync_replication_slots parameter (I ran ALTER SYSTEM SET sync_replication_slots = 'on'; followed by SELECT pg_reload_conf();).
Expected Behavior

I expected the standby instance to continue running and log an error message (similar to how hot_standby_feedback behaves when not enabled, e.g., a loop of LOG: replication slot synchronization requires "hot_standby_feedback" to be enabled). A FATAL error leading to an unexpected shutdown for a dynamic parameter change on a running standby is not the anticipated behavior. The documentation for sync_replication_slots also doesn't indicate that a misconfiguration or an incompatible wal_level would lead to a shutdown.
Actual Behavior

Upon attempting to set sync_replication_slots to on on the standby with wal_level set to replica, the standby instance immediately shuts down with the following log messages:

LOG: database system is ready to accept read-only connections
LOG: started streaming WAL from primary at 0/3000000 on timeline 1
LOG: received SIGHUP, reloading configuration files
LOG: parameter "sync_replication_slots" changed to "on"
FATAL: replication slot synchronization requires "wal_level" >= "logical"
Environment
PostgreSQL Version: 17.5
On Thu, Jul 24, 2025 at 10:54 PM Hugo DUBOIS <hdubois@scaleway.com> wrote:
> [...]

Thanks for the report!

I was able to reproduce the issue even on the latest master (v19dev).
I agree that the current behavior—where changing a GUC parameter can
cause the server to shut down—is unexpected and should be avoided.

From what I've seen in the code, the problem stems from postmaster
calling ValidateSlotSyncParams() before starting the slot sync worker.
That function raises an ERROR if wal_level is not logical while
sync_replication_slots is enabled. Since ERROR is treated as FATAL
in postmaster, it causes the server to exit.

To fix this, we could modify ValidateSlotSyncParams() so it doesn't
raise an ERROR in this case, as follows.

ValidateSlotSyncParams(int elevel)
{
        /*
         * Logical slot sync/creation requires wal_level >= logical.
-        *
-        * Since altering the wal_level requires a server restart, so error out in
-        * this case regardless of elevel provided by caller.
         */
        if (wal_level < WAL_LEVEL_LOGICAL)
-               ereport(ERROR,
+       {
+               ereport(elevel,
                                errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                                errmsg("replication slot synchronization requires \"wal_level\" >= \"logical\""));
+               return false;
+       }

Regards,

--
Fujii Masao
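To make the call pattern above concrete, here is a minimal, self-contained C sketch. It is illustrative only: the function name check_slot_sync_config, the elevel values, and the callers are hypothetical stand-ins for the PostgreSQL code being discussed, not the actual source.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Illustrative elevel values; only their ordering matters here. */
enum { LOG = 15, ERROR = 21 };

#define WAL_LEVEL_LOGICAL 1

static int wal_level = 0;       /* 0 models "replica", 1 models "logical" */

/*
 * Hypothetical analogue of ValidateSlotSyncParams(): report at the
 * caller-provided elevel and return false, instead of unconditionally
 * raising ERROR.  In the real server, an ERROR raised in the postmaster
 * is promoted to FATAL and takes the whole instance down.
 */
static bool
check_slot_sync_config(int elevel)
{
    if (wal_level < WAL_LEVEL_LOGICAL)
    {
        fprintf(stderr,
                "%s: replication slot synchronization requires \"wal_level\" >= \"logical\"\n",
                elevel >= ERROR ? "ERROR" : "LOG");
        if (elevel >= ERROR)
            exit(1);            /* models the process-fatal outcome */
        return false;
    }
    return true;
}

int
main(void)
{
    /* Postmaster-style caller: log the problem and keep the server running. */
    if (!check_slot_sync_config(LOG))
        printf("postmaster: slot sync worker not started; server keeps running\n");

    /* Worker/SQL-function-style caller: the misconfiguration is an error. */
    check_slot_sync_config(ERROR);      /* exits here with wal_level = replica */
    return 0;
}

Run as-is, the first call logs and continues while the second exits; setting wal_level to 1 makes both pass.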
On Fri, Jul 25, 2025 at 12:55 AM Fujii Masao <masao.fujii@gmail.com> wrote:
> [...]
> To fix this, we could modify ValidateSlotSyncParams() so it doesn't
> raise an ERROR in this case, as follows.

I've created a patch to implement the above—attached.

Note that this patch does not change the existing behavior when the misconfiguration (sync_replication_slots enabled but wal_level not set to logical) is detected at server startup. In that case, the server still shuts down with a FATAL error, which is consistent with other settings like summarize_wal.

Regards,

--
Fujii Masao
On Fri, Jul 25, 2025 at 12:20 AM Fujii Masao <masao.fujii@gmail.com> wrote:
> [...]
> I've created a patch to implement the above—attached.

Thank you for the patch.

> Note that this patch does not change the existing behavior when
> the misconfiguration (sync_replication_slots enabled but wal_level not
> set to logical) is detected at server startup. In that case, the server
> still shuts down with a FATAL error, which is consistent with other
> settings like summarize_wal.

Validated the behaviour; the patch looks good to me.

thanks
Shveta
Thanks for the quick patch. I've tested it on the REL_17_STABLE branch, and it's working fine.

On a standby node, after dynamically setting the sync_replication_slots parameter, I observed the following logs. The instance did not shut down, which seems correct:

2025-07-25 09:26:34.613 UTC [4420] LOG: received SIGHUP, reloading configuration files
2025-07-25 09:26:34.614 UTC [4420] LOG: parameter "sync_replication_slots" changed to "on"
2025-07-25 09:26:34.615 UTC [4420] LOG: replication slot synchronization requires "wal_level" >= "logical"
2025-07-25 09:27:34.662 UTC [4420] LOG: replication slot synchronization requires "wal_level" >= "logical"

As expected, the instance did not start after a restart with the misconfiguration still in place, showing this FATAL log:

2025-07-25 09:27:45.668 UTC [4430] FATAL: replication slot synchronization ("sync_replication_slots" = on) requires "wal_level" >= "logical"
I have a couple of observations:

With this patch, a primary instance will not restart if the configuration is incorrect.

Only wal_level is checked, but the ValidateSlotSyncParams function includes other mandatory parameters. These are not being checked during startup.
On Fri, Jul 25, 2025 at 12:20 AM Fujii Masao <masao.fujii@gmail.com> wrote:
> [...]
Regards,

Hugo DUBOIS
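The one-minute cadence in the standby logs above suggests a periodic retry. As a rough C sketch of that shape (hypothetical names; the real retry interval and launch logic live in the postmaster, and this is not the actual source):

#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

#define WAL_LEVEL_LOGICAL 1

static bool sync_replication_slots = true;  /* reloadable GUC in the real server */
static int  wal_level = 0;                  /* restart-only GUC; 0 models "replica" */

/* Hypothetical LOG-level check: report the problem and return false. */
static bool
slot_sync_config_ok(void)
{
    if (wal_level < WAL_LEVEL_LOGICAL)
    {
        printf("LOG: replication slot synchronization requires \"wal_level\" >= \"logical\"\n");
        return false;
    }
    return true;
}

int
main(void)
{
    /*
     * Each periodic attempt to launch the slot sync worker re-checks the
     * configuration, logs the problem, and skips the launch; the server
     * itself keeps running.  Three iterations stand in for the repeating
     * log lines; the real interval is much longer than one second.
     */
    for (int attempt = 0; attempt < 3; attempt++)
    {
        if (sync_replication_slots && !slot_sync_config_ok())
            ;   /* worker not started this cycle */
        sleep(1);
    }
    return 0;
}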
On Fri, Jul 25, 2025 at 6:40 PM Hugo DUBOIS <hdubois@scaleway.com> wrote:
>
> Thanks for the quick patch. I've tested it on the REL_17_STABLE branch, and it's working fine.
>
> On a standby node, after dynamically setting the sync_replication_slots parameter, I observed the following logs. The instance did not shut down, which seems correct:
>
> 2025-07-25 09:26:34.613 UTC [4420] LOG: received SIGHUP, reloading configuration files
> 2025-07-25 09:26:34.614 UTC [4420] LOG: parameter "sync_replication_slots" changed to "on"
> 2025-07-25 09:26:34.615 UTC [4420] LOG: replication slot synchronization requires "wal_level" >= "logical"
> 2025-07-25 09:27:34.662 UTC [4420] LOG: replication slot synchronization requires "wal_level" >= "logical"
>
> As expected, the instance did not start after a restart with the misconfiguration still in place, showing this FATAL log:
>
> 2025-07-25 09:27:45.668 UTC [4430] FATAL: replication slot synchronization ("sync_replication_slots" = on) requires "wal_level" >= "logical"
Thanks for testing!
> I have a couple of observations:
>
> With this patch, a primary instance will not restart if the configuration is incorrect.
Right. In HEAD, the server shuts down due to the misconfiguration only when
starting up as a standby and entering hot standby mode. With this patch,
the server shuts down earlier—during startup—even when starting as a primary.
If starting as a standby, the shutdown now happens before entering hot standby.
It seems hard to match the current behavior exactly in the patch, and I’m not
sure it’s worth trying. I think we have two options:
#1. Make the server always shut down on misconfiguration, regardless of
whether it starts as a primary or standby, and before reaching hot standby.
#2. Allow the server to start even if the configuration is invalid.
The current patch follows option #1. But if there are valid use cases for
starting a primary with sync_replication_slots enabled and wal_level not set
to logical, then option #2 might be better. What do you think?
> Only wal_level is checked, but the ValidateSlotSyncParams function includes other mandatory parameters. These are not being checked during startup.
The other parameters checked by ValidateSlotSyncParams() can be changed
without restarting the server, so users can fix them on the fly while
the server is running with the misconfiguration. But wal_level requires
a server restart, so it's important to catch that upfront.
Regards,
--
Fujii Masao
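For concreteness, a minimal C sketch of the shape option #1 could take, assuming a single startup-time check that runs on both primaries and standbys before any worker launches (names and placement are hypothetical, not the patch itself):

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define WAL_LEVEL_LOGICAL 1

static bool sync_replication_slots = true;
static int  wal_level = 0;              /* restart-only; 0 models "replica" */

/*
 * Option #1: validate the restart-only combination once, during startup,
 * regardless of whether the server comes up as a primary or a standby.
 * A failure aborts startup before hot standby is reached.
 */
static void
validate_startup_config(void)
{
    if (sync_replication_slots && wal_level < WAL_LEVEL_LOGICAL)
    {
        fprintf(stderr,
                "FATAL: replication slot synchronization (\"sync_replication_slots\" = on) "
                "requires \"wal_level\" >= \"logical\"\n");
        exit(1);
    }
}

int
main(void)
{
    validate_startup_config();  /* exits here under the misconfiguration */
    printf("LOG: database system is ready to accept connections\n");
    return 0;
}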
On Fri, Jul 25, 2025 at 9:13 PM Hugo DUBOIS <hdubois@scaleway.com> wrote:
> I'm not sure if there's a particular use case for wal_level and sync_replication_slots not matching on a primary. So, for me, Option 1 seems correct.
I also prefer option #1.
However, on second thought, if some users are already running a server
(non-standby) with sync_replication_slots enabled and wal_level != logical
in v17, switching to option #1 could break their setup after a minor
version update. That would be surprising and confusing.
To avoid that, I think we should go with option #2—at least for v17.
Attached is an updated patch implementing option #2.
Regards,
--
Fujii Masao
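By contrast, option #2 reduces the startup check to a warning, so an existing v17 setup keeps starting after a minor update. A sketch under the same hypothetical names as above:

#include <stdbool.h>
#include <stdio.h>

#define WAL_LEVEL_LOGICAL 1

static bool sync_replication_slots = true;
static int  wal_level = 0;              /* 0 models "replica" */

/*
 * Option #2: the misconfiguration is only logged at startup and the
 * server comes up anyway.  The user can later disable
 * sync_replication_slots on the fly, or restart with wal_level =
 * logical, to clear the condition.
 */
int
main(void)
{
    if (sync_replication_slots && wal_level < WAL_LEVEL_LOGICAL)
        printf("LOG: replication slot synchronization requires \"wal_level\" >= \"logical\"\n");

    printf("LOG: database system is ready to accept connections\n");
    return 0;
}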
On Fri, Jul 25, 2025 at 9:31 PM Fujii Masao <masao.fujii@gmail.com> wrote:
> [...]
> However, on second thought, if some users are already running a server
> (non-standby) with sync_replication_slots enabled and wal_level != logical
> in v17, switching to option #1 could break their setup after a minor
> version update. That would be surprising and confusing.
>
> To avoid that, I think we should go with option #2—at least for v17.

I still prefer option #1. On HEAD, option #1 makes sense because wal_level is a GUC that requires a server restart, and thus using ERROR/FATAL in this context is appropriate. It is also consistent with the rest of the code (other modules) wherever wal_level checks are performed during startup.

Regarding the back branches, in the case of a primary server, if sync_replication_slots is left ON while wal_level < logical, then issuing an ERROR/FATAL is still more suitable. The user can simply correct the configuration (disable sync_replication_slots) and proceed with the startup after the minor version upgrade. Also, given that option #1 is the better fit for HEAD, I don't think it's worth having different behavior in the back branches. Thoughts?

thanks
Shveta
On Mon, Jul 28, 2025 at 10:18 AM shveta malik <shveta.malik@gmail.com> wrote:
> [...]
> Also, given that option #1 is the better fit for HEAD, I don't think
> it's worth having different behavior in the back branches.

+1.

--
With Regards,
Amit Kapila.
On Mon, Jul 28, 2025 at 6:55 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> [...]
> +1.

I think it's basically not acceptable for a server to start up successfully before a minor version update, but then fail to start with the same configuration after the update. If that's absolutely necessary to fix a bug, it might be justifiable. But in this case, I don't think it's required.

Blocking startup when sync_replication_slots is enabled and wal_level is not logical could be helpful. But that feels more like an improvement than a bug fix. I'm fine adding that to master, but I don't think we should apply it to old branches.

That said, both Shveta and Amit support backpatching this change, so I'd like to hear more opinions before we decide.

Regards,

--
Fujii Masao
On Tue, 2025-07-29 at 13:40 +0900, Fujii Masao wrote:
> [...]
> That said, both Shveta and Amit support backpatching this change,
> so I'd like to hear more opinions before we decide.

I side with you on that one.

I think it would be fine to backpatch the breaking change to v18, if we mention the behavior change in the release notes.

Yours,
Laurenz Albe
On Tue, Jul 29, 2025 at 12:04 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> [...]
> I think it would be fine to backpatch the breaking change to v18,
> if we mention the behavior change in the release notes.

I am fine with the suggestion.

thanks
Shveta
Hi,

On Wed, Jul 30, 2025 at 11:15 AM shveta malik <shveta.malik@gmail.com> wrote:
> [...]
> I am fine with the suggestion.

+1

This fix feels more like an enhancement to user feedback and safety nets than a correction of a broken or unsafe code path. Failing to start up with the same settings might cause more surprise than necessary.

Best,
Xuneng
On Sat, Aug 2, 2025 at 1:28 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
> This fix feels more like an enhancement to user feedback and safety
> nets than a correction of a broken or unsafe code path. Failing to
> start up with the same settings might cause more surprise than
> necessary.

Agreed.

As a first step, since there seems to be rough consensus on applying the current patch (which prevents unexpected shutdown while the server is running), I've pushed it to master and back-patched it to v17.

As for the follow-up change that prevents the server from starting with an invalid configuration, let's continue the discussion. I've attached the patch and agree it's a good fit for master. However, I'm afraid it's too late to include it in v18, as beta2 has already been released and this change is more of an improvement than a bug fix.

Regards,

--
Fujii Masao
On Mon, Aug 4, 2025 at 5:29 PM Fujii Masao <masao.fujii@gmail.com> wrote:
> [...]
> As for the follow-up change that prevents the server from starting with
> an invalid configuration, let's continue the discussion. I've attached
> the patch and agree it's a good fit for master.

+1. The patch looks good.

> However, I'm afraid
> it's too late to include it in v18, as beta2 has already been released
> and this change is more of an improvement than a bug fix.

If it's not possible to get it into v18 now, then I am okay to have it on master alone.

thanks
Shveta
On Tue, Aug 5, 2025 at 10:38 AM shveta malik <shveta.malik@gmail.com> wrote:
> [...]
> If it's not possible to get it into v18 now, then I am okay to have it
> on master alone.

Just wanted to confirm: do we plan to do it on master at least?

thanks
Shveta
On Wed, Aug 20, 2025 at 7:16 PM shveta malik <shveta.malik@gmail.com> wrote:
> Just wanted to confirm: do we plan to do it on master at least?

Yeah, barring any objections, I'll push the patch to master.

Regards,

--
Fujii Masao
On Thu, Aug 21, 2025 at 7:41 AM Fujii Masao <masao.fujii@gmail.com> wrote:
> [...]
> Yeah, barring any objections, I'll push the patch to master.

Okay, thanks!
On Thu, Aug 21, 2025 at 1:29 PM shveta malik <shveta.malik@gmail.com> wrote:
> [...]
> Okay, thanks!

I've pushed the patch to master. Thanks!

Regards,

--
Fujii Masao