Обсуждение: Invalid primary_slot_name triggers warnings in all processes on reload
Hi, While reviewing the patch at [1], I noticed that if primary_slot_name is set to an invalid slot name in postgresql.conf and the configuration file is reloaded, all running postgres processes emit the WARNING message as follows. Isn't this a bug? To fix this issue, GUC_check_errmsg() should be used instead of ereport() when an invalid slot name is found in the setting. Thoughts? ------------------------------------ $ initdb -D data $ pg_ctl -D data start $ psql <<EOF ALTER SYSTEM SET log_line_prefix TO '[%b] '; SELECT pg_reload_conf(); \! echo "primary_slot_name = 'invalid-'" >> data/postgresql.conf SELECT pg_reload_conf(); EOF [postmaster] WARNING: replication slot name "invalid-" contains invalid character [postmaster] HINT: Replication slot names may only contain lower case letters, numbers, and the underscore character. [postmaster] LOG: invalid value for parameter "primary_slot_name": "invalid-" [postmaster] LOG: configuration file "/System/Volumes/Data/dav/head-pgsql/data/postgresql.conf" contains errors; unaffected changes were applied [logical replication launcher] WARNING: replication slot name "invalid-" contains invalid character [logical replication launcher] HINT: Replication slot names may only contain lower case letters, numbers, and the underscore character. [checkpointer] WARNING: replication slot name "invalid-" contains invalid character [checkpointer] HINT: Replication slot names may only contain lower case letters, numbers, and the underscore character. [walwriter] WARNING: replication slot name "invalid-" contains invalid character [walwriter] HINT: Replication slot names may only contain lower case letters, numbers, and the underscore character. hrk:head-pgsql postgres$ [background writer] WARNING: replication slot name "invalid-" contains invalid character [background writer] HINT: Replication slot names may only contain lower case letters, numbers, and the underscore character. [io worker] WARNING: replication slot name "invalid-" contains invalid character [io worker] HINT: Replication slot names may only contain lower case letters, numbers, and the underscore character. [io worker] WARNING: replication slot name "invalid-" contains invalid character [io worker] HINT: Replication slot names may only contain lower case letters, numbers, and the underscore character. [autovacuum launcher] WARNING: replication slot name "invalid-" contains invalid character [autovacuum launcher] HINT: Replication slot names may only contain lower case letters, numbers, and the underscore character. [io worker] WARNING: replication slot name "invalid-" contains invalid character [io worker] HINT: Replication slot names may only contain lower case letters, numbers, and the underscore character. ------------------------------------ Regards, [1] http://postgr.es/m/CANhcyEUUWJ7XBycqFuqQg=eZ_==KT6S7E+rdMy-GE23oLa8tmQ@mail.gmail.com -- Fujii Masao
On Fri, Sep 12, 2025 at 9:12 PM Fujii Masao <masao.fujii@gmail.com> wrote: > > Hi, > > While reviewing the patch at [1], I noticed that if primary_slot_name is > set to an invalid slot name in postgresql.conf and the configuration file > is reloaded, all running postgres processes emit the WARNING message > as follows. Isn't this a bug? > > To fix this issue, GUC_check_errmsg() should be used instead of ereport() > when an invalid slot name is found in the setting. Thoughts? Another simple fix would be to have processes other than the postmaster report invalid primary_slot_name at DEBUG3 instead of WARNING. In that case, only the postmaster reports it at WARNING, so by default only that message appears in the log file. This matches the behavior for other GUC parameters with invalid settings. I've attached a patch implementing this approach. Thought? -- Fujii Masao
Вложения
On Thu, Sep 18, 2025 at 10:54 PM Fujii Masao <masao.fujii@gmail.com> wrote: > > On Fri, Sep 12, 2025 at 9:12 PM Fujii Masao <masao.fujii@gmail.com> wrote: > > > > Hi, > > > > While reviewing the patch at [1], I noticed that if primary_slot_name is > > set to an invalid slot name in postgresql.conf and the configuration file > > is reloaded, all running postgres processes emit the WARNING message > > as follows. Isn't this a bug? > > > > To fix this issue, GUC_check_errmsg() should be used instead of ereport() > > when an invalid slot name is found in the setting. Thoughts? > > Another simple fix would be to have processes other than the postmaster > report invalid primary_slot_name at DEBUG3 instead of WARNING. > In that case, only the postmaster reports it at WARNING, so by default > only that message appears in the log file. This matches the behavior for > other GUC parameters with invalid settings. > > I've attached a patch implementing this approach. Thought? This approach unexpectedly suppresses the log message about an invalid primary_slot_name when it's set with ALTER SYSTEM SET, so it isn't suitable. Instead, we should use GUC_check_xxx() functions for error reporting in the GUC check hook. I've attached a patch that updates the check hook for primary_slot_name to use GUC_check_errdetail() and GUC_check_errhint(). As with other GUC parameters, this makes non-postmaster processes log at DEBUG3, so by default only the postmaster's message appears in the log file. Regards, -- Fujii Masao
Вложения
On Sep 19, 2025, at 00:48, Fujii Masao <masao.fujii@gmail.com> wrote:Fujii Masao
<v2-0001-Make-invalid-primary_slot_name-follow-standard-GU.patch>
+error:
+ if (elevel == 0)
+ {
+ GUC_check_errdetail("%s", err_msg);
+ if (err_hint != NULL)
+ GUC_check_errhint("%s", err_hint);
+ }
+ else
+ ereport(elevel,
+ (errcode(err_code),
+ errmsg("%s", err_msg),
+ (err_hint != NULL) ? errhint("%s", err_hint) : 0));
+
+ return false;
```
Do we need to free memory pointed by err_msg and err_hint that were allocated from psprintf()? The code comment of psprintf() says caller is responsible for free the memory.
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
HighGo Software Co., Ltd.
https://www.highgo.com/
On Fri, Sep 19, 2025 at 8:21 AM Chao Li <li.evan.chao@gmail.com> wrote: > > > > On Sep 19, 2025, at 00:48, Fujii Masao <masao.fujii@gmail.com> wrote: > > Fujii Masao > <v2-0001-Make-invalid-primary_slot_name-follow-standard-GU.patch> > > > ``` > +error: > + if (elevel == 0) > + { > + GUC_check_errdetail("%s", err_msg); > + if (err_hint != NULL) > + GUC_check_errhint("%s", err_hint); > + } > + else > + ereport(elevel, > + (errcode(err_code), > + errmsg("%s", err_msg), > + (err_hint != NULL) ? errhint("%s", err_hint) : 0)); > + > + return false; > ``` > > Do we need to free memory pointed by err_msg and err_hint that were allocated from psprintf()? The code comment of psprintf()says caller is responsible for free the memory. Thanks for the review! I had thought that since the short-lived "config file processing" memory context is used when processing the configuration file and then freed afterward, there was no need for ReplicationSlotValidateName() to call pfree(). However, when it's called by StartupReorderBuffer() with elevel = DEBUG2, allocations seem to be made in TopMemoryContext, so they could persist much longer. So I agree it's safer to free them explicitly. In the attached updated patch, ReplicationSlotValidateName() now pfrees err_msg and err_hint when needed. Regards, -- Fujii Masao
Вложения
On Fri, Sep 19, 2025 at 12:00 PM Fujii Masao <masao.fujii@gmail.com> wrote: > > So I agree it's safer to free them explicitly. In the attached updated patch, > ReplicationSlotValidateName() now pfrees err_msg and err_hint when needed. > +error: + if (elevel == 0) + { + GUC_check_errdetail("%s", err_msg); + if (err_hint != NULL) + GUC_check_errhint("%s", err_hint); I see that other places use GUC_check_errcode. See check_synchronous_standby_names. So, shouldn't we use it here as well? I don't see any other place distinguishing GUC related errors in this way. It seems the other way to differentiate throwing errors for GUC related messages is used in call_string_check_hook and friends. It may not be used as it is but it can give some ideas to explore. I have not explored in detail so it may not be relevant here but it is worth checking once. -- With Regards, Amit Kapila.
Hello, > /* > - * Check whether the passed slot name is valid and report errors at elevel. > + * Check whether the passed slot name is valid and report errors. > + * > + * When called from a GUC check hook, elevel must be 0. In that case, > + * errors are reported as additional details or hints using > + * GUC_check_errdetail() and GUC_check_errhint(). Otherwise, elevel > + * must be nonzero, and errors are reported at that log level with ereport(). I find this an odd calling convention; I don't think we use it anywhere else and I wonder if we shouldn't split this up in two functions calling a third one that returns the string messages, to avoid the oddity. I'd do something like bool ReplicationSlotValidateName(const char *name, bool allow_reserved_name, char **err_msg, char **err_hint) and then if this returns false, err_msg and err_hint have been populated and can be used to throw the error or added as GUC_check_* arguments, or just ignored. > +error: > + if (elevel == 0) > + { > + GUC_check_errdetail("%s", err_msg); > + if (err_hint != NULL) > + GUC_check_errhint("%s", err_hint); > + } > + else > + ereport(elevel, > + (errcode(err_code), > + errmsg("%s", err_msg), > + (err_hint != NULL) ? errhint("%s", err_hint) : 0)); Please note that since you're using already translated strings as arguments, you must use errmsg_internal() and errhint_internal(), to avoid doubly-translating these messages. > + pfree(err_msg); > + if (err_hint != NULL) > + pfree(err_hint); > + > + return false; Not sure how valuable pfree'ing these really is ... I suspect this is only called in contexts that aren't sensitive to a few bytes of transient leakage. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "El hombre nunca sabe de lo que es capaz hasta que lo intenta" (C. Dickens)
On Wed, Sep 24, 2025 at 12:11 AM Álvaro Herrera <alvherre@kurilemu.de> wrote: > > Hello, > > > > /* > > - * Check whether the passed slot name is valid and report errors at elevel. > > + * Check whether the passed slot name is valid and report errors. > > + * > > + * When called from a GUC check hook, elevel must be 0. In that case, > > + * errors are reported as additional details or hints using > > + * GUC_check_errdetail() and GUC_check_errhint(). Otherwise, elevel > > + * must be nonzero, and errors are reported at that log level with ereport(). > > I find this an odd calling convention; I don't think we use it anywhere > else and I wonder if we shouldn't split this up in two functions calling > a third one that returns the string messages, to avoid the oddity. I'd > do something like > > bool > ReplicationSlotValidateName(const char *name, bool allow_reserved_name, > char **err_msg, char **err_hint) > > and then if this returns false, err_msg and err_hint have been populated > and can be used to throw the error or added as GUC_check_* arguments, or > just ignored. Yeah, that's an option. If we go with it, we'd probably need to return the error code as well. Since it looks better, I'll consider updating the patch accordingly. BTW, about validating primary_slot_name in the check hook: I'm not sure it's really worthwhile. Even without this validation, an invalid slot name will raise an error anyway, and this validation can't catch all invalid cases. So its value seems limited. If the check hook doesn't need to do this validation, then ReplicationSlotValidateName() doesn't need to be updated for that purpose. > > +error: > > + if (elevel == 0) > > + { > > + GUC_check_errdetail("%s", err_msg); > > + if (err_hint != NULL) > > + GUC_check_errhint("%s", err_hint); > > + } > > + else > > + ereport(elevel, > > + (errcode(err_code), > > + errmsg("%s", err_msg), > > + (err_hint != NULL) ? errhint("%s", err_hint) : 0)); > > Please note that since you're using already translated strings as > arguments, you must use errmsg_internal() and errhint_internal(), to > avoid doubly-translating these messages. Right, thanks for pointing that out. > > + pfree(err_msg); > > + if (err_hint != NULL) > > + pfree(err_hint); > > + > > + return false; > > Not sure how valuable pfree'ing these really is ... I suspect this is > only called in contexts that aren't sensitive to a few bytes of > transient leakage. I initially thought the same since the short-lived config file processing context is used when handling the config file. But when ReplicationSlotValidateName() is called outside the GUC check hook, seems allocations can end up in TopMemoryContext. Even though the memory use is small, it's safer to call pfree(), and the overhead is negligible. So I didn't see a strong reason to skip it.... Regards, -- Fujii Masao