Обсуждение: pg_createsubscriber --dry-run logging concerns

Поиск
Список
Период
Сортировка

pg_createsubscriber --dry-run logging concerns

От
Peter Smith
Дата:
Lately, I've been reviewing some pg_createsubscriber patches and have
been tricked by some of the logging.

The pg_createsubscriber has a '--dry-run' option/mode [1]

----------
--dry-run
Do everything except actually modifying the target directory.
----------

I've noticed that the logging in '--dry-run' mode is indistinguishable
from the logging of "normal" run, although functions like
create_publication(), drop_publication(), etc, are NOP in '--dry-run'
mode, because the actual command execution code is skipped. e.g.

if (!dry_run)
{
  res = PQexec(conn, str->data);
  ...
}

~~~

IMO, it's not good to fool people into thinking something has happened
when in fact nothing happened at all. I think the logging of this tool
should be much more explicit wrt when it is/isn't in dry-run mode.
Perhaps like this:

NORMAL
pg_log_info("creating publication \"%s\" in database \"%s\"", ...)

DRY-RUN
pg_log_info("[dry-run] would create publication \"%s\" in database \"%s\"", ...)

~~~

Thoughts?

======
[1] https://www.postgresql.org/docs/current/app-pgcreatesubscriber.html

Kind Regards,
Peter Smith.
Fujitsu Australia



RE: pg_createsubscriber --dry-run logging concerns

От
"Hayato Kuroda (Fujitsu)"
Дата:
Dear Peter,

> IMO, it's not good to fool people into thinking something has happened
> when in fact nothing happened at all. I think the logging of this tool
> should be much more explicit wrt when it is/isn't in dry-run mode.
> Perhaps like this:
> 
> NORMAL
> pg_log_info("creating publication \"%s\" in database \"%s\"", ...)
> 
> DRY-RUN
> pg_log_info("[dry-run] would create publication \"%s\" in database \"%s\"", ...)

Per my understanding, almost all the output must be adjusted based on the mode, right?
I feel it introduces a burden.
Can we solve the issue if we print additional message at the beginning if the
command runs with dry-run mode?

Best regards,
Hayato Kuroda
FUJITSU LIMITED 


Re: pg_createsubscriber --dry-run logging concerns

От
Peter Smith
Дата:
On Wed, Oct 1, 2025 at 1:09 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
>
> Dear Peter,
>
> > IMO, it's not good to fool people into thinking something has happened
> > when in fact nothing happened at all. I think the logging of this tool
> > should be much more explicit wrt when it is/isn't in dry-run mode.
> > Perhaps like this:
> >
> > NORMAL
> > pg_log_info("creating publication \"%s\" in database \"%s\"", ...)
> >
> > DRY-RUN
> > pg_log_info("[dry-run] would create publication \"%s\" in database \"%s\"", ...)
>
> Per my understanding, almost all the output must be adjusted based on the mode, right?
> I feel it introduces a burden.
> Can we solve the issue if we print additional message at the beginning if the
> command runs with dry-run mode?
>

Hi Kuroda-san,

Yes, that is one way. Something is better than nothing, at least...

I think that not *everything* in dry mode is fake; some of the logged
operations are real. So, it might be good if we can show fake ones
differently. e.g. It may not take much effort to introduce a wrapper
that inserts a prefix. Use this as a replacement for just a few
specific info logs.

(code below may not work; it's just for illustrative purposes)

#define pg_log_info_checkdry(...) do {\
  if (dry_run)\
    pg_log_generic(PG_LOG_INFO, PG_LOG_PRIMARY, "[dry-run NOP]" __VA_ARGS__);\
  else;\
    pg_log_generic(PG_LOG_INFO, PG_LOG_PRIMARY, __VA_ARGS__);\
  } while (0);

======
Kind Regards,
Peter Smith.
Fujitsu Australia



Re: pg_createsubscriber --dry-run logging concerns

От
Álvaro Herrera
Дата:
On 2025-Oct-01, Peter Smith wrote:

> (code below may not work; it's just for illustrative purposes)
> 
> #define pg_log_info_checkdry(...) do {\
>   if (dry_run)\
>     pg_log_generic(PG_LOG_INFO, PG_LOG_PRIMARY, "[dry-run NOP]" __VA_ARGS__);\
>   else;\
>     pg_log_generic(PG_LOG_INFO, PG_LOG_PRIMARY, __VA_ARGS__);\
>   } while (0);

I like this kind of idea best.  However I think it might be better to do
it the other way around: have the normal pg_log_info() check dry_run,
and have a special one for the messages that are to be identical in
either mode.  I'm not sure how difficult this is to implement, though.

pg_subscriber is not the only program with a dry-run mode; it looks like
pg_archiveclean, pg_combinebackup, pg_resetwal, pg_rewind have one.  Is
it worth maybe doing something at the common/logging.c level rather than
specifically pg_createsubscriber?

-- 
Álvaro Herrera        Breisgau, Deutschland  —  https://www.EnterpriseDB.com/
[…] indem ich in meinem Leben oft an euch gedacht, euch glücklich zu machen. Seyd es!
A menudo he pensado en vosotros, en haceros felices. ¡Sedlo, pues!
        Heiligenstädter Testament, L. v. Beethoven, 1802
        https://de.wikisource.org/wiki/Heiligenstädter_Testament



Re: pg_createsubscriber --dry-run logging concerns

От
Peter Smith
Дата:
On Wed, Oct 1, 2025 at 8:37 PM Álvaro Herrera <alvherre@kurilemu.de> wrote:
>
> On 2025-Oct-01, Peter Smith wrote:
>
> > (code below may not work; it's just for illustrative purposes)
> >
> > #define pg_log_info_checkdry(...) do {\
> >   if (dry_run)\
> >     pg_log_generic(PG_LOG_INFO, PG_LOG_PRIMARY, "[dry-run NOP]" __VA_ARGS__);\
> >   else;\
> >     pg_log_generic(PG_LOG_INFO, PG_LOG_PRIMARY, __VA_ARGS__);\
> >   } while (0);
>
> I like this kind of idea best.  However I think it might be better to do
> it the other way around: have the normal pg_log_info() check dry_run,
> and have a special one for the messages that are to be identical in
> either mode.  I'm not sure how difficult this is to implement, though.
>
> pg_subscriber is not the only program with a dry-run mode; it looks like
> pg_archiveclean, pg_combinebackup, pg_resetwal, pg_rewind have one.  Is
> it worth maybe doing something at the common/logging.c level rather than
> specifically pg_createsubscriber?
>

Hi Alvaro.

Thanks for the feedback and for pointing out the other tools that also
have a dr-run mode.

I did some quick back-of-the-envelope calculations to see what might
be involved.

======

For pg_create_subscriber, there are 38 info logs. Of those 38, I'd be
tempted to change only logs that say they are modifying/executing
something. There's ~15 of those:

pg_log_info("modifying system identifier of subscriber");
pg_log_info("running pg_resetwal on the subscriber");
pg_log_info("subscriber successfully changed the system identifier");
pg_log_info("use existing publication \"%s\" in database \"%s\"",
pg_log_info("create publication \"%s\" in database \"%s\"",
pg_log_info("create replication slot \"%s\" on publisher",
pg_log_info("dropping subscription \"%s\" in database \"%s\"",
pg_log_info("creating the replication slot \"%s\" in database \"%s\"",
pg_log_info("dropping the replication slot \"%s\" in database \"%s\"",
pg_log_info("creating publication \"%s\" in database \"%s\"",
pg_log_info("dropping publication \"%s\" in database \"%s\"",
pg_log_info("dropping all existing publications in database \"%s\"",
pg_log_info("creating subscription \"%s\" in database \"%s\"",
pg_log_info("setting the replication progress (node name \"%s\", LSN
%s) in database \"%s\"",
pg_log_info("enabling subscription \"%s\" in database \"%s\"",

======

For pg_archiveclean:

There only seems to be one logging for this dry run. Currently, it is
using debug logging like:

if (dry_run)
  pg_log_debug("would do XXX")
else
  pg_log_debug("doing XXX");

======

For pg_combinebackup:

This is the same as above; the differences for dry run logging are
already handled by having separate messages, and it uses debug logging
in many places with this pattern:

if (dry_run)
pg_log_debug("would do XXX")
else
    pg_log_debug("doing XXX");
======

For pg_resetwal:

This one is using a 'noupdate' flag instead of a 'dry_run' boolean.
And there doesn't seem to be any logging for this.

======

For pg_rewind:

Has 13 logs.  I'd be tempted to change maybe only a couple of these:

pg_rewind/pg_rewind.c: pg_log_info("creating backup label and updating
control file");
pg_rewind/pg_rewind.c: pg_log_info("executing \"%s\" for target server
to complete crash recovery",

======

I also found dry-run logs using the "would do XXX" pattern elsewhere,
in code like contrib/vacuumIo.c

//////

Summary

The idea to change the pg_log_info macro globally seems risky. There
are 400+ usages of this in the PG code, way beyond the scope of these
few tools that have a dry-run.

It seems that all the other --dry-run capable tools are using the pattern
if (dry_run)
pg_log_debug("would do XXX")
else
    pg_log_debug("doing XXX");

So, that makes pg_createsubscriber the odd man out. Instead of
introducing a new logging macro, perhaps it's better (for code
consistency) just to change pg_createsubscriber to use that same
logging pattern.

~~~

Kurdoa-san [1] was concerned it might be a big change burden to change
the pg_createsubscriber logs, but AFAICT there are only ~15/38 places
that I'd be tempted to change in pg_createsubscriber.c; that's not
really such a burden.

OTOH, I do think Kuroda-san's idea [1] of giving some up-front logging
to indicate --dry-run might be useful. Even when run-time gives
different log messages, I think those other tools only show
differences when using DEBUG logging. Anybody not debugging might
otherwise have no idea that nothing actually happened.

~~

So, below is now my favoured solution:

1. Add an up-front info log to say when running in dry-run (add for
all tools that have --dry-run mode)

2. Modify ~15 places in pg_createsubscriber.c to use the code pattern
consistent with all the other tools.
if (dry_run)
  pg_log_info("would do XXX")
else
  pg_log_info("doing XXX");

Thoughts?

======
[1]
https://www.postgresql.org/message-id/OSCPR01MB149668DD062C1457FFAA44A28F5E6A%40OSCPR01MB14966.jpnprd01.prod.outlook.com

Kind Regards,
Peter Smith.
Fujitsu Australia