Обсуждение: A new message seems missing a punctuation

Поиск
Список
Период
Сортировка

A new message seems missing a punctuation

От
Kyotaro Horiguchi
Дата:
A recent commit (7a424ece48) added the following message:

> could not sync slot information as remote slot precedes local slot:
> remote slot "%s": LSN (%X/%X), catalog xmin (%u) local slot: LSN
> (%X/%X), catalog xmin (%u)

Since it is a bit overloaded but doesn't have a separator between
"catalog xmin (%u)" and "local slot:", it is somewhat cluttered. Don't
we need something, for example a semicolon there to improve
readability and reduce clutter?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



Re: A new message seems missing a punctuation

От
Robert Haas
Дата:
On Mon, Feb 19, 2024 at 10:10 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> A recent commit (7a424ece48) added the following message:
>
> > could not sync slot information as remote slot precedes local slot:
> > remote slot "%s": LSN (%X/%X), catalog xmin (%u) local slot: LSN
> > (%X/%X), catalog xmin (%u)
>
> Since it is a bit overloaded but doesn't have a separator between
> "catalog xmin (%u)" and "local slot:", it is somewhat cluttered. Don't
> we need something, for example a semicolon there to improve
> readability and reduce clutter?

I think maybe we should even revise the message even more. In most
places we do not just print out a whole bunch of values, but use a
sentence to connect them.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: A new message seems missing a punctuation

От
shveta malik
Дата:
On Mon, Feb 19, 2024 at 10:31 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Feb 19, 2024 at 10:10 AM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> > A recent commit (7a424ece48) added the following message:
> >
> > > could not sync slot information as remote slot precedes local slot:
> > > remote slot "%s": LSN (%X/%X), catalog xmin (%u) local slot: LSN
> > > (%X/%X), catalog xmin (%u)
> >
> > Since it is a bit overloaded but doesn't have a separator between
> > "catalog xmin (%u)" and "local slot:", it is somewhat cluttered. Don't
> > we need something, for example a semicolon there to improve
> > readability and reduce clutter?
>
> I think maybe we should even revise the message even more. In most
> places we do not just print out a whole bunch of values, but use a
> sentence to connect them.

I have tried to reword the msg, please have a look.

thanks
Shveta

Вложения

Re: A new message seems missing a punctuation

От
Kyotaro Horiguchi
Дата:
At Mon, 19 Feb 2024 10:31:33 +0530, Robert Haas <robertmhaas@gmail.com> wrote in 
> On Mon, Feb 19, 2024 at 10:10 AM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> > A recent commit (7a424ece48) added the following message:
> >
> > > could not sync slot information as remote slot precedes local slot:
> > > remote slot "%s": LSN (%X/%X), catalog xmin (%u) local slot: LSN
> > > (%X/%X), catalog xmin (%u)
> >
> > Since it is a bit overloaded but doesn't have a separator between
> > "catalog xmin (%u)" and "local slot:", it is somewhat cluttered. Don't
> > we need something, for example a semicolon there to improve
> > readability and reduce clutter?
> 
> I think maybe we should even revise the message even more. In most
> places we do not just print out a whole bunch of values, but use a
> sentence to connect them.

Mmm. Something like this?:

"could not sync slot information: local slot LSN (%X/%X) or xmin(%u)
 behind remote (%X/%X, %u)"

Or I thought the values could be moved to DETAILS: line.

By the way, the code around the message is as follows.

> /*
>  * The remote slot didn't catch up to locally reserved position.
>  *
>  * We do not drop the slot because the restart_lsn can be ahead of the
>  * current location when recreating the slot in the next cycle. It may
>  * take more time to create such a slot. Therefore, we keep this slot
>  * and attempt the synchronization in the next cycle.
>  *
>  * XXX should this be changed to elog(DEBUG1) perhaps?
>  */
> ereport(LOG,
>         errmsg("could not sync slot information as remote slot precedes local slot:"
>                       " remote slot \"%s\": LSN (%X/%X), catalog xmin (%u) local slot: LSN (%X/%X), catalog xmin
(%u)",

If we change this message to DEBUG1, a timeout feature to show a
WARNING message might be needed for DBAs to notice that something bad
is ongoing. However, it doesn't seem appropriate as a LOG message to
me. Perhaps, a LOG message should be more like:

> "LOG:  waiting for local slot to catch up to remote"
> "DETAIL:  remote slot LSN (%X/%X)....; "

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

Re: A new message seems missing a punctuation

От
Robert Haas
Дата:
On Mon, Feb 19, 2024 at 11:04 AM Kyotaro Horiguchi
<horikyota.ntt@gmail.com> wrote:
> Or I thought the values could be moved to DETAILS: line.

Yeah, I think that's likely to be the right approach here. The details
aren't too clear to me.

Does the primary error message really need to say "could not sync
slot"? If it will be obvious from context that we were trying to sync
a slot, then it would be fine to just say "ERROR: remote slot precedes
local slot".

But I also don't quite understand what problem this is trying to
report. Is this slot-syncing code running on the primary or the
standby? If it's running on the primary, then surely it's expected
that the remote slot will precede the local one. And if it's running
on the standby, then the comments in
update_and_persist_local_synced_slot about waiting for the remote side
to catch up seem quite confusing, because surely we're chasing the
primary and not the other way around?

But if we ignore all of that, then we could just do this:

ERROR: could not sync slot information as remote slot precedes local slot
DETAIL: Remote slot has LSN %X/%X and catalog xmin %u, but remote slot
has LSN %X/%X and catalog xmin %u.

which would fix the original complaint, and my point about using
English rather than just printing a bunch of values.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: A new message seems missing a punctuation

От
Amit Kapila
Дата:
On Mon, Feb 19, 2024 at 11:42 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Feb 19, 2024 at 11:04 AM Kyotaro Horiguchi
> <horikyota.ntt@gmail.com> wrote:
> > Or I thought the values could be moved to DETAILS: line.
>
> Yeah, I think that's likely to be the right approach here. The details
> aren't too clear to me.
>
> Does the primary error message really need to say "could not sync
> slot"? If it will be obvious from context that we were trying to sync
> a slot, then it would be fine to just say "ERROR: remote slot precedes
> local slot".
>

As this is a LOG message, I feel one may need some more information on
the context but it is not mandatory.

> But I also don't quite understand what problem this is trying to
> report. Is this slot-syncing code running on the primary or the
> standby? If it's running on the primary, then surely it's expected
> that the remote slot will precede the local one. And if it's running
> on the standby, then the comments in
> update_and_persist_local_synced_slot about waiting for the remote side
> to catch up seem quite confusing, because surely we're chasing the
> primary and not the other way around?
>

The local's restart_lsn could be ahead of than primary's for the very
first sync when the WAL corresponding to the remote's restart_lsn is
not available on standby (say due to a different wal related settings
the required WAL has been removed when we first time tried to sync the
slot). For more details, you can refer to comments atop slotsync.c
starting from "If the WAL corresponding to the remote's restart_lsn
..."

> But if we ignore all of that, then we could just do this:
>
> ERROR: could not sync slot information as remote slot precedes local slot
> DETAIL: Remote slot has LSN %X/%X and catalog xmin %u, but remote slot
> has LSN %X/%X and catalog xmin %u.
>

This looks good to me but instead of ERROR here we want to use LOG.

--
With Regards,
Amit Kapila.



Re: A new message seems missing a punctuation

От
Amit Kapila
Дата:
On Mon, Feb 19, 2024 at 12:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Mon, Feb 19, 2024 at 11:42 AM Robert Haas <robertmhaas@gmail.com> wrote:
> >
> > On Mon, Feb 19, 2024 at 11:04 AM Kyotaro Horiguchi
> > <horikyota.ntt@gmail.com> wrote:
> > > Or I thought the values could be moved to DETAILS: line.
> >
> > Yeah, I think that's likely to be the right approach here. The details
> > aren't too clear to me.
> >
> > Does the primary error message really need to say "could not sync
> > slot"? If it will be obvious from context that we were trying to sync
> > a slot, then it would be fine to just say "ERROR: remote slot precedes
> > local slot".
> >
>
> As this is a LOG message, I feel one may need some more information on
> the context but it is not mandatory.
>
> > But I also don't quite understand what problem this is trying to
> > report. Is this slot-syncing code running on the primary or the
> > standby? If it's running on the primary, then surely it's expected
> > that the remote slot will precede the local one. And if it's running
> > on the standby, then the comments in
> > update_and_persist_local_synced_slot about waiting for the remote side
> > to catch up seem quite confusing, because surely we're chasing the
> > primary and not the other way around?
> >
>
> The local's restart_lsn could be ahead of than primary's for the very
> first sync when the WAL corresponding to the remote's restart_lsn is
> not available on standby (say due to a different wal related settings
> the required WAL has been removed when we first time tried to sync the
> slot). For more details, you can refer to comments atop slotsync.c
> starting from "If the WAL corresponding to the remote's restart_lsn
> ..."
>

Sorry, I gave the wrong reference, the comments I was referring to
start with: "If on physical standby, the WAL corresponding ...".

--
With Regards,
Amit Kapila.



Re: A new message seems missing a punctuation

От
Robert Haas
Дата:
On Mon, Feb 19, 2024 at 12:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Does the primary error message really need to say "could not sync
> > slot"? If it will be obvious from context that we were trying to sync
> > a slot, then it would be fine to just say "ERROR: remote slot precedes
> > local slot".
>
> As this is a LOG message, I feel one may need some more information on
> the context but it is not mandatory.

I'll defer to you.

> > But I also don't quite understand what problem this is trying to
> > report. Is this slot-syncing code running on the primary or the
> > standby? If it's running on the primary, then surely it's expected
> > that the remote slot will precede the local one. And if it's running
> > on the standby, then the comments in
> > update_and_persist_local_synced_slot about waiting for the remote side
> > to catch up seem quite confusing, because surely we're chasing the
> > primary and not the other way around?
>
> The local's restart_lsn could be ahead of than primary's for the very
> first sync when the WAL corresponding to the remote's restart_lsn is
> not available on standby (say due to a different wal related settings
> the required WAL has been removed when we first time tried to sync the
> slot). For more details, you can refer to comments atop slotsync.c
> starting from "If the WAL corresponding to the remote's restart_lsn
> ..."

So why do we log a message about this?

> > But if we ignore all of that, then we could just do this:
> >
> > ERROR: could not sync slot information as remote slot precedes local slot
> > DETAIL: Remote slot has LSN %X/%X and catalog xmin %u, but remote slot
> > has LSN %X/%X and catalog xmin %u.
> >
>
> This looks good to me but instead of ERROR here we want to use LOG.

Fair enough!

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: A new message seems missing a punctuation

От
Amit Kapila
Дата:
On Tue, Feb 20, 2024 at 4:21 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Mon, Feb 19, 2024 at 12:14 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> > > But I also don't quite understand what problem this is trying to
> > > report. Is this slot-syncing code running on the primary or the
> > > standby? If it's running on the primary, then surely it's expected
> > > that the remote slot will precede the local one. And if it's running
> > > on the standby, then the comments in
> > > update_and_persist_local_synced_slot about waiting for the remote side
> > > to catch up seem quite confusing, because surely we're chasing the
> > > primary and not the other way around?
> >
> > The local's restart_lsn could be ahead of than primary's for the very
> > first sync when the WAL corresponding to the remote's restart_lsn is
> > not available on standby (say due to a different wal related settings
> > the required WAL has been removed when we first time tried to sync the
> > slot). For more details, you can refer to comments atop slotsync.c
> > starting from "If the WAL corresponding to the remote's restart_lsn
> > ..."
>
> So why do we log a message about this?
>

This was added after the main commit of this functionality to find
some BF failures (where we were expecting the slot to sync but due to
one of these conditions not being met the slot was not synced) and we
can probably change it to DEBUG1 as well. I think we would need this
information w.r.t this functionality to gather more information in
case expected slots are not being synced and it may be helpful for
users to also know why the slots are not synced, if that happens.

--
With Regards,
Amit Kapila.



Re: A new message seems missing a punctuation

От
Robert Haas
Дата:
On Tue, Feb 20, 2024 at 4:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > So why do we log a message about this?
>
> This was added after the main commit of this functionality to find
> some BF failures (where we were expecting the slot to sync but due to
> one of these conditions not being met the slot was not synced) and we
> can probably change it to DEBUG1 as well. I think we would need this
> information w.r.t this functionality to gather more information in
> case expected slots are not being synced and it may be helpful for
> users to also know why the slots are not synced, if that happens.

Ah, OK. Do you think we need any kind of system view to provide more
insight here or is a log message sufficient?

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: A new message seems missing a punctuation

От
Amit Kapila
Дата:
On Tue, Feb 20, 2024 at 4:50 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Tue, Feb 20, 2024 at 4:42 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > So why do we log a message about this?
> >
> > This was added after the main commit of this functionality to find
> > some BF failures (where we were expecting the slot to sync but due to
> > one of these conditions not being met the slot was not synced) and we
> > can probably change it to DEBUG1 as well. I think we would need this
> > information w.r.t this functionality to gather more information in
> > case expected slots are not being synced and it may be helpful for
> > users to also know why the slots are not synced, if that happens.
>
> Ah, OK. Do you think we need any kind of system view to provide more
> insight here or is a log message sufficient?
>

We do expose the required information (restart_lsn, catalog_xmin,
synced, temporary, etc) via pg_replication_slots. So, I feel the LOG
message here is sufficient to DEBUG (or know the details) when the
slot sync doesn't succeed.

--
With Regards,
Amit Kapila.



Re: A new message seems missing a punctuation

От
shveta malik
Дата:
On Tue, Feb 20, 2024 at 5:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
>
> We do expose the required information (restart_lsn, catalog_xmin,
> synced, temporary, etc) via pg_replication_slots. So, I feel the LOG
> message here is sufficient to DEBUG (or know the details) when the
> slot sync doesn't succeed.
>

Please find the patch having the suggested changes.

thanks
Shveta

Вложения

Re: A new message seems missing a punctuation

От
Amit Kapila
Дата:
On Tue, Feb 20, 2024 at 7:25 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Tue, Feb 20, 2024 at 5:01 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> >
> >
> > We do expose the required information (restart_lsn, catalog_xmin,
> > synced, temporary, etc) via pg_replication_slots. So, I feel the LOG
> > message here is sufficient to DEBUG (or know the details) when the
> > slot sync doesn't succeed.
> >
>
> Please find the patch having the suggested changes.
>

LGTM. I'll push this tomorrow unless someone has further comments/suggestions.

--
With Regards,
Amit Kapila.