Обсуждение: Re: Inconsistent LSN format in pg_waldump output

Поиск
Список
Период
Сортировка

Re: Inconsistent LSN format in pg_waldump output

От
Álvaro Herrera
Дата:
On 2025-Jul-01, Japin Li wrote:

> This inconsistency, while minor, could be confusing when cross-referencing
> LSNs within pg_waldump's own output or when parsing it programmatically.

I agree that we should fix this, but I'd rather add the missing zeros
than remove these ones (the only ones we have):

>      XLogRecGetLen(record, &rec_len, &fpi_len);
>  
> -    printf("rmgr: %-11s len (rec/tot): %6u/%6u, tx: %10u, lsn: %X/%08X, prev %X/%08X, ",
> +    printf("rmgr: %-11s len (rec/tot): %6u/%6u, tx: %10u, lsn: %X/%X, prev %X/%X, ",
>             desc->rm_name,
>             rec_len, XLogRecGetTotalLen(record),
>             XLogRecGetXid(record),

I think pg_waldump did things right in this regard, and all other places
were cargo-culting the older broken practice.

IOW I think we should change all occurrences of %X/%X to %X/%08X
instead.  There's a ton of them though.  See also
https://www.postgresql.org/message-id/flat/CAExHW5ub5NaTELZ3hJUCE6amuvqAtsSxc7O%2BuK7y4t9Rrk23cw%40mail.gmail.com
where LSN_FORMAT_ARGS was invented, but where the width of the second %X
was not discussed.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



Re: Inconsistent LSN format in pg_waldump output

От
Japin Li
Дата:
On Tue, 01 Jul 2025 at 13:39, Álvaro Herrera <alvherre@kurilemu.de> wrote:
> On 2025-Jul-01, Japin Li wrote:
>
>> This inconsistency, while minor, could be confusing when cross-referencing
>> LSNs within pg_waldump's own output or when parsing it programmatically.
>
> I agree that we should fix this, but I'd rather add the missing zeros
> than remove these ones (the only ones we have):
>
>>      XLogRecGetLen(record, &rec_len, &fpi_len);
>>
>> -    printf("rmgr: %-11s len (rec/tot): %6u/%6u, tx: %10u, lsn: %X/%08X, prev %X/%08X, ",
>> +    printf("rmgr: %-11s len (rec/tot): %6u/%6u, tx: %10u, lsn: %X/%X, prev %X/%X, ",
>>             desc->rm_name,
>>             rec_len, XLogRecGetTotalLen(record),
>>             XLogRecGetXid(record),
>
> I think pg_waldump did things right in this regard, and all other places
> were cargo-culting the older broken practice.
>

I initially considered using the %X/%08X format, but observing %X/%X
consistently elsewhere led me to abandon that idea.

> IOW I think we should change all occurrences of %X/%X to %X/%08X
> instead.  There's a ton of them though.  See also
> https://www.postgresql.org/message-id/flat/CAExHW5ub5NaTELZ3hJUCE6amuvqAtsSxc7O%2BuK7y4t9Rrk23cw%40mail.gmail.com
> where LSN_FORMAT_ARGS was invented, but where the width of the second %X
> was not discussed.

Agreed.  I believe %X/%08X is better.
--
Regards,
Japin Li



Re: Inconsistent LSN format in pg_waldump output

От
Masahiko Sawada
Дата:
On Wed, Jul 2, 2025 at 10:56 PM Japin Li <japinli@hotmail.com> wrote:
>
> On Tue, 01 Jul 2025 at 22:00, Japin Li <japinli@hotmail.com> wrote:
> > On Tue, 01 Jul 2025 at 13:39, Álvaro Herrera <alvherre@kurilemu.de> wrote:
> >> On 2025-Jul-01, Japin Li wrote:
> >>
> >>> This inconsistency, while minor, could be confusing when cross-referencing
> >>> LSNs within pg_waldump's own output or when parsing it programmatically.
> >>
> >> I agree that we should fix this, but I'd rather add the missing zeros
> >> than remove these ones (the only ones we have):
> >>
> >>>     XLogRecGetLen(record, &rec_len, &fpi_len);
> >>>
> >>> -   printf("rmgr: %-11s len (rec/tot): %6u/%6u, tx: %10u, lsn: %X/%08X, prev %X/%08X, ",
> >>> +   printf("rmgr: %-11s len (rec/tot): %6u/%6u, tx: %10u, lsn: %X/%X, prev %X/%X, ",
> >>>                desc->rm_name,
> >>>                rec_len, XLogRecGetTotalLen(record),
> >>>                XLogRecGetXid(record),
> >>
> >> I think pg_waldump did things right in this regard, and all other places
> >> were cargo-culting the older broken practice.
> >>
> >
> > I initially considered using the %X/%08X format, but observing %X/%X
> > consistently elsewhere led me to abandon that idea.
> >
> >> IOW I think we should change all occurrences of %X/%X to %X/%08X
> >> instead.  There's a ton of them though.  See also
> >> https://www.postgresql.org/message-id/flat/CAExHW5ub5NaTELZ3hJUCE6amuvqAtsSxc7O%2BuK7y4t9Rrk23cw%40mail.gmail.com
> >> where LSN_FORMAT_ARGS was invented, but where the width of the second %X
> >> was not discussed.

Interesting. While this is a better format, could it break
compatibility with existing tools that for example compares LSN
strings?

> >
> > Agreed.  I believe %X/%08X is better.
>
> Patch to standardize LSN formatting with zero-padding.

Thank you for updating the patch. I think this patch doesn't need to
update .po files as we do that at once when doing the translation
update.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Inconsistent LSN format in pg_waldump output

От
Álvaro Herrera
Дата:
On 2025-Jul-03, Masahiko Sawada wrote:

> On Wed, Jul 2, 2025 at 10:56 PM Japin Li <japinli@hotmail.com> wrote:

> Interesting. While this is a better format, could it break
> compatibility with existing tools that for example compares LSN
> strings?

I think a tool would have to be severely miswritten in order to become
broken from this change.  Our own code to scan LSNs is to use
scanf('%X/%X') which should work just fine with and without the leading
zeroes.  I honestly don't see anybody coding this in any different way
that could not cope with the leading zeroes :-)

> > > Agreed.  I believe %X/%08X is better.
> >
> > Patch to standardize LSN formatting with zero-padding.
> 
> Thank you for updating the patch. I think this patch doesn't need to
> update .po files as we do that at once when doing the translation
> update.

Agreed.  In fact these updates are probably harmful (they certainly
bloat the patch quite a bit).  These files come from a different
repository, which you're welcome to provide a patch for, after the code
change lands in Postgres.
https://git.postgresql.org/cgit/pgtranslation/messages.git/

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Porque Kim no hacía nada, pero, eso sí,
con extraordinario éxito" ("Kim", Kipling)



Re: Inconsistent LSN format in pg_waldump output

От
Michael Paquier
Дата:
On Wed, Jul 02, 2025 at 08:57:45PM +0200, Alvaro Herrera wrote:
> I think a tool would have to be severely miswritten in order to become
> broken from this change.  Our own code to scan LSNs is to use
> scanf('%X/%X') which should work just fine with and without the leading
> zeroes.  I honestly don't see anybody coding this in any different way
> that could not cope with the leading zeroes :-)

Yep.  If you do not want this new policy to be forgotten by new paths,
I'd suggested to standarize that with something like that, close to
the existing LSN_FORMAT_ARGS():
#define LSN_FORMAT "%X/%08X"

I was pretty sure that this point was mentioned when LSN_FORMAT_ARGS
got discussed, but my impression is wrong as Alvaro already said
upthread.  I'd suggest to take this extra step now instead of
hardcoding %08X everywhere.  We may perhaps go down to backpatch
LSN_FORMAT_ARGS() and this LSN_FORMAT in the back-branches, leading to
less friction when backpatching fixes, but we don't deal with many
stable fixes that would require these.
--
Michael

Вложения

Re: Inconsistent LSN format in pg_waldump output

От
Masahiko Sawada
Дата:
On Thu, Jul 3, 2025 at 7:32 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Wed, Jul 02, 2025 at 08:57:45PM +0200, Alvaro Herrera wrote:
> > I think a tool would have to be severely miswritten in order to become
> > broken from this change.  Our own code to scan LSNs is to use
> > scanf('%X/%X') which should work just fine with and without the leading
> > zeroes.  I honestly don't see anybody coding this in any different way
> > that could not cope with the leading zeroes :-)

I had concerns regarding scenarios where users or tools
programmatically search for LSNs or perform string comparisons without
proper parsing, as initially raised by Japin Li. For instance, if a
newer version of the server records an LSN as '0/0AB10228' in the
server log, users attempting to search for this LSN won't find matches
in logs generated by older server versions. While such differences
might be acceptable across major version upgrades, I'm uncertain
whether this inconsistency is appropriate between minor versions.

> Yep.  If you do not want this new policy to be forgotten by new paths,
> I'd suggested to standarize that with something like that, close to
> the existing LSN_FORMAT_ARGS():
> #define LSN_FORMAT "%X/%08X"

+1

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com



Re: Inconsistent LSN format in pg_waldump output

От
Álvaro Herrera
Дата:
On 2025-Jul-03, Michael Paquier wrote:

> Yep.  If you do not want this new policy to be forgotten by new paths,
> I'd suggested to standarize that with something like that, close to
> the existing LSN_FORMAT_ARGS():
> #define LSN_FORMAT "%X/%08X"

Well, the reason we didn't use a macro in the format string is that
translatability suffers quite a bit, and it's quite annoying.  You can
still use it in strings that aren't going to be translated -- for
instance in all the xlog *_desc routines, in elog(), errmsg_internal(),
errdetail_internal().  But for translatable messages such as errmsg(),
errdetail, it's going to break.  For example, one particular message in
twophase.c was originally

msgid "could not read two-phase state from WAL at %X/%X"

after this patch, it becomes

msgid "could not read two-phase state from WAL at "

which is obviously broken.  (You can test this by running "make
update-po" and looking at the src/backend/po/*.po.new files.  No idea
hwo to do this under Meson, it doesn't seem documented.)

Eyeballing the patch I think a majority of the messages are not
translatable, so I'm still okay with adding and using the macro.  But we
need a revision to go back to literal %X/%08X in errmsg(), errdetail(),
report_invalid_record().  I'd also add a comment next to the macro
indicating that the macro MUST NOT be used for translatable strings, as
it otherwise results in a bug that's only visible if you're running a
version in a language other than English.  I bet we're still going to
get hackers use it badly, but not often.

The GNU gettext manual suggests you can print the value to a string
variable and then use %s to include that in the translatable string, but
I doubt that's an improvement over just using %X/%08X directly.
Bottom of this page here:
https://www.gnu.org/software/gettext/manual/html_node/No-string-concatenation.html

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/



Re: Inconsistent LSN format in pg_waldump output

От
Japin Li
Дата:
On Thu, 03 Jul 2025 at 10:19, Álvaro Herrera <alvherre@kurilemu.de> wrote:
> On 2025-Jul-03, Michael Paquier wrote:
>
>> Yep.  If you do not want this new policy to be forgotten by new paths,
>> I'd suggested to standarize that with something like that, close to
>> the existing LSN_FORMAT_ARGS():
>> #define LSN_FORMAT "%X/%08X"
>
> Well, the reason we didn't use a macro in the format string is that
> translatability suffers quite a bit, and it's quite annoying.  You can
> still use it in strings that aren't going to be translated -- for
> instance in all the xlog *_desc routines, in elog(), errmsg_internal(),
> errdetail_internal().  But for translatable messages such as errmsg(),
> errdetail, it's going to break.  For example, one particular message in
> twophase.c was originally
>
> msgid "could not read two-phase state from WAL at %X/%X"
>
> after this patch, it becomes
>
> msgid "could not read two-phase state from WAL at "
>
> which is obviously broken.  (You can test this by running "make
> update-po" and looking at the src/backend/po/*.po.new files.  No idea
> hwo to do this under Meson, it doesn't seem documented.)
>

You're right; I overlooked it.

> Eyeballing the patch I think a majority of the messages are not
> translatable, so I'm still okay with adding and using the macro.  But we
> need a revision to go back to literal %X/%08X in errmsg(), errdetail(),
> report_invalid_record().  I'd also add a comment next to the macro
> indicating that the macro MUST NOT be used for translatable strings, as
> it otherwise results in a bug that's only visible if you're running a
> version in a language other than English.  I bet we're still going to
> get hackers use it badly, but not often.
>

Providing two LSN formats — %X%08X for translatable messages and
LSN_FORMAT for non-translatable ones — seems to offer no clear advantage.

I'd prefer to use %X/%08X directly and add the description to the
LSN_FORMAT_ARGS macro.

> The GNU gettext manual suggests you can print the value to a string
> variable and then use %s to include that in the translatable string, but
> I doubt that's an improvement over just using %X/%08X directly.
> Bottom of this page here:
> https://www.gnu.org/software/gettext/manual/html_node/No-string-concatenation.html
>

Yes, I don't consider that an improvement.

--
Regards,
Japin Li



Re: Inconsistent LSN format in pg_waldump output

От
Álvaro Herrera
Дата:
On 2025-Jul-03, Japin Li wrote:

> Providing two LSN formats — %X%08X for translatable messages and
> LSN_FORMAT for non-translatable ones — seems to offer no clear advantage.
> 
> I'd prefer to use %X/%08X directly and add the description to the
> LSN_FORMAT_ARGS macro.

WFM.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"La gente vulgar sólo piensa en pasar el tiempo;
el que tiene talento, en aprovecharlo"