Обсуждение: Re: 64 bit numbers vs format strings
On 05.12.24 23:18, Thomas Munro wrote:
> Having learned some things about gettext based on clues[1] from Peter
> E, I decided to see what it would take to expunge all (long long) and
> similar casts now that we're using the standard types with system
> support.
>
> The short version is tha given uint64 x:
>
> Old: errmsg("hello %llu", (unsigned long long) x)
> New: errmsg("hello %" PRIu64, x)
I have committed the subset of this patch for pg_checksums.c so that the
translators and whoever else might be affected can try this out at small
scale. (I don't expect any particular problems.) Then we can move on
to the rest in a few weeks, I think.
On Mon, Mar 3, 2025 at 6:21 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> On 05.12.24 23:18, Thomas Munro wrote:
> > Old: errmsg("hello %llu", (unsigned long long) x)
> > New: errmsg("hello %" PRIu64, x)
>
> I have committed the subset of this patch for pg_checksums.c so that the
> translators and whoever else might be affected can try this out at small
> scale. (I don't expect any particular problems.) Then we can move on
> to the rest in a few weeks, I think.
Good plan, thanks. Here's a rebase.
I also added one more patch that I expect to be more contentious as it
is a UX change. Why do we display LSNs with a slash? I believe there
are two reasons: (1) Back in 2000 didn't require the existence of a 64
bit type, so XLogRecPtr was a struct holding two uint32 values. The
author could still have used "%08X%08X" for both printf and scanf if
that was the only reason. (2) It initially had a real semantic
division into two parts log ID and log offset, which the author
apparently wanted to convey to readers. I didn't check the full
history but I think at some point our log segments (first 16MB, now
initdb-time variable) superseded the log ID concept, which I think
originally had 4GB segments? (I could also have had something to do
with the abandoned undo system's needs, IDK.) That leads to the idea
of ditching the slash and displaying them in the more obvious (to my
aesthetic, anyway, YMMV):
SELECT pg_lsn(23783416::numeric);
- pg_lsn
------------
- 0/16AE7F8
+ pg_lsn
+------------------
+ 00000000016AE7F8
And likewise wherever they appear or are parsed in tools, protocols,
command lines etc.
/me activates flame-proof force field
I realised while contemplating that that my treatment of pgoff_t might
not be quite right in the first patch. It casts off_t (eg from struct
stat) to (pgoff_t) and display as "%" PRId64, which is correct for
Windows where pgoff_t is a typedef to __int64 (actually int64_t would
be more logical, but presumably int64_t is __int64 on that system, not
sure whether that is truly a distinct type according to its native
compiler), but on non-Windows we use the system off_t whose printf
type is unknown to us, and might in theory be a different signed 64
bit type and provoke a warning from GCC/Clang printf attributes.
Perhaps we should define just pgoff_t as int64_t everywhere? There
are no warnings on any of our CI OSes so I assume that those OSes
coincidentally define off_t the same way they define int64_t. That
being the case we could just ignore it for now, but another system
using GCC/Clang printf attributes (eg illumos or the other BSDs) might
not happen to agree. Not done yet.
And one more thing like that: in a couple of places we see warnings on
macOS CI that I'd missed: when printing the result of i64abs() as
PRId64, because it happens to use labs() and it happens to define
int64_t as long long, and when printing a Datum as PRIx64, because
Datum is uintptr_t and it happens to define that as unsigned long. I
suppose we should cast to int64 in the definition of c.h's i64abs()
macro and a couple of similar things, and cast Datum to uint64 in that
one place that wants to print it out. Not done yet, so you can still
see this on macOS CI's build step.
Вложения
Thomas Munro <thomas.munro@gmail.com> writes:
> I also added one more patch that I expect to be more contentious as it
> is a UX change. Why do we display LSNs with a slash?
While there's surely little reason to do that anymore, I think the
blast radius of such a change will be vastly greater than is warranted
by aesthetics. It's not only our code that will be affected --- I'm
pretty sure there is a great deal of replication tooling out there
that this will break. Don't expect me to defend you from the
villagers with pitchforks.
regards, tom lane
On 02.03.25 22:08, Thomas Munro wrote: > And one more thing like that: in a couple of places we see warnings on > macOS CI that I'd missed: when printing the result of i64abs() as > PRId64, because it happens to use labs() and it happens to define > int64_t as long long, and when printing a Datum as PRIx64, because > Datum is uintptr_t and it happens to define that as unsigned long. I > suppose we should cast to int64 in the definition of c.h's i64abs() > macro and a couple of similar things, agreed > and cast Datum to uint64 in that > one place that wants to print it out. Since Datum is uintptr_t, it should be printed using the format PRIxPTR. Then it should work out.
On 02.03.25 22:08, Thomas Munro wrote:
> On Mon, Mar 3, 2025 at 6:21 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>> On 05.12.24 23:18, Thomas Munro wrote:
>>> Old: errmsg("hello %llu", (unsigned long long) x)
>>> New: errmsg("hello %" PRIu64, x)
>>
>> I have committed the subset of this patch for pg_checksums.c so that the
>> translators and whoever else might be affected can try this out at small
>> scale. (I don't expect any particular problems.) Then we can move on
>> to the rest in a few weeks, I think.
>
> Good plan, thanks. Here's a rebase.
I think this went ok, and we can proceed here.
I looked through the v2-0001 patch in detail. Most of it is mechanical,
so no problems. I couple of issues you already mentioned:
- correct placeholder for Datum (uintptr_t)
- i64abs() definition needs return cast
- I don't think it's proper to assume that pgoff_t or off_t matches int64_t.
A few additional comments:
- In src/backend/access/transam/xlogreader.c, you change a cast that is
part of an arithmetic expression:
- ((long long) total_len) - gotlen,
+ total_len - gotlen,
Is this no longer required to keep the math correct? Both total_len and
gotlen are uint32. Maybe this was meant to convert to signed arithmetic?
- In src/backend/backup/basebackup.c, you change
-static long long int total_checksum_failures;
+static int64 total_checksum_failures;
I don't think it is required, and I don't think it should be encouraged,
to expunge all uses of long long int, or something like that. I think
you should use long long int for "I need a big counter" and int64 when
you want to control the storage layout. Similar to how you might choose
int vs. int32. So I would leave this file alone.
- In src/bin/pg_verifybackup/astreamer_verify.c, you change the
signedness of some arguments, e.g., in member_verify_header():
report_backup_error(mystreamer->context,
- "\"%s\" has size %llu in \"%s\" but size
%llu in the manifest",
+ "\"%s\" has size %" PRId64 " in \"%s\" but
size %" PRId64 " in the manifest",
The first signedness change is correct (member->size is pgoff_t), but
the second is not (m->size is uint64).
I think it might be better to keep this patch as a mechanical change and
fix up the signedness issues separately. (There are actually a few more
that were previously hidden by casts but will now show up with something
like -Wformat-signedness.)
- In src/fe_utils/print.c, there is also a format change in the second
hunk, but if we're going to do that one, we should also make the same
change in the first hunk. Also, in the first hunk, the second format
should be %zu not %zd.
- In src/test/modules/libpq_pipeline/libpq_pipeline.c, you're changing
the shift base from 1LL (signed) to UINT64_C(1) (unsigned). This
appears to be a semantic change separate from this patch? But if this
change is desired, then the signedness of the format argument should
also be adjusted.
About the subsequent pgbench patches:
v2-0002: ok
v2-0003: Again, I'm not sure insisting on int64 use is needed here, and
I don't know that the existing code is incorrect. If we don't like
using "long", we could just switch to "long long" here.
v2-0004: ok
About the LSN format patch, I'm generally sympathetic about this, and I
think I've sort of asked for a similar change some years ago, but it's
probably not worth pursuing for this release (if ever).
On Mon, Mar 10, 2025 at 10:49 PM Peter Eisentraut <peter@eisentraut.org> wrote: > On 02.03.25 22:08, Thomas Munro wrote: > > Good plan, thanks. Here's a rebase. > > I think this went ok, and we can proceed here. Cool, I'll post a new patch soon, but first a question about this bit: > - I don't think it's proper to assume that pgoff_t or off_t matches int64_t. So we should make pgoff_t a typedef for int64_t everywhere. It's a bit annoying that we have to teach everyone to remember to use PRId64 to print it, though. How reasonable would it be to add an extra filter into whatever script is used to run xgettext on our source tree? It could replace a very small number of agreed useful tokens to match some macros that we would also define in our tree, so that we could write PRI_PGOFF_T in our messages, but xgettext would see PRId64 and still emit those magic %<PRId64> tokens that GNU/NetBSD/Solaris gettext() know how to translate on the fly when loading message catalogues. I'm not sure how many other candidates there would be, not many (and LSN is obviously an attractive but thorny one...). (For those who missed the reason why: I believe xgettext only treats the <inttypes.h> macros with special portability gloves when used directly, so if you wrapped them in your own macros and did nothing else, you'd get the fully expanded macros as defined on the system that runs xgettext. if I understood correctly. Concretely, if that's a 64 bit glibc system where PRId64 is "ld", the resulting catalogues wouldn't work on a Windows or 32 bit system where sizeof(long) < sizeof(int64_t). You might be able to get away with it if you hijacked those macros as seen by xgettext and made them all "lld" everywhere since that's at least the right size on all known systems, but that'd be a bit gross and not in the spirit of this exercise...)
On 15.03.25 03:42, Thomas Munro wrote: >> - I don't think it's proper to assume that pgoff_t or off_t matches int64_t. > > So we should make pgoff_t a typedef for int64_t everywhere. It's a > bit annoying that we have to teach everyone to remember to use PRId64 > to print it, though. The ramifications of such a change are not clear to me. I thought pgoff_t is supposed to be off_t on Unix systems. If we change that, then how will this affect pointer type arguments, function pointers, etc. This seems to be a much larger problem than what this thread is originally about. I think we should leave pgoff_t the way it is (mostly?) done now: Cast to long long int and print using %lld. > How reasonable would it be to add an extra > filter into whatever script is used to run xgettext on our source > tree? It could replace a very small number of agreed useful tokens to > match some macros that we would also define in our tree, so that we > could write PRI_PGOFF_T in our messages, but xgettext would see PRId64 > and still emit those magic %<PRId64> tokens that GNU/NetBSD/Solaris > gettext() know how to translate on the fly when loading message > catalogues. This is not really possible. The <PRIxxx> behavior is baked deeply into the gettext code. (Also note that you don't only need support in xgettext, which is part of our build system, but also in the runtime library, which we don't control.)
Peter Eisentraut <peter@eisentraut.org> writes:
> This is not really possible. The <PRIxxx> behavior is baked deeply into
> the gettext code. (Also note that you don't only need support in
> xgettext, which is part of our build system, but also in the runtime
> library, which we don't control.)
Hmm, I find that comment fairly scary. How do we know that the
runtime library actually gets this right on every supported platform?
It's surely not because we test it, because we do not.
regards, tom lane
On Mon, Mar 17, 2025 at 8:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Peter Eisentraut <peter@eisentraut.org> writes: > > This is not really possible. The <PRIxxx> behavior is baked deeply into > > the gettext code. (Also note that you don't only need support in > > xgettext, which is part of our build system, but also in the runtime > > library, which we don't control.) > > Hmm, I find that comment fairly scary. How do we know that the > runtime library actually gets this right on every supported platform? > It's surely not because we test it, because we do not. I don't know too much about libintl and its history other than what I've looked up for these discussions, but I can't find any other implementations other than Sun's, GNU's and NetBSD's. Sun/Oracle and NetBSD went out of their way to understand these and other GNUisms. I am not sure if they should even be called "extensions"... extensions to what? I guess the historical answer would have been "Sun's version", but see below for a new development which raises philosophical questions. 1. Solaris -- the original implementation has special support for the things GNU's added, and certainly covers this <inttypes.h> stuff: https://docs.oracle.com/cd/E36784_01/html/E39536/gnkbn.html#ILDEVgnosj I just tried it out on a cfarm Solaris box (well I thought I already knew this from an earlier round of discussions about this but wanted to check again before replying and found my old test program still there...). Note the "<PRId64>" in the catalogue: tmunro@s11-sparc:~/gettext-hacking$ uname -a SunOS s11-sparc.cfarm 5.11 11.4.78.189.2 sun4v sparc sun4v logical-domain tmunro@s11-sparc:~/gettext-hacking$ tail -5 locales/fr/LC_MESSAGES/messages.po #: test.c:8 #, c-format msgid "the answer is %<PRId64>\n" msgstr "la réponse est %<PRId64>\n" tmunro@s11-sparc:~/gettext-hacking$ cat test.c #include <inttypes.h> #include <libintl.h> #include <locale.h> #include <stdio.h> #include <stdlib.h> #define GETTEXT_DOMAIN "messages" #define GETTEXT_OUTPUT_DIR "locales" int main() { setenv("LANGUAGE", "fr", 1); setlocale(LC_ALL, "fr_FR.UTF-8"); bindtextdomain(GETTEXT_DOMAIN, GETTEXT_OUTPUT_DIR); textdomain(GETTEXT_DOMAIN); printf(gettext("the answer is %" PRId64 "\n"), (int64_t) 42); } tmunro@s11-sparc:~/gettext-hacking$ gcc test.c tmunro@s11-sparc:~/gettext-hacking$ ./a.out la réponse est 42 You can also see that stuff in the illumos source tree: https://github.com/illumos/illumos-gate/blob/master/usr/src/lib/libc/port/i18n/gettext_gnu.c 2. NetBSD -- I haven't try it myself (I can send my test program if you are interested) but it clearly knows about GNU's system-dependent macros, and its stated goal was to be "based on the specifications from GNU gettext": https://wiki.netbsd.org/projects/project/libintl/ https://github.com/NetBSD/src/blob/trunk/lib/libintl/sysdep.c What aspect of that might not work portably? Are there any other implementations I'm missing? What standard would an implementation follow, if it were to exist? POSIX 2024 also finally standardised gettext() and associated tools. I don't see these macros mentioned there (after an admittedly cursory scan of the relevant sections), or for that matter any mention of the portability problem they solve (perhaps I'll write in about that), but it doesn't seem to make any sense to deprive ourselves of features supported by all known implementations that solve a real problem, just because a standard suddenly appeared retroactively rendering them "extensions" in some sense. I mean, GNU is clearly functioning as a of de facto standard of very long standing, which I think the POSIX discussion[1] acknowledged succinctly in the description field "POSIX defines catgets() but most software rather uses gettext()". I don't think I've ever seen catgets() in several decades around C and Unix. (Amusingly the GNU maintainer showed up to say (paraphrasing) "don't do it", and (paraphrasing) "if you want to solve a problem that we actually have why don't you add all the missing _l function so we can write portable multithreaded programs". Hear hear!) [1] https://www.austingroupbugs.net/view.php?id=1122
On Mon, Mar 17, 2025 at 8:03 PM Peter Eisentraut <peter@eisentraut.org> wrote: > On 15.03.25 03:42, Thomas Munro wrote: > > So we should make pgoff_t a typedef for int64_t everywhere. It's a > > bit annoying that we have to teach everyone to remember to use PRId64 > > to print it, though. > > The ramifications of such a change are not clear to me. I thought > pgoff_t is supposed to be off_t on Unix systems. If we change that, > then how will this affect pointer type arguments, function pointers, > etc. This seems to be a much larger problem than what this thread is > originally about. > > I think we should leave pgoff_t the way it is (mostly?) done now: Cast > to long long int and print using %lld. WFM. > > How reasonable would it be to add an extra > > filter into whatever script is used to run xgettext on our source > > tree? It could replace a very small number of agreed useful tokens to > > match some macros that we would also define in our tree, so that we > > could write PRI_PGOFF_T in our messages, but xgettext would see PRId64 > > and still emit those magic %<PRId64> tokens that GNU/NetBSD/Solaris > > gettext() know how to translate on the fly when loading message > > catalogues. > > This is not really possible. The <PRIxxx> behavior is baked deeply into > the gettext code. (Also note that you don't only need support in > xgettext, which is part of our build system, but also in the runtime > library, which we don't control.) Hmm, but that's why I was asking about filtering the source *before* xgettext sees it, but it sounds like I may still be confused about how that works and I'm very happy to abandon that idea and leave those bits unchanged. Will update the patch shortly to incorporate your other feedback. Thanks!
On Mon, Mar 17, 2025 at 11:52 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> tmunro@s11-sparc:~/gettext-hacking$ gcc test.c
> tmunro@s11-sparc:~/gettext-hacking$ ./a.out
> la réponse est 42
And just to be paranoid, I checked a few more things: the .mo
definitely contains the literal "PRId64" (rearranged as
"^@PRId64^@the answer is %") and it's definitely using gettext() from
libc and not somehow automatically finding a GNU library in some
search path. (And woop, this cfarm Sun box has received the new
preadv()/pwritev() in its libc, that they added for PostgreSQL.)
And since I remembered that I had a NetBSD vagrant VM handy from
investigating Champion's libpq troubles the other day:
[vagrant@netbsd9 gettext-hacking]$ cc test.c -lintl
[vagrant@netbsd9 gettext-hacking]$ ldd a.out
a.out:
-lintl.1 => /usr/lib/libintl.so.1
-lc.12 => /usr/lib/libc.so.12
[vagrant@netbsd9 gettext-hacking]$ ./a.out
la réponse est 42
Not that I had much doubt but I checked that the library is indeed the
NetBSD code and not somehow GNU code, based on clearly identifiable
strings.
Thomas Munro <thomas.munro@gmail.com> writes:
> On Mon, Mar 17, 2025 at 8:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Hmm, I find that comment fairly scary. How do we know that the
>> runtime library actually gets this right on every supported platform?
> I don't know too much about libintl and its history other than what
> I've looked up for these discussions, but I can't find any other
> implementations other than Sun's, GNU's and NetBSD's. Sun/Oracle and
> NetBSD went out of their way to understand these and other GNUisms.
Okay, that reduces the size of the problem considerably.
> 2. NetBSD -- I haven't try it myself (I can send my test program if
> you are interested)
I'd be happy to try it, but I see downthread that you already did,
so that seems unnecessary.
I still wonder if we shouldn't have more than zero testing of our
NLS behavior, but that's just a generalized worry not a concern
over any specific feature.
regards, tom lane
On 02.03.25 22:08, Thomas Munro wrote: > And one more thing like that: in a couple of places we see warnings on > macOS CI that I'd missed: when printing the result of i64abs() as > PRId64, because it happens to use labs() and it happens to define > int64_t as long long, [...]. I > suppose we should cast to int64 in the definition of c.h's i64abs() > macro I have committed a fix for that.
On 10.03.25 10:49, Peter Eisentraut wrote:
> On 02.03.25 22:08, Thomas Munro wrote:
>> On Mon, Mar 3, 2025 at 6:21 AM Peter Eisentraut <peter@eisentraut.org>
>> wrote:
>>> On 05.12.24 23:18, Thomas Munro wrote:
>>>> Old: errmsg("hello %llu", (unsigned long long) x)
>>>> New: errmsg("hello %" PRIu64, x)
>>>
>>> I have committed the subset of this patch for pg_checksums.c so that the
>>> translators and whoever else might be affected can try this out at small
>>> scale. (I don't expect any particular problems.) Then we can move on
>>> to the rest in a few weeks, I think.
>>
>> Good plan, thanks. Here's a rebase.
>
> I think this went ok, and we can proceed here.
>
> I looked through the v2-0001 patch in detail. Most of it is mechanical,
> so no problems. I couple of issues you already mentioned:
I have committed v2-0001, omitting the parts that I had flagged in my
review. I have also committed v2-0002. From my perspective, this can
conclude this thread.
On Sun, Mar 30, 2025 at 6:24 AM Peter Eisentraut <peter@eisentraut.org> wrote: > I have committed v2-0001, omitting the parts that I had flagged in my > review. I have also committed v2-0002. From my perspective, this can > conclude this thread. Thank you! Fingers crossed that the translation updates go as smoothly as predicted.
On Tue, Mar 18, 2025 at 4:48 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Thomas Munro <thomas.munro@gmail.com> writes: > > On Mon, Mar 17, 2025 at 8:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Hmm, I find that comment fairly scary. How do we know that the > >> runtime library actually gets this right on every supported platform? > > > I don't know too much about libintl and its history other than what > > I've looked up for these discussions, but I can't find any other > > implementations other than Sun's, GNU's and NetBSD's. Sun/Oracle and > > NetBSD went out of their way to understand these and other GNUisms. > > Okay, that reduces the size of the problem considerably. For the record, I found one more hiding on Alpine Linux. It has two implementations available: 1. You can install the regular GNU library and tools with "gettext" and "gettext-dev". Then I assume this all just works. 2. There is a cleanroom implementation reachable with "musl-libintl". It supplies a different /usr/include/libintl.h that activates an implementation inside musl itself, and IIRC "gettext-tiny" gives you msgfmt etc. I bet this new 64-bit stuff doesn't work though: they do support %<PRIu64> etc, but the lookup table[1] seems a little on the short side for our usage. gettext-tiny's README.md also explains that they pre-chew them in msgfmt, so GNU/non-GNU combinations probably break once you start using these macros, if they ever worked. It looks like the packages and docker scripts people use to run PostgreSQL on Alpine don't enable nls anyway, so I doubt we'll hear anything about this from the field. It might still be interesting to know which msgfmt/libintl pair the BF animals are using (CC Wolfgang), and whether this stuff actually works. If not, installing "gettext" and "gettext-dev" would likely be the solution, though if the musl version already works for %<PRIu64>, perhaps a small PR could get the other variants to work too? [1] https://github.com/sabotage-linux/gettext-tiny/blob/master/src/poparser.c
Thomas Munro: > It might still be interesting to > know which msgfmt/libintl pair the BF animals are using (CC Wolfgang), > and whether this stuff actually works. It took a while until I was able to enable NLS on the alpine animals [1]. It looks like I used gettext-tiny-dev here. I just tried with regular gettext-dev again in [2], and that doesn't work, yet. I get something like this: /usr/lib/gcc/x86_64-alpine-linux-musl/14.2.0/../../../../x86_64-alpine-linux-musl/bin/ld: ../../src/port/libpgport.a(strerror.o): in function `pg_strerror_r': /mnt/build/HEAD/pgsql.build/../pgsql/src/port/strerror.c:72:(.text+0x260): undefined reference to `libintl_gettext' /usr/lib/gcc/x86_64-alpine-linux-musl/14.2.0/../../../../x86_64-alpine-linux-musl/bin/ld: /mnt/build/HEAD/pgsql.build/../pgsql/src/port/strerror.c:72:(.text+0x2e0): undefined reference to `libintl_gettext' Logs available in [3] as artifacts. So currently the only way I can make this work is with gettext-tiny-dev. Best, Wolfgang [1]: https://github.com/technowledgy/postgresql-buildfarm-alpine/pull/12 [2]: https://github.com/technowledgy/postgresql-buildfarm-alpine/pull/115 [3]: https://github.com/technowledgy/postgresql-buildfarm-alpine/actions/runs/19034584598