Re: Inconsistent printf placeholders
From | Kyotaro Horiguchi |
---|---|
Subject | Re: Inconsistent printf placeholders |
Date | |
Msg-id | 20240321.171632.804394642968687545.horikyota.ntt@gmail.com |
In reply to | Re: Inconsistent printf placeholders (Peter Eisentraut <peter@eisentraut.org>) |
List | pgsql-hackers |
Thank you for looking at this.

At Tue, 19 Mar 2024 10:50:23 +0100, Peter Eisentraut <peter@eisentraut.org> wrote in
> On 15.03.24 08:20, Kyotaro Horiguchi wrote:
> > diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
> > @@ -1369,8 +1369,8 @@ ReadTwoPhaseFile(TransactionId xid, bool missing_ok)
> >                       errmsg("could not read file \"%s\": %m", path)));
> >          else
> >              ereport(ERROR,
> > -                    (errmsg("could not read file \"%s\": read %d of %lld",
> > -                            path, r, (long long int) stat.st_size)));
> > +                    (errmsg("could not read file \"%s\": read %zd of %zu",
> > +                            path, r, (Size) stat.st_size)));
> >      }
> >      pgstat_report_wait_end();
>
> This might be worse, because stat.st_size is of type off_t, which
> could be smaller than Size/size_t.

I think you meant that off_t could be wider than size_t, and you're
right on that point. I thought it was safe, since here we are trying to
read the whole content of the file at once into palloc'ed memory.
However, on second thought, if st_size is out of the range of ssize_t
and palloc accepts that size, then at least on Linux read(2) reads only
0x7ffff000 bytes and reaches the error reporting. In addition, this
size was closer to the memory allocation size limit than I had thought.
As a result, I removed the change. However, I kept the change of the
type of variable "r" and the corresponding placeholder %zd.
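
The type matching described above can be sketched as follows. This is a minimal illustration, not code from the patch: the function name format_short_read is hypothetical, and it uses plain snprintf rather than ereport/errmsg. The byte count returned by read(2) is an ssize_t and matches %zd directly, while off_t has no printf length modifier of its own and is widened to long long for %lld.

```c
#include <stdio.h>
#include <sys/types.h>          /* off_t, ssize_t */

/*
 * Hypothetical helper, for illustration only.
 *
 * nread is what read(2) returned, so it is an ssize_t and pairs with %zd.
 * expected is a file size (off_t), which may be wider than size_t, so it
 * is cast to long long and printed with %lld instead of %zu.
 */
int
format_short_read(char *buf, size_t buflen, const char *path,
                  ssize_t nread, off_t expected)
{
    return snprintf(buf, buflen,
                    "could not read file \"%s\": read %zd of %lld",
                    path, nread, (long long) expected);
}
```

Casting the off_t value up to long long cannot truncate on platforms where off_t is at most 64 bits wide, which is the assumption made here.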

> > diff --git a/src/backend/libpq/be-secure-gssapi.c b/src/backend/libpq/be-secure-gssapi.c
> > index bc04e78abb..68645b4519 100644
> > --- a/src/backend/libpq/be-secure-gssapi.c
> > +++ b/src/backend/libpq/be-secure-gssapi.c
> > @@ -572,9 +572,9 @@ secure_open_gssapi(Port *port)
> >          if (input.length > PQ_GSS_RECV_BUFFER_SIZE)
> >          {
> >              ereport(COMMERROR,
> > -                    (errmsg("oversize GSSAPI packet sent by the client (%zu > %d)",
> > +                    (errmsg("oversize GSSAPI packet sent by the client (%zu > %zu)",
> >                              (size_t) input.length,
> > -                            PQ_GSS_RECV_BUFFER_SIZE)));
> > +                            (size_t) PQ_GSS_RECV_BUFFER_SIZE)));
> >              return -1;
> >          }
>
> Might be better to add that cast to the definition of
> PQ_GSS_RECV_BUFFER_SIZE instead, so that all code can benefit.

As far as I can see, the only exceptional uses were a comparison with
int values and being passed as an OM_uint32 (to one of the parameters
of gss_wrap_size_limit()). Therefore, I agree that it is beneficial.

By the way, we currently define Size as the same as size_t (since
1998). Is it correct to consider Size as existing merely for backward
compatibility, so that we should use size_t for new code? I used size_t
in the modified part of the attached patch.

> > diff --git a/src/backend/replication/repl_gram.y b/src/backend/replication/repl_gram.y
> > index 7474f5bd67..baa76280b9 100644
> > --- a/src/backend/replication/repl_gram.y
> > +++ b/src/backend/replication/repl_gram.y
> > @@ -312,11 +312,6 @@ timeline_history:
> >                  {
> >                      TimeLineHistoryCmd *cmd;
> > -                    if ($2 <= 0)
> > -                        ereport(ERROR,
> > -                                (errcode(ERRCODE_SYNTAX_ERROR),
> > -                                 errmsg("invalid timeline %u", $2)));
> > - ...
> I don't think this is correct.  It loses the check for == 0.

Ugh. It's my mistake. So we need to consider unifying the messages
again. In walsummaryfuncs.c, %lld is required, but it's silly for the
uses in repl_gram.y. In the end, I chose not to change anything here.
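
Why dropping that check matters can be shown with a minimal sketch. It assumes the timeline ID is an unsigned 32-bit value, as the %u placeholder suggests; the function name timeline_id_is_valid is hypothetical and not part of the patch or of repl_gram.y.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * For an unsigned value, "tli <= 0" can only be true when tli == 0, so
 * the comparison is in effect the zero check. Deleting it would let
 * timeline 0 through unnoticed.
 */
bool
timeline_id_is_valid(uint32_t tli)
{
    return !(tli <= 0);         /* same as tli != 0 for unsigned types */
}
```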

> > diff --git a/src/backend/tsearch/to_tsany.c b/src/backend/tsearch/to_tsany.c
> > index 88cba58cba..9d21178107 100644
> > --- a/src/backend/tsearch/to_tsany.c
> > +++ b/src/backend/tsearch/to_tsany.c
> > @@ -191,7 +191,8 @@ make_tsvector(ParsedText *prs)
> >      if (lenstr > MAXSTRPOS)
> >          ereport(ERROR,
> >                  (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
> > -                 errmsg("string is too long for tsvector (%d bytes, max %d bytes)", lenstr, MAXSTRPOS)));
> > +                 /* cast values to avoid extra translatable messages */
> > +                 errmsg("string is too long for tsvector (%ld bytes, max %ld bytes)", (long) lenstr, (long) MAXSTRPOS)));
> >      totallen = CALCDATASIZE(prs->curwords, lenstr);
> >      in = (TSVector) palloc0(totallen);
>
> I think it would be better instead to change the message in
> tsvectorin() to *not* use long.  The size of long is unportable, so I
> would rather avoid using it at all.

The casts to long were tentative, added only to match the corresponding
placeholder, and in this context portability concerns do not apply.
However, those casts are apparently useless. As you suggested, I tried
to change tsvectorin() instead, but there's a problem here.

tsvector.c:224
> errmsg("string is too long for tsvector (%ld bytes, max %ld bytes)",
>        (long) (cur - tmpbuf), (long) MAXSTRPOS)));

cur and tmpbuf are pointers. The byte width of the subtraction result
varies by architecture. However, the surrounding code apparently
assumes that the difference fits within an int. I added a cast to int
for the pointer arithmetic here. (Although I'm not sure this is the
right direction.)
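
The pointer-difference issue can be sketched as below. This is an illustration under stated assumptions, not the attached patch: the function name format_too_long is hypothetical, plain snprintf stands in for errmsg, and the caller is assumed to guarantee that the difference fits in an int, as the surrounding code in tsvector.c appears to.

```c
#include <stddef.h>             /* ptrdiff_t */
#include <stdio.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Subtracting two pointers yields a ptrdiff_t, whose width depends on
 * the platform. Casting it to int lets the message keep the %d
 * placeholder, but is only safe when the difference is known to fit.
 */
int
format_too_long(char *buf, size_t buflen,
                const char *start, const char *cur, int max_bytes)
{
    ptrdiff_t   len = cur - start;

    return snprintf(buf, buflen,
                    "string is too long for tsvector (%d bytes, max %d bytes)",
                    (int) len, max_bytes);
}
```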

> > diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
> > index 8d28dd42ce..5de490b569 100644
> > --- a/src/backend/utils/adt/varlena.c
> > +++ b/src/backend/utils/adt/varlena.c
> > @@ -3217,8 +3217,9 @@ byteaGetByte(PG_FUNCTION_ARGS)
> >      if (n < 0 || n >= len)
> >          ereport(ERROR,
> >                  (errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
> > -                 errmsg("index %d out of valid range, 0..%d",
> > -                        n, len - 1)));
> > +                 /* cast values to avoid extra translatable messages */
> > +                 errmsg("index %lld out of valid range, 0..%lld",
> > +                        (long long)n, (long long) len - 1)));
> >      byte = ((unsigned char *) VARDATA_ANY(v))[n];
>
> I think this is taking it too far.  We shouldn't try to make all
> similar messages use the same placeholders.  If the underlying types
> are different, we should use them.  Adding more casts makes the code
> less robust overall.  The size_t/ssize_t cleanup is different, because
> there the types were arguably wrong to begin with, and by using the
> right types we move toward more consistency.

Ouch! Understood. They treat byte and bit locations accordingly. I
agree that it goes too far. Removed.

> > diff --git a/src/bin/pg_combinebackup/pg_combinebackup.c b/src/bin/pg_combinebackup/pg_combinebackup.c
> > index 6f0814d9ac..feb4d5dcf4 100644
> > --- a/src/bin/pg_combinebackup/pg_combinebackup.c
> > +++ b/src/bin/pg_combinebackup/pg_combinebackup.c
> > -            pg_fatal("could not read file \"%s\": read only %zd of %lld bytes",
> > -                     filename, rb, (long long int) st.st_size);
> > +            /* cast st_size to avoid extra translatable messages */
> > +            pg_fatal("could not read file \"%s\": read only %zd of %zu bytes",
> > +                     filename, rb, (size_t) st.st_size);
> >      }
> >      /* Adjust buffer length for new data and restore trailing-\0 invariant */
>
> Similar to above, casting off_t to size_t is dubious.

The same discussion as for the change in twophase.c also applies to
this change. I applied the same amendment.

> > diff --git a/src/port/user.c b/src/port/user.c
> > index 7444aeb64b..9364bdb69e 100644
> > --- a/src/port/user.c
> > +++ b/src/port/user.c
> > @@ -40,8 +40,8 @@ pg_get_user_name(uid_t user_id, char *buffer, size_t buflen)
> >      }
> >      if (pwerr != 0)
> >          snprintf(buffer, buflen,
> > -                 _("could not look up local user ID %d: %s"),
> > -                 (int) user_id,
> > +                 _("could not look up local user ID %ld: %s"),
> > +                 (long) user_id,
> >                   strerror_r(pwerr, pwdbuf, sizeof(pwdbuf)));
> >      else
> >          snprintf(buffer, buflen,
>
> Also dubious use of "long" here.

Okay, used %d instead. In addition to that, I removed the casts from
uid_t, expecting that compilers will detect any change in the
definition of uid_t.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center