Обсуждение: Re: Add comment explaining why queryid is int64 in pg_stat_statements

Поиск

Список

Период

Сортировка

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Ilia Evdokimov

Дата:

16 мая 2025 г., 12:42:41

On 15.05.2025 10:08, Shaik Mohammad Mujeeb wrote:

Hi Developers,

In pg_stat_statements.c, the function pg_stat_statements_internal() declares the queryid variable as int64, but assigns it a value of type uint64. At first glance, this might appear to be an overflow issue. However, I think this is intentional - the queryid is cast to int64 to match the bigint type of the queryid column in the pg_stat_statements view.

Please find the attached patch, which adds a clarifying comment to make the rationale explicit and avoid potential confusion for future readers.

Thanks and Regards,
Shaik Mohammad Mujeeb
Member Technical Staff
Zoho Corp

I don't think the comment is necessary here. There are no arithmetic or logical operations performed on it. It is only passed as a Datum.

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Shaik Mohammad Mujeeb

Дата:

16 мая 2025 г., 13:35:01

Hi Ilia Evdokimov,

While it's true that no arithmetic or logical operations are performed on queryid after the assignment, the overflow technically occurs at the point of assignment itself. For example, entry->key.queryid holds the value 12747288675711951805 as a uint64, but after assigning it to queryid (which is an int64), it becomes -5699455397997599811 due to overflow.

This conversion is intentional - most likely to match the bigint type of the queryid column in pg_stat_statements. However, without an explicit comment, this can be misleading. A beginner reading this might misinterpret it as an unintentional overflow or bug and raise unnecessary concerns. Therefore, it’s worth adding a brief comment clarifying the intent behind this assignment.

Thanks & Regards,
Shaik Mohammad Mujeeb

Member Technical Staff

Zoho Corp

---- On Fri, 16 May 2025 15:12:41 +0530 Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> wrote ---

On 15.05.2025 10:08, Shaik Mohammad Mujeeb wrote:

I don't think the comment is necessary here. There are no arithmetic or logical operations performed on it. It is only passed as a Datum.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
Hi Developers,

In pg_stat_statements.c, the function pg_stat_statements_internal() declares the queryid variable as int64, but assigns it a value of type uint64. At first glance, this might appear to be an overflow issue. However, I think this is intentional - the queryid is cast to int64 to match the bigint type of the queryid column in the pg_stat_statements view.

Please find the attached patch, which adds a clarifying comment to make the rationale explicit and avoid potential confusion for future readers.

Thanks and Regards,
Shaik Mohammad Mujeeb
Member Technical Staff
Zoho Corp

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

wenhui qiu

Дата:

17 мая 2025 г., 06:36:19

Hi Shaik

> While it's true that no arithmetic or logical operations are performed on queryid after the assignment, the overflow technically > occurs at the point of assignment itself. For example, entry->key.queryid holds the value 12747288675711951805 as a

> uint64, but after assigning it to queryid (which is an int64), it becomes -5699455397997599811 due to overflow.

> This conversion is intentional - most likely to match the bigint type of the queryid column in pg_stat_statements. However,

> without an explicit comment, this can be misleading. A beginner reading this might misinterpret it as an unintentional overflow > or bug and raise unnecessary concerns. Therefore, it’s worth adding a brief comment clarifying the intent behind this

> assignment.

+1 agree

On Sat, May 17, 2025 at 1:54 AM Shaik Mohammad Mujeeb <mujeeb.sk@zohocorp.com> wrote:

Hi Ilia Evdokimov,

While it's true that no arithmetic or logical operations are performed on queryid after the assignment, the overflow technically occurs at the point of assignment itself. For example, entry->key.queryid holds the value 12747288675711951805 as a uint64, but after assigning it to queryid (which is an int64), it becomes -5699455397997599811 due to overflow.

This conversion is intentional - most likely to match the bigint type of the queryid column in pg_stat_statements. However, without an explicit comment, this can be misleading. A beginner reading this might misinterpret it as an unintentional overflow or bug and raise unnecessary concerns. Therefore, it’s worth adding a brief comment clarifying the intent behind this assignment.

Thanks & Regards,
Shaik Mohammad Mujeeb
Member Technical Staff
Zoho Corp

---- On Fri, 16 May 2025 15:12:41 +0530 Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> wrote ---

On 15.05.2025 10:08, Shaik Mohammad Mujeeb wrote:

I don't think the comment is necessary here. There are no arithmetic or logical operations performed on it. It is only passed as a Datum.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
Hi Developers,

In pg_stat_statements.c, the function pg_stat_statements_internal() declares the queryid variable as int64, but assigns it a value of type uint64. At first glance, this might appear to be an overflow issue. However, I think this is intentional - the queryid is cast to int64 to match the bigint type of the queryid column in the pg_stat_statements view.

Please find the attached patch, which adds a clarifying comment to make the rationale explicit and avoid potential confusion for future readers.

Thanks and Regards,
Shaik Mohammad Mujeeb
Member Technical Staff
Zoho Corp

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Shaik Mohammad Mujeeb

Дата:

17 мая 2025 г., 21:35:56

Hi Michael Paquier,

> I don't quite see the value in the comment addition you are suggesting

> here: all the user-facing features related to query IDs use signed 64b

> integers, and are documented as such in the official docs. The fact

> that we store an unsigned value in the backend core code is an

> internal implementation artifact, the important point is that we have

> a value stored in 8 bytes.

Originally, queryId used to be of type uint32, which fit into int64 without any overflow. However, after commit cff440d36 (which changed queryId to uint64), overflow started happening when assigning it to an int64. In that commit, queryId was updated to uint64 in most places - except in pg_stat_statements_internal(). Given that this was an intentional choice, I feel a brief comment should have been included in that commit itself to clarify the reasoning.

Although the commit was made back in 2017, this discrepancy still causes confusion. I found myself revisiting and analyzing it simply because there was no explanatory comment. Others might raise the same question again in the future unless this is clarified explicitly.

I also noticed that a similar comment is already present in

ExplainPrintPlan():

* Output the queryid as an int64 rather than a uint64 so we match

* what would be seen in the BIGINT pg_stat_statements.queryid column.

Since such a comment exists there, I believe it would be equally reasonable and helpful to include a similar one in pg_stat_statements_internal() as well.

Of course, if this still seems unnecessary, I’m happy to drop the suggestion - just wanted to put it forward in case it helps reduce ambiguity for future readers or contributors.

Thanks and Regards,

Shaik Mohammad Mujeeb

Member Technical Staff

Zoho Corp

---- On Sat, 17 May 2025 18:19:23 +0530 Michael Paquier <michael@paquier.xyz> wrote ---

On Fri, May 16, 2025 at 04:05:01PM +0530, Shaik Mohammad Mujeeb wrote:
> This conversion is intentional - most likely to match the bigint
> type of the queryid column in pg_stat_statements. However, without
> an explicit comment, this can be misleading. A beginner reading this
> might misinterpret it as an unintentional overflow or bug and raise
> unnecessary concerns. Therefore, it´s worth adding a brief comment
> clarifying the intent behind this assignment.

I don't quite see the value in the comment addition you are suggesting
here: all the user-facing features related to query IDs use signed 64b
integers, and are documented as such in the official docs. The fact
that we store an unsigned value in the backend core code is an
internal implementation artifact, the important point is that we have
a value stored in 8 bytes.
--
Michael

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Junwang Zhao

Дата:

18 мая 2025 г., 15:17:06

On Sun, May 18, 2025 at 3:48 AM Shaik Mohammad Mujeeb <mujeeb.sk@zohocorp.com> wrote:

Hi Michael Paquier,

> I don't quite see the value in the comment addition you are suggesting
> here: all the user-facing features related to query IDs use signed 64b
> integers, and are documented as such in the official docs. The fact
> that we store an unsigned value in the backend core code is an
> internal implementation artifact, the important point is that we have
> a value stored in 8 bytes.

Originally, queryId used to be of type uint32, which fit into int64 without any overflow. However, after commit cff440d36 (which changed queryId to uint64), overflow started happening when assigning it to an int64. In that commit, queryId was updated to uint64 in most places - except in pg_stat_statements_internal(). Given that this was an intentional choice, I feel a brief comment should have been included in that commit itself to clarify the reasoning.

As stated in the commit message, changing the representation of queryId

from uint32 to uint64 significantly reduces the chances of two different

queries generating the same queryId.

Since queryId is not incremented sequentially, the overflow issue you referenced
should not be a concern. Furthermore, converting between int64 and uint64

does not result in any loss of information.

Although the commit was made back in 2017, this discrepancy still causes confusion. I found myself revisiting and analyzing it simply because there was no explanatory comment. Others might raise the same question again in the future unless this is clarified explicitly.

I also noticed that a similar comment is already present in ExplainPrintPlan():
/*
* Output the queryid as an int64 rather than a uint64 so we match
* what would be seen in the BIGINT pg_stat_statements.queryid column.
*/

Since such a comment exists there, I believe it would be equally reasonable and helpful to include a similar one in pg_stat_statements_internal() as well.

Of course, if this still seems unnecessary, I’m happy to drop the suggestion - just wanted to put it forward in case it helps reduce ambiguity for future readers or contributors.

Thanks and Regards,
Shaik Mohammad Mujeeb
Member Technical Staff
Zoho Corp

---- On Sat, 17 May 2025 18:19:23 +0530 Michael Paquier <michael@paquier.xyz> wrote ---

On Fri, May 16, 2025 at 04:05:01PM +0530, Shaik Mohammad Mujeeb wrote:
> This conversion is intentional - most likely to match the bigint
> type of the queryid column in pg_stat_statements. However, without
> an explicit comment, this can be misleading. A beginner reading this
> might misinterpret it as an unintentional overflow or bug and raise
> unnecessary concerns. Therefore, it´s worth adding a brief comment
> clarifying the intent behind this assignment.

I don't quite see the value in the comment addition you are suggesting
here: all the user-facing features related to query IDs use signed 64b
integers, and are documented as such in the official docs. The fact
that we store an unsigned value in the backend core code is an
internal implementation artifact, the important point is that we have
a value stored in 8 bytes.
--
Michael

Regards

Junwang Zhao

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Peter Eisentraut

Дата:

19 мая 2025 г., 18:05:30

On 17.05.25 14:49, Michael Paquier wrote:
> On Fri, May 16, 2025 at 04:05:01PM +0530, Shaik Mohammad Mujeeb wrote:
>> This conversion is intentional - most likely to match the bigint
>> type of the queryid column in pg_stat_statements. However, without
>> an explicit comment, this can be misleading. A beginner reading this
>> might misinterpret it as an unintentional overflow or bug and raise
>> unnecessary concerns. Therefore, it´s worth adding a brief comment
>> clarifying the intent behind this assignment.
> 
> I don't quite see the value in the comment addition you are suggesting
> here: all the user-facing features related to query IDs use signed 64b
> integers, and are documented as such in the official docs.  The fact
> that we store an unsigned value in the backend core code is an
> internal implementation artifact, the important point is that we have
> a value stored in 8 bytes.

There is another, new instance of this confusion.  When query IDs are 
printed to the log, we use a signed-integer format, but the value is 
unsigned (see log_status_format(), write_jsonlog()).  This works 
correctly in practice, but it triggers warnings from 
-Wwarnings-format-signedness (not in the default warnings set, but 
useful to clean up occasionally) and maybe other tools along those 
lines.  This is new because in the past this warning was hidden by 
additional casts.  If we want to keep this arrangement, maybe we should 
create an explicit function to convert query IDs for output, and then 
explain all this there.  Or why not store query IDs as int64_t 
internally, too?

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Sami Imseih

Дата:

19 мая 2025 г., 19:11:37

FWIW, all the hash function SQL interfaces, \df hash*,
have this behavior in which the result is a signed (int/bigint),
but the internal representation of the hash is an unsigned (int/bigint).

I am not sure why a comment is needed specifically for pg_stat_statements


--
Sami Imseih
Amazon Web Services (AWS)

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Michael Paquier

Дата:

20 мая 2025 г., 06:12:36

On Tue, May 20, 2025 at 02:03:37PM +1200, David Rowley wrote:
> Aside from the struct field types changing for Query.queryId,
> PlannedStmt.queryId and PgBackendStatus.st_query_id, the
> external-facing changes are limited to:
>
> 1. pgstat_report_query_id() now returns int64 instead of uint64
> 2. pgstat_get_my_query_id()  now returns int64 instead of uint64
> 3. pgstat_report_query_id()'s first input parameter is now int64
>
> If we were to clean this up, then I expect it's fine to wait until
> v19, as it's not really a problem that's new to v18.

Hmm.  For the query ID, that's not new, but for the plan ID it is.  So
it seems to me that there could be also an argument for doing all that
in v18 rather than releasing v18 with the plan ID being unsigned,
keeping a maximum of consistency for both of IDs.  AFAIK, this is
something that Lukas's plan storing extension exposes as an int64
value to the user and the SQL interfaces, even if it's true that we
don't expose it in core, yet.
--
Michael

Вложения

signature.asc

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Lukas Fittl

Дата:

20 мая 2025 г., 06:43:25

On Mon, May 19, 2025 at 8:12 PM Michael Paquier <michael@paquier.xyz> wrote:

On Tue, May 20, 2025 at 02:03:37PM +1200, David Rowley wrote:
> Aside from the struct field types changing for Query.queryId,
> PlannedStmt.queryId and PgBackendStatus.st_query_id, the
> external-facing changes are limited to:
>
> 1. pgstat_report_query_id() now returns int64 instead of uint64
> 2. pgstat_get_my_query_id() now returns int64 instead of uint64
> 3. pgstat_report_query_id()'s first input parameter is now int64
>
> If we were to clean this up, then I expect it's fine to wait until
> v19, as it's not really a problem that's new to v18.

Hmm. For the query ID, that's not new, but for the plan ID it is. So
it seems to me that there could be also an argument for doing all that
in v18 rather than releasing v18 with the plan ID being unsigned,
keeping a maximum of consistency for both of IDs. AFAIK, this is
something that Lukas's plan storing extension exposes as an int64
value to the user and the SQL interfaces, even if it's true that we
don't expose it in core, yet.

Yeah, +1 to making this consistent across both query ID and the new plan ID, and changing both to int64 internally seems best.

Just as an example, in my current work-in-progress version of a new pg_stat_plans extension the plan ID gets set like this ([0]):

static void
pgsp_calculate_plan_id(PlannedStmt *result)
{
...
result->planId = HashJumbleState(jstate);
...
}

With HashJumbleState currently returning a uint64.

Similarly, when reading out the planID from backends, we do:

values[4] = UInt64GetDatum(beentry->st_plan_id);

if Postgres 18 released as-is, and 19 changed this, we'd have to carry a version-specific cast from uint64 to int64 in both places. Not a big deal, but certainly nice to not knowingly introduce this confusion for extension development.

As an additional data point, the existing pg_stat_monitor extension uses a uint64 for planID [1], and pg_store_plans uses a uint32 [2]. I'd expect both of these to update to a int64 internally with the new planID being available in 18.

Thanks,

Lukas

[0]: https://github.com/pganalyze/pg_stat_plans/blob/main/pg_stat_plans.c#L745

[1]: https://github.com/percona/pg_stat_monitor/blob/9c72c2e73d83a6b687ff8e2ddc5072c63afe983b/pg_stat_monitor.h#L212

[2]: https://github.com/ossc-db/pg_store_plans/blob/master/pg_store_plans.c#L129

Lukas Fittl

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Michael Paquier

Дата:

20 мая 2025 г., 08:09:13

On Mon, May 19, 2025 at 08:43:25PM -0700, Lukas Fittl wrote:
> Yeah, +1 to making this consistent across both query ID and the new plan
> ID, and changing both to int64 internally seems best.

Any thoughts from others?  I'm OK to take the extra step of switching
both fields on HEAD and write a patch for that, relying on what David
has sent as a first step towards that.
--
Michael

Вложения

signature.asc

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

David Rowley

Дата:

20 мая 2025 г., 08:51:51

On Tue, 20 May 2025 at 17:09, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, May 19, 2025 at 08:43:25PM -0700, Lukas Fittl wrote:
> > Yeah, +1 to making this consistent across both query ID and the new plan
> > ID, and changing both to int64 internally seems best.
>
> Any thoughts from others?  I'm OK to take the extra step of switching
> both fields on HEAD and write a patch for that, relying on what David
> has sent as a first step towards that.

Given the planId stuff is new and has the same issue, I think that
pushes me towards thinking now is better than later for fixing both.

I'm happy to adjust my patch unless you've started working on it already.

David

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Julien Rouhaud

Дата:

20 мая 2025 г., 09:12:13

On Tue, May 20, 2025 at 02:09:13PM +0900, Michael Paquier wrote:
> On Mon, May 19, 2025 at 08:43:25PM -0700, Lukas Fittl wrote:
> > Yeah, +1 to making this consistent across both query ID and the new plan
> > ID, and changing both to int64 internally seems best.
>
> Any thoughts from others?  I'm OK to take the extra step of switching
> both fields on HEAD and write a patch for that, relying on what David
> has sent as a first step towards that.

I don't think it was discussed, but why not go the other way, keep everything
as uint64 and expose an uint64 datatype at the SQL level instead?  That's
something I actually want pretty often so I would be happy to have that
natively, whether it's called bigoid, oid8 or anything else.  That's probably
too late for pg18 though.

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

David Rowley

Дата:

20 мая 2025 г., 14:28:17

On Tue, 20 May 2025 at 18:12, Julien Rouhaud <rjuju123@gmail.com> wrote:
> I don't think it was discussed, but why not go the other way, keep everything
> as uint64 and expose an uint64 datatype at the SQL level instead?  That's
> something I actually want pretty often so I would be happy to have that
> natively, whether it's called bigoid, oid8 or anything else.  That's probably
> too late for pg18 though.

Certainly, a bit late, yes. It basically requires implementing
unsigned types on the SQL level. I believe there are a few sunken
ships along that coastline, and plenty of history in the archives if
you want to understand some of the difficulties.

David

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Sami Imseih

Дата:

20 мая 2025 г., 18:35:39

Supporting unsigned INTs will be valuable for more than just this
obviously, and if we do
ever have that capability, we would likely want to go that route. I
know I've been asked about
why queryIds could be negative more than a few times. Until then, I
think the patch being
suggested is reasonable and cleaner.

A few comments on the patches:

1/ this was missed in pg_stat_statements
./contrib/pg_stat_statements/pg_stat_statements.c:    uint64
saved_queryId = pstmt->queryId;

2/ Should "DatumGetInt64(hash_any_extended(......" be turned into a
macro since it's used in
several places? Also, we can add a comment in the macro such as
"
Output the queryId as an int64 rather than a uint64, to match the
BIGINT column used to emit
queryId in system views
"

I think this comment is needed to address the confusion that is the
original subject of this thread. Otherwise,
the confusion will be moved from pg_stat_statements.c to queryjumblefuncs.c

DoJumble(JumbleState *jstate, Node *node)
{
/* Jumble the given node */
@@ -208,9 +208,9 @@ DoJumble(JumbleState *jstate, Node *node)
FlushPendingNulls(jstate);
/* Process the jumble buffer and produce the hash value */
- return DatumGetUInt64(hash_any_extended(jstate->jumble,
- jstate->jumble_len,
- 0));
+ return DatumGetInt64(hash_any_extended(jstate->jumble,
+ jstate->jumble_len,
+ 0));
}
/*
@@ -256,10 +256,10 @@ AppendJumbleInternal(JumbleState *jstate, const
unsigned char *item,
if (unlikely(jumble_len >= JUMBLE_SIZE))
{
- uint64 start_hash;
+ int64 start_hash;
- start_hash = DatumGetUInt64(hash_any_extended(jumble,
- JUMBLE_SIZE, 0));
+ start_hash = DatumGetInt64(hash_any_extended(jumble,
+ JUMBLE_SIZE, 0));
memcpy(jumble, &start_hash, sizeof(start_hash));
jumble_len = sizeof(start_hash);

--
Sami Imseih
Amazon Web Services (AWS)

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Michael Paquier

Дата:

21 мая 2025 г., 03:34:02

On Tue, May 20, 2025 at 11:28:17PM +1200, David Rowley wrote:
> Certainly, a bit late, yes. It basically requires implementing
> unsigned types on the SQL level. I believe there are a few sunken
> ships along that coastline, and plenty of history in the archives if
> you want to understand some of the difficulties.

Providing some more context and some more information than this reply,
the latest thread that I can find on the matter is this one, from
December 2024:
https://www.postgresql.org/message-id/CAN3gO4sPBKbfWYK10i294u3kzsfDb4WX891FMbjLnKjMS08u7A%40mail.gmail.com

It summarizes nicely the situation and it contains some more general
points.

This particular point from Tom about numeric promotion and casting
hierarchy resonates as a very good one:
https://www.postgresql.org/message-id/3180774.1733588677%40sss.pgh.pa.us
My own bet is that if you don't want to lose any existing
functionality, perhaps we should be just more aggressive with casting
requirements on input to begin with even if it means more work when
writing queries, even if it comes at a loss of usability that should
still be something..

FWIW, I've wanted an unsigned in-core type more than once.  It would
be a lot of work, but we have a lot of things that exist in the core
code that map to unsigned 32b or 64b integer values.  Just naming one,
WAL LSN difference calculations.  XLogRecPtr is a 64b unsigned value,
not its representation at SQL level, meaning that we'd overflow after
reaching the threshold of the bit tracking the signedness.  It's true
that a system would need to live long enough to reach such LSNs, but
we have also 32b integers plugged here and there.  Another one is
block numbers, or OID which is an in-core type.  Having an in-core
unsigned integer type would scale better that creating types mapping
to every single internal core concept.
--
Michael

Вложения

signature.asc

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Michael Paquier

Дата:

21 мая 2025 г., 07:43:11

On Tue, May 20, 2025 at 10:35:39AM -0500, Sami Imseih wrote:
> 1/ this was missed in pg_stat_statements
> ./contrib/pg_stat_statements/pg_stat_statements.c:    uint64
> saved_queryId = pstmt->queryId;

Right.  Some greps based on "queryid" and "query_id" point that this
is the last reference to uint in the tree.

> 2/ Should "DatumGetInt64(hash_any_extended(......" be turned into a
> macro since it's used in
> several places? Also, we can add a comment in the macro such as
> "
> Output the queryId as an int64 rather than a uint64, to match the
> BIGINT column used to emit
> queryId in system views
> "

We are talking about two places in queryjumblefuncs.c.  Leaving the
code as in David's original proposal is fine IMO.  Adding a comment
about the implied uint64 -> int64 conversion when calling
hash_any_extended() sounds like a good idea to document what we want
from the hash.

I've had a closer look at David's patch, and I did not spot other
places that required similar adjustments.  I may have missed
something, of course.

David, how would you like to move on with this patch set?  I can take
responsibility for both patches as the plan ID change can qualify as
an open item.
--
Michael

Вложения

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Julien Rouhaud

Дата:

21 мая 2025 г., 15:19:56

On Wed, May 21, 2025 at 09:34:02AM +0900, Michael Paquier wrote:
> On Tue, May 20, 2025 at 11:28:17PM +1200, David Rowley wrote:
> > Certainly, a bit late, yes. It basically requires implementing
> > unsigned types on the SQL level. I believe there are a few sunken
> > ships along that coastline, and plenty of history in the archives if
> > you want to understand some of the difficulties.
>
> Providing some more context and some more information than this reply,
> the latest thread that I can find on the matter is this one, from
> December 2024:
> https://www.postgresql.org/message-id/CAN3gO4sPBKbfWYK10i294u3kzsfDb4WX891FMbjLnKjMS08u7A%40mail.gmail.com
>
> It summarizes nicely the situation and it contains some more general
> points.
>
> This particular point from Tom about numeric promotion and casting
> hierarchy resonates as a very good one:
> https://www.postgresql.org/message-id/3180774.1733588677%40sss.pgh.pa.us
> My own bet is that if you don't want to lose any existing
> functionality, perhaps we should be just more aggressive with casting
> requirements on input to begin with even if it means more work when
> writing queries, even if it comes at a loss of usability that should
> still be something..

Thanks a lot Michael!  I actually read that thread at that time but forgot
about it until you sent this link.

> FWIW, I've wanted an unsigned in-core type more than once.  It would
> be a lot of work, but we have a lot of things that exist in the core
> code that map to unsigned 32b or 64b integer values.  Just naming one,
> WAL LSN difference calculations.  XLogRecPtr is a 64b unsigned value,
> not its representation at SQL level, meaning that we'd overflow after
> reaching the threshold of the bit tracking the signedness.  It's true
> that a system would need to live long enough to reach such LSNs, but
> we have also 32b integers plugged here and there.  Another one is
> block numbers, or OID which is an in-core type.  Having an in-core
> unsigned integer type would scale better that creating types mapping
> to every single internal core concept.

Agreed.

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Peter Eisentraut

Дата:

21 мая 2025 г., 17:58:09

On 20.05.25 08:38, Michael Paquier wrote:
> On Tue, May 20, 2025 at 05:51:51PM +1200, David Rowley wrote:
>> Given the planId stuff is new and has the same issue, I think that
>> pushes me towards thinking now is better than later for fixing both.
>>
>> I'm happy to adjust my patch unless you've started working on it already.
> 
> Here you go with the attached, to-be-applied on top of your own patch.

Whichever way we're going, surely this whole thing could benefit from a

     typedef something QueryId;

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

David Rowley

Дата:

22 мая 2025 г., 05:36:38

On Thu, 22 May 2025 at 02:58, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 20.05.25 08:38, Michael Paquier wrote:
> > On Tue, May 20, 2025 at 05:51:51PM +1200, David Rowley wrote:
> >> Given the planId stuff is new and has the same issue, I think that
> >> pushes me towards thinking now is better than later for fixing both.
> >>
> >> I'm happy to adjust my patch unless you've started working on it already.
> >
> > Here you go with the attached, to-be-applied on top of your own patch.
>
> Whichever way we're going, surely this whole thing could benefit from a
>
>      typedef something QueryId;

This part I'm not so sure about. It's not as if it can be changed to
some other type with a simple definition of the typedef.  Looking at
the patch I proposed there are quite a few things that would still
need to be updated after the typedef is changed. PG_GETARG_INT64(),
INT64CONST() and DatumGetInt64() are part of that. There's also the
return type of hash_any_extended().  Changing the typedef definition
might also need an adjustment in gen_node_support.pl, depending on
what you're changing it to.

You could argue that if it reduces the locations that need to be
changed by using a typedef, then it's a win. But there are also
negative aspects to typedefs that need to be considered. For me, those
are the added level of indirection of code reading to actually who
what type I'm working with. I personally dislike typedefs like
"typedef PageHeaderData *PageHeader;" as they hide the fact I'm
working with a pointer.

I'm not outright objecting to the typedef for this. It's just I don't
see it as a clear-cut improvement for this case.

David

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Michael Paquier

Дата:

22 мая 2025 г., 07:01:14

On Thu, May 22, 2025 at 02:36:38PM +1200, David Rowley wrote:
> You could argue that if it reduces the locations that need to be
> changed by using a typedef, then it's a win. But there are also
> negative aspects to typedefs that need to be considered. For me, those
> are the added level of indirection of code reading to actually who
> what type I'm working with. I personally dislike typedefs like
> "typedef PageHeaderData *PageHeader;" as they hide the fact I'm
> working with a pointer.
>
> I'm not outright objecting to the typedef for this. It's just I don't
> see it as a clear-cut improvement for this case.

Same opinion here.  I am not quite clear what there is to gain in
hiding the query ID behind a typedef, or even apply that to the plan
ID.

I have added an open item about the plan ID part as it applies to v18,
adding the RMT in CC to get an opinion.  If we cannot get a consensus
on all that, letting things as they are is still logically correct,
even with the -Wwarnings-format-signedness argument which is not
included by default currently.

Has somebody an opinion to offer?
--
Michael

Вложения

signature.asc

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Nathan Bossart

Дата:

29 мая 2025 г., 17:28:35

On Thu, May 29, 2025 at 01:53:07PM +0900, Michael Paquier wrote:
> On Thu, May 22, 2025 at 01:01:14PM +0900, Michael Paquier wrote:
>> I have added an open item about the plan ID part as it applies to v18,
>> adding the RMT in CC to get an opinion.  If we cannot get a consensus
>> on all that, letting things as they are is still logically correct,
>> even with the -Wwarnings-format-signedness argument which is not
>> included by default currently.
>> 
>> Has somebody an opinion to offer?
> 
> It has been one week since this last update, and there has been
> nothing except the sound of cicadas.  IMO, I think that we should just
> pull the switch and make both of these IDs signed on HEAD, taking case
> of the potential signedness warning issues.
> 
> Now, I don't really want to take a leap of faith without the RMT being
> OK with that now that we are in beta1.

After reading through this thread and the latest patch set, I don't see any
strong reason for the RMT to object to this change for v18.  IIUC some
extensions may need to adapt, but we're still a few months from 18.0, so
that seems okay.  I vaguely recall that we've made other small
extension-breaking changes during the beta period for previous major
releases.

-- 
nathan

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Michael Paquier

Дата:

30 мая 2025 г., 02:34:49

On Thu, May 29, 2025 at 09:28:35AM -0500, Nathan Bossart wrote:
> On Thu, May 29, 2025 at 01:53:07PM +0900, Michael Paquier wrote:
>> Now, I don't really want to take a leap of faith without the RMT being
>> OK with that now that we are in beta1.
>
> After reading through this thread and the latest patch set, I don't see any
> strong reason for the RMT to object to this change for v18.  IIUC some
> extensions may need to adapt, but we're still a few months from 18.0, so
> that seems okay.  I vaguely recall that we've made other small
> extension-breaking changes during the beta period for previous major
> releases.

Thanks, Nathan.  Let's proceed with the change then.  David, would you
prefer handling the patch you have written by yourself for the query
ID part?
--
Michael

Вложения

signature.asc

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

David Rowley

Дата:

30 мая 2025 г., 02:51:02

On Fri, 30 May 2025 at 11:35, Michael Paquier <michael@paquier.xyz> wrote:
> Thanks, Nathan.  Let's proceed with the change then.  David, would you
> prefer handling the patch you have written by yourself for the query
> ID part?

Yes. I'll look at that again shortly.

David

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

David Rowley

Дата:

30 мая 2025 г., 14:08:58

Thanks for the review.

On Wed, 21 May 2025 at 03:35, Sami Imseih <samimseih@gmail.com> wrote:
> 2/ Should "DatumGetInt64(hash_any_extended(......" be turned into a
> macro since it's used in
> several places? Also, we can add a comment in the macro such as
> "
> Output the queryId as an int64 rather than a uint64, to match the
> BIGINT column used to emit
> queryId in system views
> "
>
> I think this comment is needed to address the confusion that is the
> original subject of this thread. Otherwise,
> the confusion will be moved from pg_stat_statements.c to queryjumblefuncs.c

I ended up adding a comment in the Query struct detailing why queryId
is signed. I imagine, as time passes, we might forget why we did that
as hashed values are more naturally unsigned. I think it's wrong to
add comments in DoJumble() to mention details about what the calling
function does. I think the fact that DoJumble() now returns int64
should be a good enough explanation as to why we want to get the hash
value in signed form.

David

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

David Rowley

Дата:

30 мая 2025 г., 14:09:23

On Fri, 30 May 2025 at 11:51, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Fri, 30 May 2025 at 11:35, Michael Paquier <michael@paquier.xyz> wrote:
> > Thanks, Nathan.  Let's proceed with the change then.  David, would you
> > prefer handling the patch you have written by yourself for the query
> > ID part?
>
> Yes. I'll look at that again shortly.

Now done.

David

Re: Add comment explaining why queryid is int64 in pg_stat_statements

От

Shaik Mohammad Mujeeb

Дата:

30 мая 2025 г., 14:34:10

Thanks for making the appropriate changes, David.

Now everything make sense. :)

---- On Fri, 30 May 2025 16:39:23 +0530 David Rowley <dgrowleyml@gmail.com> wrote ---

On Fri, 30 May 2025 at 11:51, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Fri, 30 May 2025 at 11:35, Michael Paquier <michael@paquier.xyz> wrote:
> > Thanks, Nathan. Let's proceed with the change then. David, would you
> > prefer handling the patch you have written by yourself for the query
> > ID part?
>
> Yes. I'll look at that again shortly.

Now done.

David

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Re: Add comment explaining why queryid is int64 in pg_stat_statements

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения