Обсуждение: Re: Add comment explaining why queryid is int64 in pg_stat_statements
On 15.05.2025 10:08, Shaik Mohammad Mujeeb wrote:
Hi Developers,
In pg_stat_statements.c, the function pg_stat_statements_internal() declares the queryid variable as int64, but assigns it a value of type uint64. At first glance, this might appear to be an overflow issue. However, I think this is intentional - the queryid is cast to int64 to match the bigint type of the queryid column in the pg_stat_statements view.
Please find the attached patch, which adds a clarifying comment to make the rationale explicit and avoid potential confusion for future readers.Thanks and Regards,
Shaik Mohammad Mujeeb
Member Technical Staff
Zoho Corp
I don't think the comment is necessary here. There are no arithmetic or logical operations performed on it. It is only passed as a Datum.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
Re: Add comment explaining why queryid is int64 in pg_stat_statements
От
Shaik Mohammad Mujeeb
Дата:
Hi Ilia Evdokimov,
While it's true that no arithmetic or logical operations are performed on
This conversion is intentional - most likely to match the bigint type of the queryid column in pg_stat_statements. However, without an explicit comment, this can be misleading. A beginner reading this might misinterpret it as an unintentional overflow or bug and raise unnecessary concerns. Therefore, it’s worth adding a brief comment clarifying the intent behind this assignment.
Thanks & Regards,
Shaik Mohammad Mujeeb
While it's true that no arithmetic or logical operations are performed on
queryid
after the assignment, the overflow technically occurs at the point of assignment itself. For example, entry->key.queryid
holds the value 12747288675711951805
as a uint64
, but after assigning it to queryid
(which is an int64
), it becomes -5699455397997599811
due to overflow.This conversion is intentional - most likely to match the bigint type of the queryid column in pg_stat_statements. However, without an explicit comment, this can be misleading. A beginner reading this might misinterpret it as an unintentional overflow or bug and raise unnecessary concerns. Therefore, it’s worth adding a brief comment clarifying the intent behind this assignment.
Thanks & Regards,
Shaik Mohammad Mujeeb
Member Technical Staff
Zoho Corp
---- On Fri, 16 May 2025 15:12:41 +0530 Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> wrote ---
On 15.05.2025 10:08, Shaik Mohammad Mujeeb wrote:
I don't think the comment is necessary here. There are no arithmetic or logical operations performed on it. It is only passed as a Datum.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.Hi Developers,
In pg_stat_statements.c, the function pg_stat_statements_internal() declares the queryid variable as int64, but assigns it a value of type uint64. At first glance, this might appear to be an overflow issue. However, I think this is intentional - the queryid is cast to int64 to match the bigint type of the queryid column in the pg_stat_statements view.
Please find the attached patch, which adds a clarifying comment to make the rationale explicit and avoid potential confusion for future readers.Thanks and Regards,
Shaik Mohammad Mujeeb
Member Technical Staff
Zoho Corp
Hi Shaik
> While it's true that no arithmetic or logical operations are performed on
queryid
after the assignment, the overflow technically > occurs at the point of assignment itself. For example, entry->key.queryid
holds the value 12747288675711951805
as a>
> This conversion is intentional - most likely to match the bigint type of the queryid column in pg_stat_statements. However,
uint64
, but after assigning it to queryid
(which is an int64
), it becomes -5699455397997599811
due to overflow.> This conversion is intentional - most likely to match the bigint type of the queryid column in pg_stat_statements. However,
> without an explicit comment, this can be misleading. A beginner reading this might misinterpret it as an unintentional overflow > or bug and raise unnecessary concerns. Therefore, it’s worth adding a brief comment clarifying the intent behind this
> assignment.
+1 agree
On Sat, May 17, 2025 at 1:54 AM Shaik Mohammad Mujeeb <mujeeb.sk@zohocorp.com> wrote:
Hi Ilia Evdokimov,
While it's true that no arithmetic or logical operations are performed onqueryid
after the assignment, the overflow technically occurs at the point of assignment itself. For example,entry->key.queryid
holds the value12747288675711951805
as auint64
, but after assigning it toqueryid
(which is anint64
), it becomes-5699455397997599811
due to overflow.
This conversion is intentional - most likely to match the bigint type of the queryid column in pg_stat_statements. However, without an explicit comment, this can be misleading. A beginner reading this might misinterpret it as an unintentional overflow or bug and raise unnecessary concerns. Therefore, it’s worth adding a brief comment clarifying the intent behind this assignment.
Thanks & Regards,
Shaik Mohammad MujeebMember Technical StaffZoho Corp---- On Fri, 16 May 2025 15:12:41 +0530 Ilia Evdokimov <ilya.evdokimov@tantorlabs.com> wrote ---
On 15.05.2025 10:08, Shaik Mohammad Mujeeb wrote:
I don't think the comment is necessary here. There are no arithmetic or logical operations performed on it. It is only passed as a Datum.
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.Hi Developers,
In pg_stat_statements.c, the function pg_stat_statements_internal() declares the queryid variable as int64, but assigns it a value of type uint64. At first glance, this might appear to be an overflow issue. However, I think this is intentional - the queryid is cast to int64 to match the bigint type of the queryid column in the pg_stat_statements view.
Please find the attached patch, which adds a clarifying comment to make the rationale explicit and avoid potential confusion for future readers.Thanks and Regards,
Shaik Mohammad Mujeeb
Member Technical Staff
Zoho Corp
Re: Add comment explaining why queryid is int64 in pg_stat_statements
От
Shaik Mohammad Mujeeb
Дата:
Hi Michael Paquier,
> I don't quite see the value in the comment addition you are suggesting
> I don't quite see the value in the comment addition you are suggesting
> here: all the user-facing features related to query IDs use signed 64b
> integers, and are documented as such in the official docs. The fact
> that we store an unsigned value in the backend core code is an
> internal implementation artifact, the important point is that we have
> a value stored in 8 bytes.
Originally, queryId used to be of type uint32, which fit into int64 without any overflow. However, after commit cff440d36 (which changed queryId to uint64), overflow started happening when assigning it to an int64. In that commit, queryId was updated to uint64 in most places - except in pg_stat_statements_internal(). Given that this was an intentional choice, I feel a brief comment should have been included in that commit itself to clarify the reasoning.
Although the commit was made back in 2017, this discrepancy still causes confusion. I found myself revisiting and analyzing it simply because there was no explanatory comment. Others might raise the same question again in the future unless this is clarified explicitly.
I also noticed that a similar comment is already present in
ExplainPrintPlan():
/* * Output the queryid as an int64 rather than a uint64 so we match * what would be seen in the BIGINT pg_stat_statements.queryid column. */ |
Since such a comment exists there, I believe it would be equally reasonable and helpful to include a similar one in
pg_stat_statements_internal()
as well.Of course, if this still seems unnecessary, I’m happy to drop the suggestion - just wanted to put it forward in case it helps reduce ambiguity for future readers or contributors.
Thanks and Regards,
Shaik Mohammad Mujeeb
Member Technical Staff
Zoho Corp
---- On Sat, 17 May 2025 18:19:23 +0530 Michael Paquier <michael@paquier.xyz> wrote ---
On Fri, May 16, 2025 at 04:05:01PM +0530, Shaik Mohammad Mujeeb wrote:
> This conversion is intentional - most likely to match the bigint
> type of the queryid column in pg_stat_statements. However, without
> an explicit comment, this can be misleading. A beginner reading this
> might misinterpret it as an unintentional overflow or bug and raise
> unnecessary concerns. Therefore, it´s worth adding a brief comment
> clarifying the intent behind this assignment.
I don't quite see the value in the comment addition you are suggesting
here: all the user-facing features related to query IDs use signed 64b
integers, and are documented as such in the official docs. The fact
that we store an unsigned value in the backend core code is an
internal implementation artifact, the important point is that we have
a value stored in 8 bytes.
--
Michael
On Sun, May 18, 2025 at 3:48 AM Shaik Mohammad Mujeeb <mujeeb.sk@zohocorp.com> wrote:
Hi Michael Paquier,
> I don't quite see the value in the comment addition you are suggesting> here: all the user-facing features related to query IDs use signed 64b> integers, and are documented as such in the official docs. The fact> that we store an unsigned value in the backend core code is an> internal implementation artifact, the important point is that we have> a value stored in 8 bytes.Originally, queryId used to be of type uint32, which fit into int64 without any overflow. However, after commit cff440d36 (which changed queryId to uint64), overflow started happening when assigning it to an int64. In that commit, queryId was updated to uint64 in most places - except in pg_stat_statements_internal(). Given that this was an intentional choice, I feel a brief comment should have been included in that commit itself to clarify the reasoning.
from uint32 to uint64 significantly reduces the chances of two different
queries generating the same queryId.
Since queryId is not incremented sequentially, the overflow issue you referenced
should not be a concern. Furthermore, converting between int64 and uint64
should not be a concern. Furthermore, converting between int64 and uint64
does not result in any loss of information.
Although the commit was made back in 2017, this discrepancy still causes confusion. I found myself revisiting and analyzing it simply because there was no explanatory comment. Others might raise the same question again in the future unless this is clarified explicitly.
I also noticed that a similar comment is already present inExplainPrintPlan():
/** Output the queryid as an int64 rather than a uint64 so we match* what would be seen in the BIGINT pg_stat_statements.queryid column.*/
Since such a comment exists there, I believe it would be equally reasonable and helpful to include a similar one inpg_stat_statements_internal()
as well.
Of course, if this still seems unnecessary, I’m happy to drop the suggestion - just wanted to put it forward in case it helps reduce ambiguity for future readers or contributors.
Thanks and Regards,Shaik Mohammad MujeebMember Technical StaffZoho Corp---- On Sat, 17 May 2025 18:19:23 +0530 Michael Paquier <michael@paquier.xyz> wrote ---On Fri, May 16, 2025 at 04:05:01PM +0530, Shaik Mohammad Mujeeb wrote:
> This conversion is intentional - most likely to match the bigint
> type of the queryid column in pg_stat_statements. However, without
> an explicit comment, this can be misleading. A beginner reading this
> might misinterpret it as an unintentional overflow or bug and raise
> unnecessary concerns. Therefore, it´s worth adding a brief comment
> clarifying the intent behind this assignment.
I don't quite see the value in the comment addition you are suggesting
here: all the user-facing features related to query IDs use signed 64b
integers, and are documented as such in the official docs. The fact
that we store an unsigned value in the backend core code is an
internal implementation artifact, the important point is that we have
a value stored in 8 bytes.
--
Michael
Regards
Junwang Zhao
On 17.05.25 14:49, Michael Paquier wrote: > On Fri, May 16, 2025 at 04:05:01PM +0530, Shaik Mohammad Mujeeb wrote: >> This conversion is intentional - most likely to match the bigint >> type of the queryid column in pg_stat_statements. However, without >> an explicit comment, this can be misleading. A beginner reading this >> might misinterpret it as an unintentional overflow or bug and raise >> unnecessary concerns. Therefore, it´s worth adding a brief comment >> clarifying the intent behind this assignment. > > I don't quite see the value in the comment addition you are suggesting > here: all the user-facing features related to query IDs use signed 64b > integers, and are documented as such in the official docs. The fact > that we store an unsigned value in the backend core code is an > internal implementation artifact, the important point is that we have > a value stored in 8 bytes. There is another, new instance of this confusion. When query IDs are printed to the log, we use a signed-integer format, but the value is unsigned (see log_status_format(), write_jsonlog()). This works correctly in practice, but it triggers warnings from -Wwarnings-format-signedness (not in the default warnings set, but useful to clean up occasionally) and maybe other tools along those lines. This is new because in the past this warning was hidden by additional casts. If we want to keep this arrangement, maybe we should create an explicit function to convert query IDs for output, and then explain all this there. Or why not store query IDs as int64_t internally, too?
FWIW, all the hash function SQL interfaces, \df hash*, have this behavior in which the result is a signed (int/bigint), but the internal representation of the hash is an unsigned (int/bigint). I am not sure why a comment is needed specifically for pg_stat_statements -- Sami Imseih Amazon Web Services (AWS)
On Tue, May 20, 2025 at 02:03:37PM +1200, David Rowley wrote: > Aside from the struct field types changing for Query.queryId, > PlannedStmt.queryId and PgBackendStatus.st_query_id, the > external-facing changes are limited to: > > 1. pgstat_report_query_id() now returns int64 instead of uint64 > 2. pgstat_get_my_query_id() now returns int64 instead of uint64 > 3. pgstat_report_query_id()'s first input parameter is now int64 > > If we were to clean this up, then I expect it's fine to wait until > v19, as it's not really a problem that's new to v18. Hmm. For the query ID, that's not new, but for the plan ID it is. So it seems to me that there could be also an argument for doing all that in v18 rather than releasing v18 with the plan ID being unsigned, keeping a maximum of consistency for both of IDs. AFAIK, this is something that Lukas's plan storing extension exposes as an int64 value to the user and the SQL interfaces, even if it's true that we don't expose it in core, yet. -- Michael
Вложения
On Mon, May 19, 2025 at 8:12 PM Michael Paquier <michael@paquier.xyz> wrote:
On Tue, May 20, 2025 at 02:03:37PM +1200, David Rowley wrote:
> Aside from the struct field types changing for Query.queryId,
> PlannedStmt.queryId and PgBackendStatus.st_query_id, the
> external-facing changes are limited to:
>
> 1. pgstat_report_query_id() now returns int64 instead of uint64
> 2. pgstat_get_my_query_id() now returns int64 instead of uint64
> 3. pgstat_report_query_id()'s first input parameter is now int64
>
> If we were to clean this up, then I expect it's fine to wait until
> v19, as it's not really a problem that's new to v18.
Hmm. For the query ID, that's not new, but for the plan ID it is. So
it seems to me that there could be also an argument for doing all that
in v18 rather than releasing v18 with the plan ID being unsigned,
keeping a maximum of consistency for both of IDs. AFAIK, this is
something that Lukas's plan storing extension exposes as an int64
value to the user and the SQL interfaces, even if it's true that we
don't expose it in core, yet.
Yeah, +1 to making this consistent across both query ID and the new plan ID, and changing both to int64 internally seems best.
Just as an example, in my current work-in-progress version of a new pg_stat_plans extension the plan ID gets set like this ([0]):
static void
pgsp_calculate_plan_id(PlannedStmt *result)
{
...
result->planId = HashJumbleState(jstate);
...
}
pgsp_calculate_plan_id(PlannedStmt *result)
{
...
result->planId = HashJumbleState(jstate);
...
}
With HashJumbleState currently returning a uint64.
Similarly, when reading out the planID from backends, we do:
values[4] = UInt64GetDatum(beentry->st_plan_id);
if Postgres 18 released as-is, and 19 changed this, we'd have to carry a version-specific cast from uint64 to int64 in both places. Not a big deal, but certainly nice to not knowingly introduce this confusion for extension development.
As an additional data point, the existing pg_stat_monitor extension uses a uint64 for planID [1], and pg_store_plans uses a uint32 [2]. I'd expect both of these to update to a int64 internally with the new planID being available in 18.
Thanks,
Lukas
-- Lukas Fittl
On Mon, May 19, 2025 at 08:43:25PM -0700, Lukas Fittl wrote: > Yeah, +1 to making this consistent across both query ID and the new plan > ID, and changing both to int64 internally seems best. Any thoughts from others? I'm OK to take the extra step of switching both fields on HEAD and write a patch for that, relying on what David has sent as a first step towards that. -- Michael
Вложения
On Tue, 20 May 2025 at 17:09, Michael Paquier <michael@paquier.xyz> wrote: > > On Mon, May 19, 2025 at 08:43:25PM -0700, Lukas Fittl wrote: > > Yeah, +1 to making this consistent across both query ID and the new plan > > ID, and changing both to int64 internally seems best. > > Any thoughts from others? I'm OK to take the extra step of switching > both fields on HEAD and write a patch for that, relying on what David > has sent as a first step towards that. Given the planId stuff is new and has the same issue, I think that pushes me towards thinking now is better than later for fixing both. I'm happy to adjust my patch unless you've started working on it already. David
On Tue, May 20, 2025 at 02:09:13PM +0900, Michael Paquier wrote: > On Mon, May 19, 2025 at 08:43:25PM -0700, Lukas Fittl wrote: > > Yeah, +1 to making this consistent across both query ID and the new plan > > ID, and changing both to int64 internally seems best. > > Any thoughts from others? I'm OK to take the extra step of switching > both fields on HEAD and write a patch for that, relying on what David > has sent as a first step towards that. I don't think it was discussed, but why not go the other way, keep everything as uint64 and expose an uint64 datatype at the SQL level instead? That's something I actually want pretty often so I would be happy to have that natively, whether it's called bigoid, oid8 or anything else. That's probably too late for pg18 though.
On Tue, 20 May 2025 at 18:12, Julien Rouhaud <rjuju123@gmail.com> wrote: > I don't think it was discussed, but why not go the other way, keep everything > as uint64 and expose an uint64 datatype at the SQL level instead? That's > something I actually want pretty often so I would be happy to have that > natively, whether it's called bigoid, oid8 or anything else. That's probably > too late for pg18 though. Certainly, a bit late, yes. It basically requires implementing unsigned types on the SQL level. I believe there are a few sunken ships along that coastline, and plenty of history in the archives if you want to understand some of the difficulties. David
Supporting unsigned INTs will be valuable for more than just this obviously, and if we do ever have that capability, we would likely want to go that route. I know I've been asked about why queryIds could be negative more than a few times. Until then, I think the patch being suggested is reasonable and cleaner. A few comments on the patches: 1/ this was missed in pg_stat_statements ./contrib/pg_stat_statements/pg_stat_statements.c: uint64 saved_queryId = pstmt->queryId; 2/ Should "DatumGetInt64(hash_any_extended(......" be turned into a macro since it's used in several places? Also, we can add a comment in the macro such as " Output the queryId as an int64 rather than a uint64, to match the BIGINT column used to emit queryId in system views " I think this comment is needed to address the confusion that is the original subject of this thread. Otherwise, the confusion will be moved from pg_stat_statements.c to queryjumblefuncs.c DoJumble(JumbleState *jstate, Node *node) { /* Jumble the given node */ @@ -208,9 +208,9 @@ DoJumble(JumbleState *jstate, Node *node) FlushPendingNulls(jstate); /* Process the jumble buffer and produce the hash value */ - return DatumGetUInt64(hash_any_extended(jstate->jumble, - jstate->jumble_len, - 0)); + return DatumGetInt64(hash_any_extended(jstate->jumble, + jstate->jumble_len, + 0)); } /* @@ -256,10 +256,10 @@ AppendJumbleInternal(JumbleState *jstate, const unsigned char *item, if (unlikely(jumble_len >= JUMBLE_SIZE)) { - uint64 start_hash; + int64 start_hash; - start_hash = DatumGetUInt64(hash_any_extended(jumble, - JUMBLE_SIZE, 0)); + start_hash = DatumGetInt64(hash_any_extended(jumble, + JUMBLE_SIZE, 0)); memcpy(jumble, &start_hash, sizeof(start_hash)); jumble_len = sizeof(start_hash); -- Sami Imseih Amazon Web Services (AWS)
On Tue, May 20, 2025 at 11:28:17PM +1200, David Rowley wrote: > Certainly, a bit late, yes. It basically requires implementing > unsigned types on the SQL level. I believe there are a few sunken > ships along that coastline, and plenty of history in the archives if > you want to understand some of the difficulties. Providing some more context and some more information than this reply, the latest thread that I can find on the matter is this one, from December 2024: https://www.postgresql.org/message-id/CAN3gO4sPBKbfWYK10i294u3kzsfDb4WX891FMbjLnKjMS08u7A%40mail.gmail.com It summarizes nicely the situation and it contains some more general points. This particular point from Tom about numeric promotion and casting hierarchy resonates as a very good one: https://www.postgresql.org/message-id/3180774.1733588677%40sss.pgh.pa.us My own bet is that if you don't want to lose any existing functionality, perhaps we should be just more aggressive with casting requirements on input to begin with even if it means more work when writing queries, even if it comes at a loss of usability that should still be something.. FWIW, I've wanted an unsigned in-core type more than once. It would be a lot of work, but we have a lot of things that exist in the core code that map to unsigned 32b or 64b integer values. Just naming one, WAL LSN difference calculations. XLogRecPtr is a 64b unsigned value, not its representation at SQL level, meaning that we'd overflow after reaching the threshold of the bit tracking the signedness. It's true that a system would need to live long enough to reach such LSNs, but we have also 32b integers plugged here and there. Another one is block numbers, or OID which is an in-core type. Having an in-core unsigned integer type would scale better that creating types mapping to every single internal core concept. -- Michael
Вложения
On Tue, May 20, 2025 at 10:35:39AM -0500, Sami Imseih wrote: > 1/ this was missed in pg_stat_statements > ./contrib/pg_stat_statements/pg_stat_statements.c: uint64 > saved_queryId = pstmt->queryId; Right. Some greps based on "queryid" and "query_id" point that this is the last reference to uint in the tree. > 2/ Should "DatumGetInt64(hash_any_extended(......" be turned into a > macro since it's used in > several places? Also, we can add a comment in the macro such as > " > Output the queryId as an int64 rather than a uint64, to match the > BIGINT column used to emit > queryId in system views > " We are talking about two places in queryjumblefuncs.c. Leaving the code as in David's original proposal is fine IMO. Adding a comment about the implied uint64 -> int64 conversion when calling hash_any_extended() sounds like a good idea to document what we want from the hash. I've had a closer look at David's patch, and I did not spot other places that required similar adjustments. I may have missed something, of course. David, how would you like to move on with this patch set? I can take responsibility for both patches as the plan ID change can qualify as an open item. -- Michael
Вложения
On Wed, May 21, 2025 at 09:34:02AM +0900, Michael Paquier wrote: > On Tue, May 20, 2025 at 11:28:17PM +1200, David Rowley wrote: > > Certainly, a bit late, yes. It basically requires implementing > > unsigned types on the SQL level. I believe there are a few sunken > > ships along that coastline, and plenty of history in the archives if > > you want to understand some of the difficulties. > > Providing some more context and some more information than this reply, > the latest thread that I can find on the matter is this one, from > December 2024: > https://www.postgresql.org/message-id/CAN3gO4sPBKbfWYK10i294u3kzsfDb4WX891FMbjLnKjMS08u7A%40mail.gmail.com > > It summarizes nicely the situation and it contains some more general > points. > > This particular point from Tom about numeric promotion and casting > hierarchy resonates as a very good one: > https://www.postgresql.org/message-id/3180774.1733588677%40sss.pgh.pa.us > My own bet is that if you don't want to lose any existing > functionality, perhaps we should be just more aggressive with casting > requirements on input to begin with even if it means more work when > writing queries, even if it comes at a loss of usability that should > still be something.. Thanks a lot Michael! I actually read that thread at that time but forgot about it until you sent this link. > FWIW, I've wanted an unsigned in-core type more than once. It would > be a lot of work, but we have a lot of things that exist in the core > code that map to unsigned 32b or 64b integer values. Just naming one, > WAL LSN difference calculations. XLogRecPtr is a 64b unsigned value, > not its representation at SQL level, meaning that we'd overflow after > reaching the threshold of the bit tracking the signedness. It's true > that a system would need to live long enough to reach such LSNs, but > we have also 32b integers plugged here and there. Another one is > block numbers, or OID which is an in-core type. Having an in-core > unsigned integer type would scale better that creating types mapping > to every single internal core concept. Agreed.
On 20.05.25 08:38, Michael Paquier wrote: > On Tue, May 20, 2025 at 05:51:51PM +1200, David Rowley wrote: >> Given the planId stuff is new and has the same issue, I think that >> pushes me towards thinking now is better than later for fixing both. >> >> I'm happy to adjust my patch unless you've started working on it already. > > Here you go with the attached, to-be-applied on top of your own patch. Whichever way we're going, surely this whole thing could benefit from a typedef something QueryId;
On Thu, 22 May 2025 at 02:58, Peter Eisentraut <peter@eisentraut.org> wrote: > > On 20.05.25 08:38, Michael Paquier wrote: > > On Tue, May 20, 2025 at 05:51:51PM +1200, David Rowley wrote: > >> Given the planId stuff is new and has the same issue, I think that > >> pushes me towards thinking now is better than later for fixing both. > >> > >> I'm happy to adjust my patch unless you've started working on it already. > > > > Here you go with the attached, to-be-applied on top of your own patch. > > Whichever way we're going, surely this whole thing could benefit from a > > typedef something QueryId; This part I'm not so sure about. It's not as if it can be changed to some other type with a simple definition of the typedef. Looking at the patch I proposed there are quite a few things that would still need to be updated after the typedef is changed. PG_GETARG_INT64(), INT64CONST() and DatumGetInt64() are part of that. There's also the return type of hash_any_extended(). Changing the typedef definition might also need an adjustment in gen_node_support.pl, depending on what you're changing it to. You could argue that if it reduces the locations that need to be changed by using a typedef, then it's a win. But there are also negative aspects to typedefs that need to be considered. For me, those are the added level of indirection of code reading to actually who what type I'm working with. I personally dislike typedefs like "typedef PageHeaderData *PageHeader;" as they hide the fact I'm working with a pointer. I'm not outright objecting to the typedef for this. It's just I don't see it as a clear-cut improvement for this case. David
On Thu, May 22, 2025 at 02:36:38PM +1200, David Rowley wrote: > You could argue that if it reduces the locations that need to be > changed by using a typedef, then it's a win. But there are also > negative aspects to typedefs that need to be considered. For me, those > are the added level of indirection of code reading to actually who > what type I'm working with. I personally dislike typedefs like > "typedef PageHeaderData *PageHeader;" as they hide the fact I'm > working with a pointer. > > I'm not outright objecting to the typedef for this. It's just I don't > see it as a clear-cut improvement for this case. Same opinion here. I am not quite clear what there is to gain in hiding the query ID behind a typedef, or even apply that to the plan ID. I have added an open item about the plan ID part as it applies to v18, adding the RMT in CC to get an opinion. If we cannot get a consensus on all that, letting things as they are is still logically correct, even with the -Wwarnings-format-signedness argument which is not included by default currently. Has somebody an opinion to offer? -- Michael
Вложения
On Thu, May 29, 2025 at 01:53:07PM +0900, Michael Paquier wrote: > On Thu, May 22, 2025 at 01:01:14PM +0900, Michael Paquier wrote: >> I have added an open item about the plan ID part as it applies to v18, >> adding the RMT in CC to get an opinion. If we cannot get a consensus >> on all that, letting things as they are is still logically correct, >> even with the -Wwarnings-format-signedness argument which is not >> included by default currently. >> >> Has somebody an opinion to offer? > > It has been one week since this last update, and there has been > nothing except the sound of cicadas. IMO, I think that we should just > pull the switch and make both of these IDs signed on HEAD, taking case > of the potential signedness warning issues. > > Now, I don't really want to take a leap of faith without the RMT being > OK with that now that we are in beta1. After reading through this thread and the latest patch set, I don't see any strong reason for the RMT to object to this change for v18. IIUC some extensions may need to adapt, but we're still a few months from 18.0, so that seems okay. I vaguely recall that we've made other small extension-breaking changes during the beta period for previous major releases. -- nathan
On Thu, May 29, 2025 at 09:28:35AM -0500, Nathan Bossart wrote: > On Thu, May 29, 2025 at 01:53:07PM +0900, Michael Paquier wrote: >> Now, I don't really want to take a leap of faith without the RMT being >> OK with that now that we are in beta1. > > After reading through this thread and the latest patch set, I don't see any > strong reason for the RMT to object to this change for v18. IIUC some > extensions may need to adapt, but we're still a few months from 18.0, so > that seems okay. I vaguely recall that we've made other small > extension-breaking changes during the beta period for previous major > releases. Thanks, Nathan. Let's proceed with the change then. David, would you prefer handling the patch you have written by yourself for the query ID part? -- Michael
Вложения
On Fri, 30 May 2025 at 11:35, Michael Paquier <michael@paquier.xyz> wrote: > Thanks, Nathan. Let's proceed with the change then. David, would you > prefer handling the patch you have written by yourself for the query > ID part? Yes. I'll look at that again shortly. David
Thanks for the review. On Wed, 21 May 2025 at 03:35, Sami Imseih <samimseih@gmail.com> wrote: > 2/ Should "DatumGetInt64(hash_any_extended(......" be turned into a > macro since it's used in > several places? Also, we can add a comment in the macro such as > " > Output the queryId as an int64 rather than a uint64, to match the > BIGINT column used to emit > queryId in system views > " > > I think this comment is needed to address the confusion that is the > original subject of this thread. Otherwise, > the confusion will be moved from pg_stat_statements.c to queryjumblefuncs.c I ended up adding a comment in the Query struct detailing why queryId is signed. I imagine, as time passes, we might forget why we did that as hashed values are more naturally unsigned. I think it's wrong to add comments in DoJumble() to mention details about what the calling function does. I think the fact that DoJumble() now returns int64 should be a good enough explanation as to why we want to get the hash value in signed form. David
On Fri, 30 May 2025 at 11:51, David Rowley <dgrowleyml@gmail.com> wrote: > > On Fri, 30 May 2025 at 11:35, Michael Paquier <michael@paquier.xyz> wrote: > > Thanks, Nathan. Let's proceed with the change then. David, would you > > prefer handling the patch you have written by yourself for the query > > ID part? > > Yes. I'll look at that again shortly. Now done. David
Re: Add comment explaining why queryid is int64 in pg_stat_statements
От
Shaik Mohammad Mujeeb
Дата:
Thanks for making the appropriate changes, David.
Now everything make sense. :)
Now everything make sense. :)
---- On Fri, 30 May 2025 16:39:23 +0530 David Rowley <dgrowleyml@gmail.com> wrote ---
On Fri, 30 May 2025 at 11:51, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Fri, 30 May 2025 at 11:35, Michael Paquier <michael@paquier.xyz> wrote:
> > Thanks, Nathan. Let's proceed with the change then. David, would you
> > prefer handling the patch you have written by yourself for the query
> > ID part?
>
> Yes. I'll look at that again shortly.
Now done.
David