Обсуждение: Bug in pg_stat_statements

Поиск

Список

Период

Сортировка

Bug in pg_stat_statements

От

Konstantin Knizhnik

Дата:

23 октября, 19:21:33

Hi hackers.


In PG18 gthere is bug either in pg_stat_statements.c, either in 
queryjumblefuncs.c.

Repro (you need Postgres build with asserts and pg_stat_statements 
included in shared_preload_librarties:

create function f(a integer[], out x integer, out y integer) returns 
record as $$            begin
        x = a[1];
        y = a[2];
    end;
$$ language plpgsql;

create table t(x integer);
select ((f(array[1,2]))).* from t;

The problems is caused by ((...)).* which cause to include the same 
array literal twice in JumbleQuery.
But pg_stat_stataement assumes that any occurrence of constant is unique 
and it cause assertion failure here:

         /* Copy next chunk (what precedes the next constant) */
         len_to_wrt = off - last_off;
         len_to_wrt -= last_tok_len;
         Assert(len_to_wrt >= 0);

So we should either exclude duplicates in RecordConstLocation 
(queryjumblefuncs.c)
either in generate_normalized_query in pg_stat_statements.c

What is considered to be more correct?

Re: Bug in pg_stat_statements

От

Sami Imseih

Дата:

24 октября, 03:49:55

> In PG18 gthere is bug either in pg_stat_statements.c, either in
> queryjumblefuncs.c.
>
> Repro (you need Postgres build with asserts and pg_stat_statements
> included in shared_preload_librarties:
>
> create function f(a integer[], out x integer, out y integer) returns
> record as $$            begin
>         x = a[1];
>         y = a[2];
>     end;
> $$ language plpgsql;
>
> create table t(x integer);
> select ((f(array[1,2]))).* from t;
>
> The problems is caused by ((...)).* which cause to include the same
> array literal twice in JumbleQuery.
> But pg_stat_stataement assumes that any occurrence of constant is unique
> and it cause assertion failure here:
>
>          /* Copy next chunk (what precedes the next constant) */
>          len_to_wrt = off - last_off;
>          len_to_wrt -= last_tok_len;
>          Assert(len_to_wrt >= 0);
>
> So we should either exclude duplicates in RecordConstLocation
> (queryjumblefuncs.c)

You are correct in the analysis that this is because we are
recording the same constant multiple times, this causes the
len_to_write to be 0 on the second occurance of the duplicated
location. I confirmed that this was introduced in the constant list
squashing per 0f65f3eec.

The same location being recorded twice will occur because of
row expansion, therefore the same location will be jumbled twice,
as observed in the explain output.
```
postgres=# explain verbose select ((f(array[1,2]))).* from t;
                           QUERY PLAN
----------------------------------------------------------------
 Seq Scan on public.t  (cost=0.00..1310.50 rows=2550 width=8)
   Output: (f('{1,2}'::integer[])).x, (f('{1,2}'::integer[])).y
 Query Identifier: 5305868219688930530
(3 rows)

```

Before 0f65f3eec, this was not an issue, because we recorded
all the locations.

I believe the correct answer is to fix this in queryjumblefuncs.c
and more specifically inside _jumbleElements. We should not
record the same location twice. We can do this by tracking the
start location of  the last recorded constant location, and we
can skip recording it if we see it again. See the attached
which also includes test cases.

Also, inside the loop of generate_normalized_query in
pg_stat_statements we are asserting len_to_wrt >= 0, so we
are OK with memcpy of 0 bytes. This is valid, but does
not seem correct.

```
Assert(len_to_wrt > 0);
memcpy(norm_query + n_quer_loc, query + quer_loc, len_to_wrt);
```

Should we continue the loop and skip the memcpy
if (len_to_wrt == 0) instead?

--

Sami Imseih
Amazon Web Services (AWS)

Вложения

0001-Fix-jumbling-of-squashed-lists-with-row-expansion.patch

Re: Bug in pg_stat_statements

От

Dmitry Dolgov

Дата:

24 октября, 16:39:42

> On Thu, Oct 23, 2025 at 07:49:55PM -0500, Sami Imseih wrote:
> I believe the correct answer is to fix this in queryjumblefuncs.c
> and more specifically inside _jumbleElements. We should not
> record the same location twice. We can do this by tracking the
> start location of  the last recorded constant location, and we
> can skip recording it if we see it again. See the attached
> which also includes test cases.

I think there is another option. Before squashing implementation, duplicated
constants were dealt with in fill_in_constant_lengths -- after processing their
length was -1 and such constants were skipped further down the line. After
0f65f3eec, if such a constant is supposed to be squashed we now ignore it, but
set the length anyway in RecordConstLocation:

    if (locs[i].squashed)
        continue;           /* squashable list, ignore */

    if (loc <= last_loc)
        continue;           /* Duplicate constant, ignore */

This sort of short cut the verification for duplicated constants. Having said
that it seems that another solution would be to check for duplicated constants
in fill_in_constant_lengths just before skipping squashed constants and reset
the length if needed.

Re: Bug in pg_stat_statements

От

Sami Imseih

Дата:

24 октября, 17:17:22

> This sort of short cut the verification for duplicated constants. Having said
> that it seems that another solution would be to check for duplicated constants
> in fill_in_constant_lengths just before skipping squashed constants and reset
> the length if needed.

Hi,

yes, I was not convinced my initial idea is good, and I ended up finding
another case that still breaks:

```
SELECT (ROW(ARRAY[1, 2], ARRAY[1, 2, 3])).*;
```

Here we have multiple squashable lists, and simply checking that the
location we are trying to record != the last location does not work. Also,
checking that the location we are trying to record is ahead of the last
location also does not work because the locations of squashable
lists in the predicate will be recorded first. So, we can't rely on the
location we want to record being higher than previous locations.
For example:

```
SELECT (ROW(ARRAY[1, 2], ARRAY[1, 2, 3])).* FROM t WHERE x
IN (1, 2, 3);
```


> Having said
> that it seems that another solution would be to check for duplicated constants
> in fill_in_constant_length

Yes, I started thinking along these lines as well, where we check for
duplicates
in fill_in_constant_length; or after jumbling, we de-duplicate locations if we
have a squashable list, which is what I have attached with updated test cases.

This means we do need to scan the locations one extra time during jumbling,
but I don't see that as a problem. Maybe there is another better way?


--
Sami Imseih
Amazon Web Services (AWS)

Вложения

v2-0001-Fix-jumbling-of-squashed-lists-with-row-expansion.patch

Re: Bug in pg_stat_statements

От

Dmitry Dolgov

Дата:

24 октября, 17:33:41

> On Fri, Oct 24, 2025 at 09:17:22AM -0500, Sami Imseih wrote:
> > Having said
> > that it seems that another solution would be to check for duplicated constants
> > in fill_in_constant_length
> 
> Yes, I started thinking along these lines as well, where we check for
> duplicates
> in fill_in_constant_length; or after jumbling, we de-duplicate locations if we
> have a squashable list, which is what I have attached with updated test cases.
> 
> This means we do need to scan the locations one extra time during jumbling,
> but I don't see that as a problem. Maybe there is another better way?

Why? What I hand in mind is something like this, after a quick test it seems to
be able to address both the original case and the one you've discovered.

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index f2187167..f17a2b79 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -3008,11 +3008,17 @@ fill_in_constant_lengths(JumbleState *jstate, const char *query,

                Assert(loc >= 0);

-               if (locs[i].squashed)
-                       continue;                       /* squashable list, ignore */
-
                if (loc <= last_loc)
+               {
+                       locs[i].length = -1;
                        continue;                       /* Duplicate constant, ignore */
+               }
+
+               if (locs[i].squashed)
+               {
+                       last_loc = loc;
+                       continue;                       /* squashable list, ignore */
+               }

                /* Lex tokens until we find the desired constant */
                for (;;)

Re: Bug in pg_stat_statements

От

Sami Imseih

Дата:

24 октября, 20:55:24

> > > Having said
> > > that it seems that another solution would be to check for duplicated constants
> > > in fill_in_constant_length
> >
> > Yes, I started thinking along these lines as well, where we check for
> > duplicates
> > in fill_in_constant_length; or after jumbling, we de-duplicate locations if we
> > have a squashable list, which is what I have attached with updated test cases.
> >
> > This means we do need to scan the locations one extra time during jumbling,
> > but I don't see that as a problem. Maybe there is another better way?
>
> Why? What I hand in mind is something like this, after a quick test it seems to
> be able to address both the original case and the one you've discovered.

ahh, what you have works because clocations is sorted by location before
the loop starts in `fill_in_constant_lengths` , so comparing locations
makes sense in this case. I am OK with this.

```
    /*
     * Sort the records by location so that we can process them in order while
     * scanning the query text.
     */
    if (jstate->clocations_count > 1)
        qsort(jstate->clocations, jstate->clocations_count,
              sizeof(LocationLen), comp_location);
```

I was really hoping that the fix could be inside of query jumbling, as I think
pg_stat_statements or any consumers of query jumbling don't need to
care more about squashing ( or other internals of jumbling ).
So now I also wonder if we should also move:

```
static char *generate_normalized_query(JumbleState *jstate, const char *query,
int query_loc, int *query_len_p);

static void fill_in_constant_lengths(JumbleState *jstate, const char *query,
int query_loc);
```

from pg_stat_statements.c to queryjumblefuncs.c?


--
Sami Imseih
Amazon Web Services (AWS)

Re: Bug in pg_stat_statements

От

Sami Imseih

Дата:

25 октября, 00:04:20

> > > > Having said
> > > > that it seems that another solution would be to check for duplicated constants
> > > > in fill_in_constant_length
> > >
> > > Yes, I started thinking along these lines as well, where we check for
> > > duplicates
> > > in fill_in_constant_length; or after jumbling, we de-duplicate locations if we
> > > have a squashable list, which is what I have attached with updated test cases.
> > >
> > > This means we do need to scan the locations one extra time during jumbling,
> > > but I don't see that as a problem. Maybe there is another better way?
> >
> > Why? What I hand in mind is something like this, after a quick test it seems to
> > be able to address both the original case and the one you've discovered.
>
> ahh, what you have works because clocations is sorted by location before
> the loop starts in `fill_in_constant_lengths` , so comparing locations
> makes sense in this case. I am OK with this.

v3-0001 does what Dimiti suggests with some slight tweaks:

1/ Moving "last_loc = loc;" earlier in the loop
2/ removing a comment for fill_in_constant_lengths that was
an oversight from 0f65f3eec, as we no longer assume that
the constant location is -1 when entering that routine. Also
added the test cases that were discovered, besides the
one from the original report.

> I was really hoping that the fix could be inside of query jumbling, as I think
> pg_stat_statements or any consumers of query jumbling don't need to
> care more about squashing ( or other internals of jumbling ).
> So now I also wonder if we should also move:
>
> ```
> static char *generate_normalized_query(JumbleState *jstate, const char *query,
> int query_loc, int *query_len_p);
>
> static void fill_in_constant_lengths(JumbleState *jstate, const char *query,
> int query_loc);
> ```
>
> from pg_stat_statements.c to queryjumblefuncs.c?
>
>

v3-0002 moved both generate_normalized_query and fill_in_constant_lengths
to queryjumblefuncs.c, as these routines deal with the internal of jumbling
and should not be a concern of pg_stat_statements. I think this is a good
cleanup, but others may have different opinions on this.

--
Sami Imseih
Amazon Web Services (AWS)

Вложения

Re: Bug in pg_stat_statements

От

Sami Imseih

Дата:

25 октября, 03:04:59

v4 corrects some code comments.

Also, I created a CF entry [0].


[0] https://commitfest.postgresql.org/patch/6167/

Вложения

Re: Bug in pg_stat_statements

От

Dmitry Dolgov

Дата:

26 октября, 14:16:14

> On Fri, Oct 24, 2025 at 07:04:59PM -0500, Sami Imseih wrote:
> v4 corrects some code comments.

The fix in the first patch looks good, thanks. Not so sure about the
refactoring in the second patch -- generate_normalized_query is being
used only in one place, in pg_stat_statements, moving it out somewhere
else without having other clients seems to be questionable to me.

P.S. Adding Álvaro as a commiter of the affected feature, maybe he will
help us to apply the fix.

Re: Bug in pg_stat_statements

От

Sami Imseih

Дата:

27 октября, 22:34:50

Thanks for reviewing!

>Not so sure about the
> refactoring in the second patch -- generate_normalized_query is being
> used only in one place, in pg_stat_statements, moving it out somewhere
> else without having other clients seems to be questionable to me.

The excellent question raised earlier is whether the fix should be
applied in pg_stat_statements.c or queryjumblefuncs.c. To me, this
suggests that pg_stat_statements, as an extension, is dealing with code
it should not be responsible for.

generate_normalized_query could be used by other extensions that have
normalization needs; it’s generic and needs to handle squashing, which
is a jumble specific detail.

Therefore, I think both fill_in_constant_lengths and
generate_normalized_query should be moved. We can certainly start a new
thread (0002) if more discussion is needed.

--
Sami Imseih
Amazon Web Services (AWS)

Re: Bug in pg_stat_statements

От

Michael Paquier

Дата:

28 октября, 06:51:27

On Mon, Oct 27, 2025 at 02:34:50PM -0500, Sami Imseih wrote:
> The excellent question raised earlier is whether the fix should be
> applied in pg_stat_statements.c or queryjumblefuncs.c. To me, this
> suggests that pg_stat_statements, as an extension, is dealing with code
> it should not be responsible for.
>
> generate_normalized_query could be used by other extensions that have
> normalization needs; it’s generic and needs to handle squashing, which
> is a jumble specific detail.

FWIW, the line defining the separation between pg_stat_statements.c
and queryjumblefuncs.c should rely on a pretty simple concept:
JumbleState can be written only by queryjumblefuncs.c and
src/backend/nodes/, and should remain in an untouched constant state
when looked at by any external module loaded, included PGSS, because
it could be shared across more than one paths.

That's just to say that I can get behind the argument you are
presenting in patch 0002, to move the updates of
JumbleState.clocations[N].length to happen inside the query jumbling
code.  This relies on a query string anyway, which would be the same
for all the modules interested in interacting in a planner, analyze or
execution hook.

Now, I don't think that 0002 goes far enough: could it be possible to
do things so as we enforce a policy where modules cannot touch a
JumbleState at all?  That would mean painting more consts to prevent
manipulations of the Jumble state.

> Therefore, I think both fill_in_constant_lengths and
> generate_normalized_query should be moved. We can certainly start a new
> thread (0002) if more discussion is needed.

Yeah, that should be a separate discussion.  Let's focus on the bug at
hand, first, then consider improvements that could be done moving
forward.
--
Michael

Вложения

signature.asc

Re: Bug in pg_stat_statements

От

Álvaro Herrera

Дата:

28 октября, 13:31:23

On 2025-Oct-28, Michael Paquier wrote:

> FWIW, the line defining the separation between pg_stat_statements.c
> and queryjumblefuncs.c should rely on a pretty simple concept:
> JumbleState can be written only by queryjumblefuncs.c and
> src/backend/nodes/, and should remain in an untouched constant state
> when looked at by any external module loaded, included PGSS, because
> it could be shared across more than one paths.

I can get behind this as a principle.

> That's just to say that I can get behind the argument you are
> presenting in patch 0002, to move the updates of
> JumbleState.clocations[N].length to happen inside the query jumbling
> code.  This relies on a query string anyway, which would be the same
> for all the modules interested in interacting in a planner, analyze or
> execution hook.

Hmm, yeah, now that you mention this, it seems rather odd that
pgss_store() is messing with (modifying) the jstate it is given.  If we
had multiple modules using a planner_hook to interact with the query
representation, what each module sees would be different depending on
which hook is called first, which sounds wrong.  Maybe, as you say, we
do need to consider this a design flaw that should be fixed in a more
principled fashion ... and it's pretty hard to see that we could
backpatch any such fixes.  It's a pretty old issue though (you probably
notice I hesitate to call it a bug.)  So I agree with you: we should fix
the specific bug first across branches, and the reworking of the
jumbling framework should be done afterwards in master only.

I'm going to get the first patch pushed, after looking at it more
carefully.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Update: super-fast reaction on the Postgres bugs mailing list. The report
was acknowledged [...], and a fix is under discussion.
The wonders of open-source !"
             https://twitter.com/gunnarmorling/status/1596080409259003906

Re: Bug in pg_stat_statements

От

Álvaro Herrera

Дата:

28 октября, 20:13:30

On 2025-Oct-26, Dmitry Dolgov wrote:

> > On Fri, Oct 24, 2025 at 07:04:59PM -0500, Sami Imseih wrote:
> > v4 corrects some code comments.
> 
> The fix in the first patch looks good, thanks.

Yeah, I think this general idea is sensible.  However, I think we should
take it one step further and just remove last_loc entirely.  I think
this makes the code a bit clearer.  How about the attached?

Regarding 0002, as I said on my reply to Michael I think this is a good
idea on principle, but I suggest to discuss that in a separate thread.
I would seek routine names that match the current ones in queryjumble.h
a little better though.

> P.S. Adding Álvaro as a commiter of the affected feature, maybe he will
> help us to apply the fix.

Thanks for doing that, I'd have not noticed the thread otherwise.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Pero la cosa no es muy grave ..."     (le petit Nicolas -- René Goscinny)

Вложения

v5-0001-pg_stat_statements-Fix-handling-of-duplicate-cons.patch

Re: Bug in pg_stat_statements

От

Sami Imseih

Дата:

28 октября, 20:55:05

>
> > > On Fri, Oct 24, 2025 at 07:04:59PM -0500, Sami Imseih wrote:
> > > v4 corrects some code comments.
> >
> > The fix in the first patch looks good, thanks.
>
> Yeah, I think this general idea is sensible.  However, I think we should
> take it one step further and just remove last_loc entirely.  I think
> this makes the code a bit clearer.  How about the attached?

getting rid of last_loc makes sense because the list is sorted makes
sense. I like this, definitely cleaner.

One minor comment is to change is to remove the "let's save it" but
in the comments, as we are no longer saving a last_loc.

                /*
-                * We have a valid location for a constant that's not
a dupe, let's
-                * save it.  Lex tokens until we find the desired constant.
+                * We have a valid location for a constant that's not
a dupe. We Lex
+                * tokens until we find the desired constant.
                 */


--
Sami

Re: Bug in pg_stat_statements

От

Michael Paquier

Дата:

29 октября, 04:16:40

On Tue, Oct 28, 2025 at 12:31:23PM +0200, Alvaro Herrera wrote:
> Hmm, yeah, now that you mention this, it seems rather odd that
> pgss_store() is messing with (modifying) the jstate it is given.  If we
> had multiple modules using a planner_hook to interact with the query
> representation, what each module sees would be different depending on
> which hook is called first, which sounds wrong.  Maybe, as you say, we
> do need to consider this a design flaw that should be fixed in a more
> principled fashion ... and it's pretty hard to see that we could
> backpatch any such fixes.  It's a pretty old issue though (you probably
> notice I hesitate to call it a bug.)

It's hard to call that a bug when nothing in core is broken directly
because of it, we are just assuming that it impacts other extensions.
Now we have always been kind of bad in terms of defining what loaded
extensions should or not should be able to do, also depending on the
order their hooks are loaded.

> So I agree with you: we should fix
> the specific bug first across branches, and the reworking of the
> jumbling framework should be done afterwards in master only.

Agreedn.

> I'm going to get the first patch pushed, after looking at it more
> carefully.

Cool, thanks!
--
Michael

Вложения

signature.asc

Re: Bug in pg_stat_statements

От

Álvaro Herrera

Дата:

29 октября, 14:36:55

On 2025-Oct-28, Sami Imseih wrote:

> getting rid of last_loc makes sense because the list is sorted. I like
> this, definitely cleaner.
> 
> One minor comment is to change is to remove the "let's save it" but
> in the comments, as we are no longer saving a last_loc.

Ah, yeah, that's a good change.  I have pushed it now.

Thanks!

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"Entristecido, Wutra                     (canción de Las Barreras)
echa a Freyr a rodar
y a nosotros al mar"

Re: Bug in pg_stat_statements

От

Sami Imseih

Дата:

30 октября, 01:02:37

> > One minor comment is to change is to remove the "let's save it" but
> > in the comments, as we are no longer saving a last_loc.
>
> Ah, yeah, that's a good change.  I have pushed it now.

Thanks for pushing!

I will take 0002 on in a separate thread as there seems to
be agreement on doing something about it.

--
Sami Imseih
Amazon Web Services (AWS)

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Bug in pg_stat_statements

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения