Обсуждение: query_id, pg_stat_activity, extended query protocol

Поиск

Список

Период

Сортировка

query_id, pg_stat_activity, extended query protocol

От

kaido vaikla

Дата:

12 июня 2023 г., 21:03:24

I have noticed, if query comes from PostgreSQL JDBC Driver, then query_id is not present in pg_stat_activity.  Erik Wienhold figured out that reason can be in extended query protocol (https://www.postgresql.org/message-id/1391613709.939460.1684777418070@office.mailbox.org)
My question is, is it expected or is it a bug: if extended query protocol then no query_id  in pg_stat_activity for running query.

br
Kaido

Re: query_id, pg_stat_activity, extended query protocol

От

Michael Paquier

Дата:

13 июня 2023 г., 03:16:07

On Mon, Jun 12, 2023 at 09:03:24PM +0300, kaido vaikla wrote:
> I have noticed, if query comes from PostgreSQL JDBC Driver, then query_id
> is not present in pg_stat_activity.  Erik Wienhold figured out that reason
> can be in extended query protocol (
> https://www.postgresql.org/message-id/1391613709.939460.1684777418070@office.mailbox.org
> )
> My question is, is it expected or is it a bug: if extended query protocol
> then no query_id  in pg_stat_activity for running query.

Well, you could say a bit of both, I guess.  The query ID is compiled
and stored in backend entries only after parse analysis, which is not
something that would happen when using the execution phase of the
extended query protocol, though it should be possible to access to the
Query nodes in the cached plans and their assigned query IDs.

FWIW, I'd like to think that we could improve the situation, requiring
a mix of calling pgstat_report_query_id() while feeding on some query
IDs retrieved from CachedPlanSource->query_list.  I have not in
details looked at how much could be achieved, TBH.
--
Michael

Вложения

signature.asc

Re: query_id, pg_stat_activity, extended query protocol

От

kaido vaikla

Дата:

13 июня 2023 г., 09:16:08

Thnx.
br
Kaido

On Tue, 13 Jun 2023 at 03:16, Michael Paquier <michael@paquier.xyz> wrote:

On Mon, Jun 12, 2023 at 09:03:24PM +0300, kaido vaikla wrote:
> I have noticed, if query comes from PostgreSQL JDBC Driver, then query_id
> is not present in pg_stat_activity. Erik Wienhold figured out that reason
> can be in extended query protocol (
> https://www.postgresql.org/message-id/1391613709.939460.1684777418070@office.mailbox.org
> )
> My question is, is it expected or is it a bug: if extended query protocol
> then no query_id in pg_stat_activity for running query.

Well, you could say a bit of both, I guess. The query ID is compiled
and stored in backend entries only after parse analysis, which is not
something that would happen when using the execution phase of the
extended query protocol, though it should be possible to access to the
Query nodes in the cached plans and their assigned query IDs.

FWIW, I'd like to think that we could improve the situation, requiring
a mix of calling pgstat_report_query_id() while feeding on some query
IDs retrieved from CachedPlanSource->query_list. I have not in
details looked at how much could be achieved, TBH.
--
Michael

Re: query_id, pg_stat_activity, extended query protocol

От

Dave Cramer

Дата:

20 марта, 16:07:34

FWIW, I'd like to think that we could improve the situation, requiring
a mix of calling pgstat_report_query_id() while feeding on some query
IDs retrieved from CachedPlanSource->query_list. I have not in
details looked at how much could be achieved, TBH.

This just cropped up as a pgjdbc github issue. Seems like something that should be addressed.

Dave

Re: query_id, pg_stat_activity, extended query protocol

От

"Imseih (AWS), Sami"

Дата:

23 апреля, 07:16:29

> FWIW, I'd like to think that we could improve the situation, requiring
> a mix of calling pgstat_report_query_id() while feeding on some query
> IDs retrieved from CachedPlanSource->query_list. I have not in
> details looked at how much could be achieved, TBH.

I was dealing with this today and found this thread. I spent some time
looking at possible solutions.

In the flow of extended query protocol, the exec_parse_message 
reports the queryId, but subsequent calls to exec_bind_message
and exec_execute_message reset the queryId when calling
pgstat_report_activity(STATE_RUNNING,..) as you can see below.
  
     /*
      * If a new query is started, we reset the query identifier as it'll only
      * be known after parse analysis, to avoid reporting last query's
      * identifier.
      */
     if (state == STATE_RUNNING)
         beentry->st_query_id = UINT64CONST(0);


So, I think the simple answer is something like the below. 
Inside exec_bind_message and exec_execute_message,
the query_id should be reported after pg_report_activity. 

diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 76f48b13d2..7ec2df91d5 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1678,6 +1678,7 @@ exec_bind_message(StringInfo input_message)
        debug_query_string = psrc->query_string;
 
        pgstat_report_activity(STATE_RUNNING, psrc->query_string);
+       pgstat_report_query_id(linitial_node(Query, psrc->query_list)->queryId, true);
 
        set_ps_display("BIND");
 
@@ -2146,6 +2147,7 @@ exec_execute_message(const char *portal_name, long max_rows)
        debug_query_string = sourceText;
 
        pgstat_report_activity(STATE_RUNNING, sourceText);
+       pgstat_report_query_id(portal->queryDesc->plannedstmt->queryId, true);
 
        cmdtagname = GetCommandTagNameAndLen(portal->commandTag, &cmdtaglen);


thoughts?


Regards,

Sami Imseih
Amazon Web Services (AWS)

Re: query_id, pg_stat_activity, extended query protocol

От

Andrei Lepikhov

Дата:

23 апреля, 07:42:41

On 23/4/2024 11:16, Imseih (AWS), Sami wrote:
>> FWIW, I'd like to think that we could improve the situation, requiring
>> a mix of calling pgstat_report_query_id() while feeding on some query
>> IDs retrieved from CachedPlanSource->query_list. I have not in
>> details looked at how much could be achieved, TBH.
> 
> I was dealing with this today and found this thread. I spent some time
> looking at possible solutions.
> 
> In the flow of extended query protocol, the exec_parse_message
> reports the queryId, but subsequent calls to exec_bind_message
> and exec_execute_message reset the queryId when calling
> pgstat_report_activity(STATE_RUNNING,..) as you can see below.
>    
>       /*
>        * If a new query is started, we reset the query identifier as it'll only
>        * be known after parse analysis, to avoid reporting last query's
>        * identifier.
>        */
>       if (state == STATE_RUNNING)
>           beentry->st_query_id = UINT64CONST(0);
> 
> 
> So, I think the simple answer is something like the below.
> Inside exec_bind_message and exec_execute_message,
> the query_id should be reported after pg_report_activity.
> 
> diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
> index 76f48b13d2..7ec2df91d5 100644
> --- a/src/backend/tcop/postgres.c
> +++ b/src/backend/tcop/postgres.c
> @@ -1678,6 +1678,7 @@ exec_bind_message(StringInfo input_message)
>          debug_query_string = psrc->query_string;
>   
>          pgstat_report_activity(STATE_RUNNING, psrc->query_string);
> +       pgstat_report_query_id(linitial_node(Query, psrc->query_list)->queryId, true);
>   
>          set_ps_display("BIND");
>   
> @@ -2146,6 +2147,7 @@ exec_execute_message(const char *portal_name, long max_rows)
>          debug_query_string = sourceText;
>   
>          pgstat_report_activity(STATE_RUNNING, sourceText);
> +       pgstat_report_query_id(portal->queryDesc->plannedstmt->queryId, true);
>   
>          cmdtagname = GetCommandTagNameAndLen(portal->commandTag, &cmdtaglen);
> 
> 
> thoughts?
In exec_bind_message, how can you be sure that queryId exists in 
query_list before the call of GetCachedPlan(), which will validate and 
lock the plan? What if some OIDs were altered in the middle?

-- 
regards, Andrei Lepikhov

Re: query_id, pg_stat_activity, extended query protocol

От

Michael Paquier

Дата:

23 апреля, 08:49:43

On Tue, Apr 23, 2024 at 11:42:41AM +0700, Andrei Lepikhov wrote:
> On 23/4/2024 11:16, Imseih (AWS), Sami wrote:
>> +       pgstat_report_query_id(linitial_node(Query, psrc->query_list)->queryId, true);
>>          set_ps_display("BIND");
>> @@ -2146,6 +2147,7 @@ exec_execute_message(const char *portal_name, long max_rows)
>>          debug_query_string = sourceText;
>>          pgstat_report_activity(STATE_RUNNING, sourceText);
>> +       pgstat_report_query_id(portal->queryDesc->plannedstmt->queryId, true);
>>          cmdtagname = GetCommandTagNameAndLen(portal->commandTag, &cmdtaglen);
>
> In exec_bind_message, how can you be sure that queryId exists in query_list
> before the call of GetCachedPlan(), which will validate and lock the plan?
> What if some OIDs were altered in the middle?

I am also a bit surprised with the choice of using the first Query
available in the list for the ID, FWIW.

Did you consider using \bind to show how this behaves in a regression
test?
--
Michael

Вложения

signature.asc

Re: query_id, pg_stat_activity, extended query protocol

От

Andrei Lepikhov

Дата:

23 апреля, 12:37:37

On 4/23/24 12:49, Michael Paquier wrote:
> On Tue, Apr 23, 2024 at 11:42:41AM +0700, Andrei Lepikhov wrote:
>> On 23/4/2024 11:16, Imseih (AWS), Sami wrote:
>>> +       pgstat_report_query_id(linitial_node(Query, psrc->query_list)->queryId, true);
>>>           set_ps_display("BIND");
>>> @@ -2146,6 +2147,7 @@ exec_execute_message(const char *portal_name, long max_rows)
>>>           debug_query_string = sourceText;
>>>           pgstat_report_activity(STATE_RUNNING, sourceText);
>>> +       pgstat_report_query_id(portal->queryDesc->plannedstmt->queryId, true);
>>>           cmdtagname = GetCommandTagNameAndLen(portal->commandTag, &cmdtaglen);
>>
>> In exec_bind_message, how can you be sure that queryId exists in query_list
>> before the call of GetCachedPlan(), which will validate and lock the plan?
>> What if some OIDs were altered in the middle?
> 
> I am also a bit surprised with the choice of using the first Query
> available in the list for the ID, FWIW.
> 
> Did you consider using \bind to show how this behaves in a regression
> test?
I'm not sure how to invent a test based on the \bind command - we need 
some pause in the middle.
But simplistic case with a prepared statement shows how the value of 
queryId can be changed if you don't acquire all the objects needed for 
the execution:

CREATE TABLE test();
PREPARE name AS SELECT * FROM test;
EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;
DROP TABLE test;
CREATE TABLE test();
EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;

/*
                             QUERY PLAN
-------------------------------------------------------------------
  Seq Scan on public.test (actual time=0.002..0.004 rows=0 loops=1)
  Query Identifier: 6750745711909650694

                             QUERY PLAN
-------------------------------------------------------------------
  Seq Scan on public.test (actual time=0.004..0.004 rows=0 loops=1)
  Query Identifier: -2597546769858730762
*/

We have different objects which can be changed - I just have invented 
the most trivial example to discuss.

-- 
regards, Andrei Lepikhov

Re: query_id, pg_stat_activity, extended query protocol

От

"Imseih (AWS), Sami"

Дата:

24 апреля, 04:40:45

> I am also a bit surprised with the choice of using the first Query
> available in the list for the ID, FWIW.


IIUC, the query trees returned from QueryRewrite
will all have the same queryId, so it appears valid to 
use the queryId from the first tree in the list. Right?

Here is an example I was working with that includes user-defined rules
that has a list with more than 1 tree.


postgres=# explain (verbose, generic_plan) insert into mytab values ($1) RETURNING pg_sleep($1), id ;
QUERY PLAN 
-----------------------------------------------------------
Insert on public.mytab (cost=0.00..0.01 rows=1 width=4)
Output: pg_sleep(($1)::double precision), mytab.id
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: $1
Query Identifier: 3703848357297795425


Insert on public.mytab2 (cost=0.00..0.01 rows=0 width=0)
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: $1
Query Identifier: 3703848357297795425


Insert on public.mytab3 (cost=0.00..0.01 rows=0 width=0)
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: $1
Query Identifier: 3703848357297795425


Insert on public.mytab4 (cost=0.00..0.01 rows=0 width=0)
-> Result (cost=0.00..0.01 rows=1 width=4)
Output: $1
Query Identifier: 3703848357297795425
(20 rows)



> Did you consider using \bind to show how this behaves in a regression
> test?


Yes, this is precisely how I tested. Without the patch, I could not
see a queryId after 9 seconds of a pg_sleep, but with the patch it 
appears. See the test below.


## test query
select pg_sleep($1) \bind 30


## unpatched
postgres=# select 
query_id, 
query, 
now()-query_start query_duration, 
state 
from pg_stat_activity where pid <> pg_backend_pid()
and state = 'active';
query_id |        query         | query_duration  | state  
----------+----------------------+-----------------+--------
          | select pg_sleep($1) +| 00:00:08.604845 | active
          | ;                    |                 | 
(1 row)

## patched

postgres=# truncate table large;^C
postgres=# select 
    query_id, 
    query, 
    now()-query_start query_duration, 
    state 
from pg_stat_activity where pid <> pg_backend_pid()
and state = 'active';
      query_id       |        query         | query_duration | state  
---------------------+----------------------+----------------+--------
 2433215470630378210 | select pg_sleep($1) +| 00:00:09.6881  | active
                     | ;                    |                | 
(1 row)


For exec_execute_message, I realized that to report queryId for
Utility and non-utility statements, we need to report the queryId 
inside the portal routines where PlannedStmt contains the queryId.

Attached is the first real attempt at the fix. 

Regards,


Sami

Вложения

0001-v1-Fix-Extended-QUery-Protocol-handling-of-queryId.patch

Re: query_id, pg_stat_activity, extended query protocol

От

"Imseih (AWS), Sami"

Дата:

27 апреля, 16:54:57

> But simplistic case with a prepared statement shows how the value of
> queryId can be changed if you don't acquire all the objects needed for
> the execution:


> CREATE TABLE test();
> PREPARE name AS SELECT * FROM test;
> EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;
> DROP TABLE test;
> CREATE TABLE test();
> EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;

Hmm, you raise a good point. Isn't this a fundamental problem
with prepared statements? If there is DDL on the
relations of the prepared statement query, shouldn't the prepared
statement be considered invalid at that point and raise an error
to the user?

Regards,

Sami

Re: query_id, pg_stat_activity, extended query protocol

От

"David G. Johnston"

Дата:

27 апреля, 17:18:57

On Sat, Apr 27, 2024 at 6:55 AM Imseih (AWS), Sami <simseih@amazon.com> wrote:

Hmm, you raise a good point. Isn't this a fundamental problem
with prepared statements? If there is DDL on the
relations of the prepared statement query, shouldn't the prepared
statement be considered invalid at that point and raise an error
to the user?

We choose a arguably more user-friendly option:

https://www.postgresql.org/docs/current/sql-prepare.html

"""

Although the main point of a prepared statement is to avoid repeated parse analysis and planning of the statement, PostgreSQL will force re-analysis and re-planning of the statement before using it whenever database objects used in the statement have undergone definitional (DDL) changes or their planner statistics have been updated since the previous use of the prepared statement.

"""

David J.

Re: query_id, pg_stat_activity, extended query protocol

От

"Imseih (AWS), Sami"

Дата:

27 апреля, 17:57:44

> We choose a arguably more user-friendly option:

> https://www.postgresql.org/docs/current/sql-prepare.html

Thanks for pointing this out!

Regards,

Sami

Re: query_id, pg_stat_activity, extended query protocol

От

"Imseih (AWS), Sami"

Дата:

27 апреля, 22:08:41

>> But simplistic case with a prepared statement shows how the value of
>> queryId can be changed if you don't acquire all the objects needed for
>> the execution:




>> CREATE TABLE test();
>> PREPARE name AS SELECT * FROM test;
>> EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;
>> DROP TABLE test;
>> CREATE TABLE test();
>> EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;


> Hmm, you raise a good point. Isn't this a fundamental problem
> with prepared statements? If there is DDL on the
> relations of the prepared statement query, shouldn't the prepared
> statement be considered invalid at that point and raise an error
> to the user?

I tested v1 thoroughly.

Using the attached JDBC script for testing, I added some logging of the queryId 
being reported by the patch and added a breakpoint after sync [1] which at that 
point the locks are released on the table. I then proceeded to drop and recreate the table
and observed that the first bind after recreating the table still reports the
old queryId but the execute reports the correct queryId. This is because
the bind still has not had a chance to re-parse and re-plan after the
cache invalidation.


2024-04-27 13:51:15.757 CDT [43483] LOG:  duration: 21322.475 ms  execute S_1: select pg_sleep(10)
2024-04-27 13:51:21.591 CDT [43483] LOG:  duration: 0.834 ms  parse S_2: select from tab1 where id = $1
2024-04-27 13:51:21.591 CDT [43483] LOG:  query_id = -192969736922694368
2024-04-27 13:51:21.592 CDT [43483] LOG:  duration: 0.729 ms  bind S_2: select from tab1 where id = $1
2024-04-27 13:51:21.592 CDT [43483] LOG:  query_id = -192969736922694368
2024-04-27 13:51:21.592 CDT [43483] LOG:  duration: 0.032 ms  execute S_2: select from tab1 where id = $1
2024-04-27 13:51:32.501 CDT [43483] LOG:  query_id = -192969736922694368
2024-04-27 13:51:32.502 CDT [43483] LOG:  duration: 0.342 ms  bind S_2: select from tab1 where id = $1
2024-04-27 13:51:32.502 CDT [43483] LOG:  query_id = -192969736922694368
2024-04-27 13:51:32.502 CDT [43483] LOG:  duration: 0.067 ms  execute S_2: select from tab1 where id = $1
2024-04-27 13:51:42.613 CDT [43526] LOG:  query_id = -4766379021163149612
-- recreate the tables
2024-04-27 13:51:42.621 CDT [43526] LOG:  duration: 8.488 ms  statement: drop table if exists tab1;
2024-04-27 13:51:42.621 CDT [43526] LOG:  query_id = 7875284141628316369
2024-04-27 13:51:42.625 CDT [43526] LOG:  duration: 3.364 ms  statement: create table tab1 ( id int );
2024-04-27 13:51:42.625 CDT [43526] LOG:  query_id = 2967282624086800441
2024-04-27 13:51:42.626 CDT [43526] LOG:  duration: 0.936 ms  statement: insert into tab1 values (1);

-- this reports the old query_id
2024-04-27 13:51:45.058 CDT [43483] LOG:  query_id = -192969736922694368 

2024-04-27 13:51:45.059 CDT [43483] LOG:  duration: 0.913 ms  bind S_2: select from tab1 where id = $1
2024-04-27 13:51:45.059 CDT [43483] LOG:  query_id = 3010297048333693297
2024-04-27 13:51:45.059 CDT [43483] LOG:  duration: 0.096 ms  execute S_2: select from tab1 where id = $1
2024-04-27 13:51:46.777 CDT [43483] LOG:  query_id = 3010297048333693297
2024-04-27 13:51:46.777 CDT [43483] LOG:  duration: 0.108 ms  bind S_2: select from tab1 where id = $1
2024-04-27 13:51:46.777 CDT [43483] LOG:  query_id = 3010297048333693297
2024-04-27 13:51:46.777 CDT [43483] LOG:  duration: 0.024 ms  execute S_2: select from tab1 where id = $1

The easy answer is to not report queryId during the bind message, but I will look
at what else can be done here as it's good to have a queryId reported in this scenario
for cases there are long planning times and we rather not have those missed in 
pg_stat_activity sampling.


[1] https://github.com/postgres/postgres/blob/master/src/backend/tcop/postgres.c#L4877


Regards,

Sami

Вложения

Test.java

Re: query_id, pg_stat_activity, extended query protocol

От

Andrei Lepikhov

Дата:

28 апреля, 04:22:30

On 27/4/2024 20:54, Imseih (AWS), Sami wrote:
>> But simplistic case with a prepared statement shows how the value of
>> queryId can be changed if you don't acquire all the objects needed for
>> the execution:
> 
> 
>> CREATE TABLE test();
>> PREPARE name AS SELECT * FROM test;
>> EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;
>> DROP TABLE test;
>> CREATE TABLE test();
>> EXPLAIN (ANALYSE, VERBOSE, COSTS OFF) EXECUTE name;
> 
> Hmm, you raise a good point. Isn't this a fundamental problem
> with prepared statements? If there is DDL on the
> relations of the prepared statement query, shouldn't the prepared
> statement be considered invalid at that point and raise an error
> to the user?
I don't think so. It may be any object, even stored procedure, that can 
be changed. IMO, the right option here is to report zero (like the 
undefined value of queryId) until the end of the parsing stage.

-- 
regards, Andrei Lepikhov

Re: query_id, pg_stat_activity, extended query protocol

От

"Imseih (AWS), Sami"

Дата:

01 мая, 06:07:06

Here is a new rev of the patch which deals with the scenario
mentioned by Andrei [1] in which the queryId may change
due to a cached query invalidation.


[1] https://www.postgresql.org/message-id/724348C9-8023-41BC-895E-80634E79A538%40amazon.com

Regards,

Sami

Вложения

0001-v2-Fix-Extended-Query-Protocol-handling-of-queryId.patch

Re: query_id, pg_stat_activity, extended query protocol

От

Andrei Lepikhov

Дата:

09 мая, 08:22:33

On 5/1/24 10:07, Imseih (AWS), Sami wrote:
> Here is a new rev of the patch which deals with the scenario
> mentioned by Andrei [1] in which the queryId may change
> due to a cached query invalidation.
> 
> 
> [1] https://www.postgresql.org/message-id/724348C9-8023-41BC-895E-80634E79A538%40amazon.com
I discovered the current state of queryId reporting and found that it 
may be unlogical: Postgres resets queryId right before query execution 
in simple protocol and doesn't reset it at all in extended protocol and 
other ways to execute queries.
I think we should generally report it when the backend executes a job 
related to the query with that queryId. This means it would reset the 
queryId at the end of the query execution.
However, the process of setting up the queryId is more complex. Should 
we set it at the beginning of query execution? This seems logical, but 
what about the planning process? If an extension plans a query without 
the intention to execute it for speculative reasons, should we still 
show the queryId? Perhaps we should reset the state right after planning 
to accurately reflect the current queryId.
See in the attachment some sketch for that - it needs to add queryId 
reset on abortion.

-- 
regards, Andrei Lepikhov

Вложения

queryid_report.diff

Re: query_id, pg_stat_activity, extended query protocol

От

"Imseih (AWS), Sami"

Дата:

15 мая, 06:24:05

> I discovered the current state of queryId reporting and found that it
> may be unlogical: Postgres resets queryId right before query execution
> in simple protocol and doesn't reset it at all in extended protocol and
> other ways to execute queries.

In exec_parse_message, exec_bind_message  and exec_execute_message,
the queryId is reset via pgstat_report_activity

> I think we should generally report it when the backend executes a job
> related to the query with that queryId. This means it would reset the
> queryId at the end of the query execution.

When the query completes execution and the session goes into a state 
other than "active", both the query text and the queryId should be of the 
last executed statement. This is the documented behavior, and I believe
it's the correct behavior.

If we reset queryId at the end of execution, this behavior breaks. Right?

> This seems logical, but
> what about the planning process? If an extension plans a query without
> the intention to execute it for speculative reasons, should we still
> show the queryId? Perhaps we should reset the state right after planning
> to accurately reflect the current queryId.

I think you are suggesting that during planning, the queryId
of the current statement being planned should not be reported.
 
If my understanding is correct, I don't think that is a good idea. Tools that 
snasphot pg_stat_activity will not be able to account for the queryId during
planning. This could mean that certain load on the database cannot be tied
back to a specific queryId.

Regards,

Sami

Re: query_id, pg_stat_activity, extended query protocol

От

Michael Paquier

Дата:

15 мая, 08:09:47

On Wed, May 15, 2024 at 03:24:05AM +0000, Imseih (AWS), Sami wrote:
>> I think we should generally report it when the backend executes a job
>> related to the query with that queryId. This means it would reset the
>> queryId at the end of the query execution.
>
> When the query completes execution and the session goes into a state
> other than "active", both the query text and the queryId should be of the
> last executed statement. This is the documented behavior, and I believe
> it's the correct behavior.
>
> If we reset queryId at the end of execution, this behavior breaks. Right?

Idle sessions keep track of the last query string run, hence being
consistent in pg_stat_activity and report its query ID is user
friendly.  Resetting it while keeping the string is less consistent.
It's been this way for years, so I'd rather let it be this way.

>> This seems logical, but
>> what about the planning process? If an extension plans a query without
>> the intention to execute it for speculative reasons, should we still
>> show the queryId? Perhaps we should reset the state right after planning
>> to accurately reflect the current queryId.
>
> I think you are suggesting that during planning, the queryId
> of the current statement being planned should not be reported.
>
> If my understanding is correct, I don't think that is a good idea. Tools that
> snasphot pg_stat_activity will not be able to account for the queryId during
> planning. This could mean that certain load on the database cannot be tied
> back to a specific queryId.

I'm -1 with the point of resetting the query ID based on what the
patch does, even if it remains available in the hooks.
pg_stat_activity is one thing, but you would also reduce the coverage
of log_line_prefix with %Q.  And that can provide really useful
debugging information in the code paths where the query ID would be
reset as an effect of the proposed patch.

The patch to report the query ID of a planned query when running a
query through a PortalRunSelect() feels more intuitive in the
information it reports.
--
Michael

Вложения

signature.asc

Re: query_id, pg_stat_activity, extended query protocol

От

Andrei Lepikhov

Дата:

15 мая, 16:29:00

On 15/5/2024 12:09, Michael Paquier wrote:
> On Wed, May 15, 2024 at 03:24:05AM +0000, Imseih (AWS), Sami wrote:
>>> I think we should generally report it when the backend executes a job
>>> related to the query with that queryId. This means it would reset the
>>> queryId at the end of the query execution.
>>
>> When the query completes execution and the session goes into a state
>> other than "active", both the query text and the queryId should be of the
>> last executed statement. This is the documented behavior, and I believe
>> it's the correct behavior.
>>
>> If we reset queryId at the end of execution, this behavior breaks. Right?
> 
> Idle sessions keep track of the last query string run, hence being
> consistent in pg_stat_activity and report its query ID is user
> friendly.  Resetting it while keeping the string is less consistent.
> It's been this way for years, so I'd rather let it be this way.
Okay, that's what I precisely wanted to understand: queryId doesn't have 
semantics to show the job that consumes resources right now—it is mostly 
about convenience to know that the backend processes nothing except 
(probably) this query.

-- 
regards, Andrei Lepikhov

Re: query_id, pg_stat_activity, extended query protocol

От

"Imseih (AWS), Sami"

Дата:

15 мая, 21:36:23

> Okay, that's what I precisely wanted to understand: queryId doesn't have
> semantics to show the job that consumes resources right now—it is mostly
> about convenience to know that the backend processes nothing except
> (probably) this query.

It may be a good idea to expose in pg_stat_activity or a
supplemental activity view information about the current state of the
query processing. i.e. Is it parsing, planning or executing a query or
is it processing a nested query. 

I can see this being useful and perhaps could be taken up in a 
separate thread.

Regards,

Sami

Re: query_id, pg_stat_activity, extended query protocol

От

Michael Paquier

Дата:

16 мая, 04:02:54

On Wed, May 15, 2024 at 06:36:23PM +0000, Imseih (AWS), Sami wrote:
>> Okay, that's what I precisely wanted to understand: queryId doesn't have
>> semantics to show the job that consumes resources right now—it is mostly
>> about convenience to know that the backend processes nothing except
>> (probably) this query.
>
> It may be a good idea to expose in pg_stat_activity or a
> supplemental activity view information about the current state of the
> query processing. i.e. Is it parsing, planning or executing a query or
> is it processing a nested query.

pg_stat_activity is already quite bloated with attributes, and I'd
suspect that there are more properties in a query that would be
interesting to track down at a thinner level as long as it mirrors a
dynamic activity of the query.  Perhaps a separate catalog like a
pg_stat_query would make sense, moving query_start there as well?
Catalog breakages are never fun, still always happen because the
reasons behind a backward-incompatible change make the picture better
in the long-term for users.
--
Michael

Вложения

signature.asc

Re: query_id, pg_stat_activity, extended query protocol

От

Andrei Lepikhov

Дата:

16 мая, 08:33:27

On 15.05.2024 10:24, Imseih (AWS), Sami wrote:
>> I discovered the current state of queryId reporting and found that it
>> may be unlogical: Postgres resets queryId right before query execution
>> in simple protocol and doesn't reset it at all in extended protocol and
>> other ways to execute queries.
> 
> In exec_parse_message, exec_bind_message  and exec_execute_message,
> the queryId is reset via pgstat_report_activity
> 
>> I think we should generally report it when the backend executes a job
>> related to the query with that queryId. This means it would reset the
>> queryId at the end of the query execution.
> 
> When the query completes execution and the session goes into a state
> other than "active", both the query text and the queryId should be of the
> last executed statement. This is the documented behavior, and I believe
> it's the correct behavior.
I discovered this case a bit.
As I can see, the origin of the problem is that the exec_execute_message 
report STATE_RUNNING, although  ExecutorStart was called in the 
exec_bind_message routine beforehand.
I'm unsure if it needs to call ExecutorStart in the bind code. But if we 
don't change the current logic, would it make more sense to move 
pgstat_report_query_id to the ExecutorRun routine?

-- 
regards, Andrei Lepikhov

Re: query_id, pg_stat_activity, extended query protocol

От

"Imseih (AWS), Sami"

Дата:

16 мая, 23:34:54

> I'm unsure if it needs to call ExecutorStart in the bind code. But if we
> don't change the current logic, would it make more sense to move
> pgstat_report_query_id to the ExecutorRun routine?

I initially thought about that, but for utility statements (CTAS, etc.) being 
executed with extended query protocol, we will still not advertise the queryId 
as  we should. This is why I chose to set the queryId in PortalRunSelect and
PortalRunMulti in v2 of the patch [1].

We can advertise the queryId inside ExecutorRun instead of
PortalRunSelect as the patch does, but we will still need to advertise 
the queryId inside PortalRunMulti.

[1] https://www.postgresql.org/message-id/FAB6AEA1-AB5E-4DFF-9A2E-BB320E6C3DF1%40amazon.com


Regards,

Sami

Re: query_id, pg_stat_activity, extended query protocol

От

Anthonin Bonnefoy

Дата:

17 июля, 12:32:49

Hi,

Wouldn't it be enough to call pgstat_report_query_id in ExecutorRun
and ProcessUtility? With those changes [1], both normal statements and
utility statements called through extended protocol will correctly
report the query_id.

-- Test utility statement with extended protocol
show all \bind \g

-- Check reported query_id
select query, query_id from pg_stat_activity where
application_name ='psql' and pid!=pg_backend_pid();
   query   |      query_id
-----------+---------------------
 show all  | -866221123969716490

[1] https://github.com/bonnefoa/postgres/commit/bf4b332d7b481549c6d9cfa70db51e39a305b9b2

Regards,
Anthonin

Re: query_id, pg_stat_activity, extended query protocol

От

Michael Paquier

Дата:

18 июля, 11:56:09

On Wed, Jul 17, 2024 at 11:32:49AM +0200, Anthonin Bonnefoy wrote:
> Wouldn't it be enough to call pgstat_report_query_id in ExecutorRun
> and ProcessUtility? With those changes [1], both normal statements and
> utility statements called through extended protocol will correctly
> report the query_id.

Interesting, and this position is really tempting.  By doing so you
would force the query ID to be set to the one from the CTAS and
EXPLAIN, because these would be executed before the inner queries, and
pgstat_report_query_id() with its non-force option does not overwrite
what would be already set (aka what should be the top-level query ID).

Using ExecutorRun() feels consistent with the closest thing I've
touched in this area lately in 1d477a907e63, because that's the only
code path that we are sure to take depending on the portal execution
(two execution scenarios depending on how rows are retrieved, as far
as I recall).  The comment should be perhaps more consistent with the
executor start counterpart.  So I would be OK with that..  The
location before the hook of ProcessUtility is tempting, as it would
take care of the case of PortalRunMulti().  However..  Joining with a
point from Sami upthread..

This is still not enough in the case of where we have a holdStore, no?
This is the case where we would do *one* ExecutorRun(), followed up by
a scan of the tuplestore in more than one execute message.  The v2
proposed upthread, by positioning a query ID to be set in
PortalRunSelect(), is still positioning that in two places.

Hmm...  How about being much more aggressive and just do the whole
business in exec_execute_message(), just before we do the PortalRun()?
I mean, that's the source of all our problems, and we know the
statements that the portal will work on so we could go through the
list, grab the first planned query and set the query ID based on that,
without caring about the portal patterns we would need to think about.

> [1] https://github.com/bonnefoa/postgres/commit/bf4b332d7b481549c6d9cfa70db51e39a305b9b2

Or use the following to download the patch, that I am attaching here:
https://github.com/bonnefoa/postgres/commit/bf4b332d7b481549c6d9cfa70db51e39a305b9b2.patch

Please attach things to your emails, if your repository disappears for
a reason or another we would lose knowledge in the archives of the
community lists.
--
Michael

Вложения

Re: query_id, pg_stat_activity, extended query protocol

От

Sami Imseih

Дата:

24 июля, 00:00:25

> On Wed, Jul 17, 2024 at 11:32:49AM +0200, Anthonin Bonnefoy wrote:
>> Wouldn't it be enough to call pgstat_report_query_id in ExecutorRun
>> and ProcessUtility? With those changes [1], both normal statements and
>> utility statements called through extended protocol will correctly
>> report the query_id.
>
> Interesting, and this position is really tempting.  By doing so you
> would force the query ID to be set to the one from the CTAS and
> EXPLAIN, because these would be executed before the inner queries, and
> pgstat_report_query_id() with its non-force option does not overwrite
> what would be already set (aka what should be the top-level query ID).
>
> Using ExecutorRun() feels consistent with the closest thing I've
> touched in this area lately in 1d477a907e63, because that's the only
> code path that we are sure to take depending on the portal execution
> (two execution scenarios depending on how rows are retrieved, as far
> as I recall).  The comment should be perhaps more consistent with the
> executor start counterpart.  So I would be OK with that..  The
> location before the hook of ProcessUtility is tempting, as it would
> take care of the case of PortalRunMulti().  However..  Joining with a
> point from Sami upthread..
>
> This is still not enough in the case of where we have a holdStore, no?
> This is the case where we would do *one* ExecutorRun(), followed up by
> a scan of the tuplestore in more than one execute message.  The v2
> proposed upthread, by positioning a query ID to be set in
> PortalRunSelect(), is still positioning that in two places.

Correct, I also don’t think ExecutorRun is enough. Another reason is we should also
be setting the queryId during bind, right before planning starts.
Planning could have significant impact on the server and I think we better
track the responsible queryId.

I have not tested the holdStore case.  IIUC the holdStore deals with fetching a
WITH HOLD CURSOR. Why would this matter for this conversation?

> Hmm...  How about being much more aggressive and just do the whole
> business in exec_execute_message(), just before we do the PortalRun()?
> I mean, that's the source of all our problems, and we know the
> statements that the portal will work on so we could go through the
> list, grab the first planned query and set the query ID based on that,
> without caring about the portal patterns we would need to think about.

Doing the work in exec_execute_message makes sense, although maybe
setting the queryId after pgstat_report_activity is better because it occurs earlier.
Also, we should do the same for exec_bind_message and set the queryId
right after pgstat_report_activity in this function as well.

We do have to account for the queryId changing after cache revalidation, so
we should still set the queryId inside GetCachedPlan in the case the query
underwent re-analysis. This means there is a chance that a queryId set at
the start of the exec_bind_message may be different by the time we complete
the function, in the case the query revalidation results in a different queryId.

See the attached v3.

Regards,

Sami

Вложения

v3-0001-Fix-queryId-reporting-in-Extended-Query-Protocol.patch

Re: query_id, pg_stat_activity, extended query protocol

От

Michael Paquier

Дата:

25 июля, 06:46:30

On Tue, Jul 23, 2024 at 04:00:25PM -0500, Sami Imseih wrote:
> Correct, I also don´t think ExecutorRun is enough. Another reason is we should also
> be setting the queryId during bind, right before planning starts.
> Planning could have significant impact on the server and I think we better
> track the responsible queryId.
>
> I have not tested the holdStore case.  IIUC the holdStore deals with fetching a
> WITH HOLD CURSOR. Why would this matter for this conversation?

Not only, see portal.h.  This matters for holdable cursors,
PORTAL_ONE_RETURNING and PORTAL_UTIL_SELECT.

> Doing the work in exec_execute_message makes sense, although maybe
> setting the queryId after pgstat_report_activity is better because it occurs earlier.
> Also, we should do the same for exec_bind_message and set the queryId
> right after pgstat_report_activity in this function as well.

Sounds fine by me (still need to check all three patterns).

+   if (list_length(psrc->query_list) > 0)
+       pgstat_report_query_id(linitial_node(Query, psrc->query_list)->queryId, false);

Something that slightly worries me is to assume that the first Query
in the query_list is fetched.  Using a foreach() for all three paths
may be better, jumping out at the loop when finding a valid query ID.

I have not looked at that entirely in details, and I'd need to check
if it is possible to use what's here for more predictible tests:
https://www.postgresql.org/message-id/ZqCMCS4HUshUYjGc%40paquier.xyz

> We do have to account for the queryId changing after cache revalidation, so
> we should still set the queryId inside GetCachedPlan in the case the query
> underwent re-analysis. This means there is a chance that a queryId set at
> the start of the exec_bind_message may be different by the time we complete
> the function, in the case the query revalidation results in a different queryId.

Makes sense to me.  I'd rather make that a separate patch, with, if
possible, its own tests (the case of Andrei with a DROP/CREATE TABLE) .
--
Michael

Вложения

signature.asc

Re: query_id, pg_stat_activity, extended query protocol

От

Anthonin Bonnefoy

Дата:

26 июля, 15:39:41

On Thu, Jul 18, 2024 at 10:56 AM Michael Paquier <michael@paquier.xyz> wrote:
> Please attach things to your emails, if your repository disappears for
> a reason or another we would lose knowledge in the archives of the
> community lists.

Noted and thanks for the reminder, I'm still learning about mailing
list etiquette.

> I have not looked at that entirely in details, and I'd need to check
> if it is possible to use what's here for more predictible tests:
> https://www.postgresql.org/message-id/ZqCMCS4HUshUYjGc%40paquier.xyz

For the tests, there are limited possibilities to check whether a
query_id has been set correctly.
- Checking pg_stat_activity is not possible in the regress tests as
you need a second session to check the reported query_id.
- pg_stat_statements can be used indirectly but you're limited to how
pgss uses query_id. For example, it doesn't rely on queryId in
ExecutorRun.

A possible solution I've been thinking of is to use a test module. The
module will assert on whether the queryId is set or not in parse, plan
and executor hooks. It will also check if the queryId reported in
pgstat matches the queryId at the root level.

This allows us to check that the queryId is correctly set with the
extended protocol. I've also found some queries which will trigger a
failure (ctas and cursor usage) though this is probably a different
issue from the extended protocol issue.

Regards,
Anthonin

Вложения

0001-Create-module-for-query_id-test.patch

Re: query_id, pg_stat_activity, extended query protocol

От

Michael Paquier

Дата:

27 июля, 01:36:01

On Fri, Jul 26, 2024 at 02:39:41PM +0200, Anthonin Bonnefoy wrote:
> For the tests, there are limited possibilities to check whether a
> query_id has been set correctly.
> - Checking pg_stat_activity is not possible in the regress tests as
> you need a second session to check the reported query_id.
> - pg_stat_statements can be used indirectly but you're limited to how
> pgss uses query_id. For example, it doesn't rely on queryId in
> ExecutorRun.
>
> A possible solution I've been thinking of is to use a test module. The
> module will assert on whether the queryId is set or not in parse, plan
> and executor hooks. It will also check if the queryId reported in
> pgstat matches the queryId at the root level.

FWIW, I was more thinking in the lines of a TAP test with
PostgreSQL::Test::BackgroundPsql to hold the sessions around while
doing pg_stat_activity lookups.

Using a test module like what you have is really tempting to rely on
the hooks for the work, that's something I'll try to think about more.

We could perhaps push the query ID into a table saving the state that
gets queried in the SQL test, using only assertions is not enough as
this makes the test moot with assertions disabled.  And actually,
there may be a point in just pushing safety assertions to be in the
core backend code, as a HEAD-only improvement.
--
Michael

Вложения

signature.asc

Re: query_id, pg_stat_activity, extended query protocol

От

"Imseih (AWS), Sami"

Дата:

14 августа, 05:40:48

Sounds fine by me (still need to check all three patterns).

+   if (list_length(psrc->query_list) > 0)
+       pgstat_report_query_id(linitial_node(Query, psrc->query_list)->queryId, false);

Something that slightly worries me is to assume that the first Query
in the query_list is fetched.  Using a foreach() for all three paths
may be better, jumping out at the loop when finding a valid query ID.

I cannot see how the inital node would not contain the queryId, but
to be on the safe side, your suggestion makes sense.

Are you thinking something like the below? In the foreach,
check for the first queryId != 0, set the queryId and then
break out of the loop

foreach(lc, psrc->query_list)
{
    Query *query = lfirst_node(Query, lc);
    if (query->queryId != UINT64CONST(0))
    {
        pgstat_report_query_id(query->queryId, false);
        break;
    }
}

We do have to account for the queryId changing after cache revalidation, so
we should still set the queryId inside GetCachedPlan in the case the query
underwent re-analysis. This means there is a chance that a queryId set at
the start of the exec_bind_message may be different by the time we complete
the function, in the case the query revalidation results in a different queryId.

Makes sense to me.  I'd rather make that a separate patch, with, if

I will create a separate patch for this.

possible, its own tests (the case of Andrei with a DROP/CREATE TABLE) .

In terms of testing, there are several options being discussed [1] including
BackgroundPsql and using hooks. I want to add a another idea which
is to rely on compute_plan_id = regress to log if my_query_id is a
non-zero value inside pgstat_report_query_id. Something like below:

@@ -640,6 +641,14 @@ pgstat_report_query_id(uint64 query_id, bool force)
        PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
        beentry->st_query_id = query_id;
        PGSTAT_END_WRITE_ACTIVITY(beentry);
+
+       if (compute_query_id == COMPUTE_QUERY_ID_REGRESS)
+       {
+               int64 queryId = pgstat_get_my_query_id();
+
+               if (queryId != UINT64CONST(0))
+                       elog(DEBUG3, "queryId value is not zero");
+       }

The number of logs can be counted and compared with what
is expected. For example, in simple query, I expect the queryId to be
set once. Using the \bind, I expect the queryId to be set 3 times ( parse/bind/execute).

Specifically for the DROP/CREATE TABLE test, the \parse and \bindx
being proposed in [2] can be used. The table can be dropped and
recreated after the \parse step. If we count the logs, we would expect
a total of 4 logs to be set (parse/bind/revalidation/execution). I think the testing discussion should be moved to a different thread.
What do you think?

Regards,

Sami

[1] https://www.postgresql.org/message-id/ZqQk0WHN8EMBEai9%40paquier.xyz
[2] https://www.postgresql.org/message-id/ZqCMCS4HUshUYjGc@paquier.xyz

Re: query_id, pg_stat_activity, extended query protocol

От

jian he

Дата:

31 августа, 07:47:41

On Thu, Aug 15, 2024 at 5:06 AM Imseih (AWS), Sami <samimseih@gmail.com> wrote:
>
> > I think the testing discussion should be moved to a different thread.
> > What do you think?
> See v4.
>
> 0001 deals with reporting queryId in exec_execute_message and
> exec_bind_message.
> 0002 deals with reporting queryId after a cache invalidation.
>
> There are no tests as this requires more discussion in a separate thread(?)
>

hi.
v4-0001 work as expected. i don't know how to test 0002

In 0001 and 0002, all foreach loops, we can use the new macro foreach_node.
see https://git.postgresql.org/cgit/postgresql.git/commit/?id=14dd0f27d7cd56ffae9ecdbe324965073d01a9ff



the following are the minimum tests I come up with for 0001

/* test \bind queryid exists */
select query_id is not null as query_id_exist
from pg_stat_activity where pid = pg_backend_pid() \bind \g



/* test that \parse \bind_named queryid exists */
select pg_backend_pid() as current_pid \gset pref01_
select query_id is not null as query_id_exist from pg_stat_activity
where pid = $1 \parse stmt11
\bind_named stmt11 :pref01_current_pid \g

Re: query_id, pg_stat_activity, extended query protocol

От

Andrei Lepikhov

Дата:

03 сентября, 01:30:18

On 14/8/2024 23:05, Imseih (AWS), Sami wrote:
>> I think the testing discussion should be moved to a different thread.
>> What do you think?
> See v4.
> 
> 0001 deals with reporting queryId in exec_execute_message and 
> exec_bind_message.
> 0002 deals with reporting queryId after a cache invalidation.
> 
> There are no tests as this requires more discussion in a separate thread(?)
At first, these patches look good.
But I have a feeling of some mess here:
queryId should be initialised at the top-level query. At the same time, 
the RevalidateCachedQuery routine can change this value in the case of 
the query tree re-validation.
You can say that this routine can't be called from a non-top-level query 
right now, except SPI. Yes, but what about extensions or future usage?

-- 
regards, Andrei Lepikhov

Re: query_id, pg_stat_activity, extended query protocol

От

Sami Imseih

Дата:

10 сентября, 08:27:00

> > >> I think the testing discussion should be moved to a different thread.
> > >> What do you think?
> > > See v4.
> > > 
> > > 0001 deals with reporting queryId in exec_execute_message and 
> > > exec_bind_message.
> > > 0002 deals with reporting queryId after a cache invalidation.
> > > 
> > > There are no tests as this requires more discussion in a separate thread(?)
> > At first, these patches look good.
> > But I have a feeling of some mess here:
> > queryId should be initialised at the top-level query. At the same time, 
> > the RevalidateCachedQuery routine can change this value in the case of 
> > the query tree re-validation.
> > You can say that this routine can't be called from a non-top-level query 
> > right now, except SPI. Yes, but what about extensions or future usage?


> This is a valid point. RevalidatePlanCache is forcing a 
> new queryId to be advertised ( 'true' as the second argument to 
> pgstat_report_query_id) . This means,
> v4-0002-Report-new-queryId-after-plancache-re-validation.patch 
> will result in a non top-level queryId being advertised.

An idea would be to add bool field called force_update_qid to 
CachedPlanSource, and this field can be set to 'true' after a call
to CreateCachedPlan. RevalidateCachedQuery will only update
the queryId if this value is 'true'.

For now, only exec_parse_message will set this field to 'true', 
but any caller can decide to set it to 'true' if there are other 
cases in the future.

What do you think?
 
--

Sami

Re: query_id, pg_stat_activity, extended query protocol

От

Sami Imseih

Дата:

12 сентября, 08:41:58

> After sleeping on it, I'd tend to slightly favor the last option in
> the back-branches and the second option on HEAD where we reduce the
> number of report calls. This way, we are a bit more careful in
>released branches by being more aggressive in reporting the query ID.

I agree with this because it will safely allow us to backpatch this
fix. 

> The tests in pg_stat_statements are one part I'm pretty sure is one
> good way forward. It is not perfect, but with the psql meta-commands

I played around with BackgrounsPsql. It works and gives us more flexibility
in testing, but I think the pg_stat_statements test are good enough for this
purpose. 

My only concern is this approach tests core functionality ( reporting of queryId )
in the tests of a contrib module ( pg_stat_statements ). Is that a valid
concern?

> Perhaps. I'd need to think through this one. Let's do things in
> order and see about the reports for the bind/execute messages, first,
> please?

Sure, that is fine.


Regards,

Sami

Re: query_id, pg_stat_activity, extended query protocol

От

Sami Imseih

Дата:

13 сентября, 02:58:27

> Do you think that we'd better replace the calls reporting the query ID
> in execMain.c by some assertions on HEAD? This won't work for
> ExecutorStart() because PREPARE statements (or actually EXECUTE,
> e.g. I bumped on that yesterday but I don't recall which one) would

Yes, adding the asserts in execMain.c is better, but there is complications
there due to the issue you mention. I think the issue you are bumping into
is when pg_stat_statements.track_utility = on ( default ), the assert in 
ExecutorStart will fail on EXECUTE. I believe it's because ( need to verify )
pg_stat_statements.c sets the queryId = 0 in the ProcessUtility hook [1].

> So your point would be to force this rule within the core executor on
> HEAD? I would not object to that in case we're missing more spots
> with the extended query protocol, actually. That would help us detect
> cases where we're still missing the query ID to be set and the
> executor should know about that.

Yes, but looking at how pg_stat_statements works with PREPARE/EXECUTE, 
I am now thinking it's better to Just keep the tests in pg_stat_statements. 
Having test coverage in pg_stat_statements is better than nothing, and
check-world ( or similar ) will be able to cacth such failures.


> Note that I'm not much worried about the dependency with
> pg_stat_statements. We already rely on it for query jumbling
> normalization for some parse node patterns like DISCARD, and query
> jumbling requires query IDs to be around. So that's not new.

Good point.

[1] https://github.com/postgres/postgres/blob/master/contrib/pg_stat_statements/pg_stat_statements.c#L1127-L1128

Regards,

Sami

Re: query_id, pg_stat_activity, extended query protocol

От

Sami Imseih

Дата:

18 сентября, 04:01:18

> Then, please see attached two lightly-updated patches. 0001 is for a
> backpatch down to v14. This is yours to force things in the exec and
> bind messages for all portal types, with the test (placed elsewhere in
> 14~15 branches). 0002 is for HEAD to add some sanity checks, blowing
> up the tests of pg_stat_statements if one is not careful with the
> query ID reporting.

These 2 patches look good to me; except for the slight typo
In the commit message of 0002. "backpatch" instead of "backpatck".

That leaves us with considering v5-0002 [1]. I do think this is good
for overall correctness of the queryId being advertised after a cache 
revalidation, even if users of pg_stat_activity will hardly notice this.

[1] https://www.postgresql.org/message-id/DB325894-3EE3-4B2E-A18C-4B34E7B2F5EC%40gmail.com 


Regards,

Sami

Re: query_id, pg_stat_activity, extended query protocol

От

Sami Imseih

Дата:

18 сентября, 05:39:17

> would help to grab a query ID. A second option I have in mind would
> be to set up an injection point that produces a NOTICE if a query ID
> is set when we end processing an execute message, then check the
> number of NOTICE messages produced as these can be predictible
> depending on the number of rows and the fetch size.. This won't fly
> far unless we can control the fetch size.

FWIW, I do like the INJECTION_POINT idea and actually mentioned something 
similar up the thread [1] for the revalidate cache case, but I can see it being applied
to all the other places we expect the queryId to be set. 


[1] https://www.postgresql.org/message-id/465EECA3-D98C-4E46-BBDB-4D057617DD89%40gmail.com

--

Sami

Re: query_id, pg_stat_activity, extended query protocol

От

Sami Imseih

Дата:

19 сентября, 02:13:10

> So, I have applied 0001 down to 14, followed by 0002 on HEAD.

Thank you!

> 0002 is going to be interesting to see moving forward. I am wondering
> how existing out-of-core extensions will react on that and if it will
> help catching up any issues. So, let's see how the experiment goes
> with HEAD on this side. Perhaps we'll have to revert 0002 at the end,
> or perhaps not...

If an extension breaks because of this, then it's doing something wrong __
Let's see what happens.

--
Sami

Re: query_id, pg_stat_activity, extended query protocol

От

Sami Imseih

Дата:

19 сентября, 02:14:07

> By the way, with the main issue fixed as of 933848d16dc9, could it be
> possible to deal with the plan cache part in a separate thread? This
> is from the start a separate thread to me, and we've done quite a bit
> here already.

Agree, will do start a new thread.

-- 

Sami

Re: query_id, pg_stat_activity, extended query protocol

От

Alexander Lakhin

Дата:

25 сентября, 20:00:00

Hello Michael and Sami,

18.09.2024 08:46, Michael Paquier wrote:
> So, I have applied 0001 down to 14, followed by 0002 on HEAD.
>

Please look at the script, which triggers Assert added by 24f520594:
(assuming shared_preload_libraries=pg_stat_statements)
SELECT repeat('x', 100) INTO t FROM generate_series(1, 100000);
CREATE FUNCTION f() RETURNS int LANGUAGE sql IMMUTABLE RETURN 0;
CREATE INDEX ON t(f());

TRAP: failed Assert("!IsQueryIdEnabled() || pgstat_get_my_query_id() != 0"), File: "execMain.c", Line: 300, PID:
1288609
ExceptionalCondition at assert.c:52:13
ExecutorRun at execMain.c:302:6
postquel_getnext at functions.c:903:24
fmgr_sql at functions.c:1198:15
ExecInterpExpr at execExprInterp.c:746:8
ExecInterpExprStillValid at execExprInterp.c:2034:1
ExecEvalExprSwitchContext at executor.h:367:13
evaluate_expr at clauses.c:4997:14
evaluate_function at clauses.c:4505:1
simplify_function at clauses.c:4092:12
eval_const_expressions_mutator at clauses.c:2591:14
expression_tree_mutator_impl at nodeFuncs.c:3550:12
eval_const_expressions_mutator at clauses.c:3712:1
eval_const_expressions at clauses.c:2267:1
RelationGetIndexExpressions at relcache.c:5079:20
BuildIndexInfo at index.c:2426:7
...

Best regards,
Alexander

Re: query_id, pg_stat_activity, extended query protocol

От

Sami Imseih

Дата:

27 сентября, 04:46:27

> Attached is the patch I am finishing with, with some new tests for
> BRIN and btree to force parallel builds with immutable expressions

> through functions.

glad to see the asserts are working as expected ad finding these issues.

I took a look at the patch and tested it. It looks good. My only concern

is the stability of using min_parallel_table_scan_size = 0. Will it always

guarantee parallel workers? Can we print some debugging that proves

a parallel worker was spun up?

Something like this I get with DEBUG1

postgres=*# CREATE INDEX btree_test_expr_idx ON btree_test_expr USING btree (btree_test_func());
DEBUG: building index "btree_test_expr_idx" on table "btree_test_expr" with request for 1 parallel workers

Also, we can just set the max_parallel_maintenance_workers to 1.

What do you think?

Regards,

Sami

DEBUG: building index "btree_test_expr_idx" on table "btree_test_expr" with request for 1 parallel workers

On Wed, Sep 25, 2024 at 8:08 PM Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Sep 25, 2024 at 05:00:00PM +0300, Alexander Lakhin wrote:
> Please look at the script, which triggers Assert added by 24f520594:
> (assuming shared_preload_libraries=pg_stat_statements)

Or just compute_query_id = on.

> SELECT repeat('x', 100) INTO t FROM generate_series(1, 100000);
> CREATE FUNCTION f() RETURNS int LANGUAGE sql IMMUTABLE RETURN 0;
> CREATE INDEX ON t(f());
>
> TRAP: failed Assert("!IsQueryIdEnabled() || pgstat_get_my_query_id() != 0"), File: "execMain.c", Line: 300, PID: 1288609
> ExceptionalCondition at assert.c:52:13
> ExecutorRun at execMain.c:302:6
> postquel_getnext at functions.c:903:24
> fmgr_sql at functions.c:1198:15
> ExecInterpExpr at execExprInterp.c:746:8
> ExecInterpExprStillValid at execExprInterp.c:2034:1
> ExecEvalExprSwitchContext at executor.h:367:13

And this assertion is doing the job I want it to do, because it is
telling us that we are not setting a query ID when doing a parallel
btree build. The query string that we would report at the beginning
of _bt_parallel_build_main() is passed down as a parameter, but not
the query ID. Hence pg_stat_activity would report a NULL query ID
when spawning parallel workers in this cases, even if there is a query
string.

The same can be said for the parallel build for BRIN, that uses a lot
of logic taken from btree for there parallel parameters, and even
vacuum, as it goes through a parse analyze where its query ID would be
set. but that's missed in the workers.

See _bt_parallel_build_main(), _brin_parallel_build_main() and
parallel_vacuum_main() which are the entry point used by the workers
for all of them. For BRIN, note that I can get the same failure with
the following query, based on the table of your previous test that
would spawn a worker:
CREATE INDEX foo ON t using brin(f());

The recovery test 027_stream_regress.pl not catching these failures
means that we don't have tests with an index expression for such
parallel builds, or the assertion would have triggered. It looks like
this is just because we don't do a parallel btree build with an index
expression where we need to go through the executor to build its
IndexInfo.

Note that parallel workers launched by execParallel.c pass down the
query ID in a minimal PlannedStmt where we use pgstat_get_my_query_id,
so let's do the same for all these.

Attached is the patch I am finishing with, with some new tests for
BRIN and btree to force parallel builds with immutable expressions
through functions. These fail the assertions in the recovery TAP
test. It may be a good idea to keep these tests in the long-term
anyway. It took me a few minutes to find out that
min_parallel_table_scan_size and max_parallel_maintenance_workers was
enough to force workers to spawn even if tables have no data, to make
the tests cheaper.

Thoughts or comments?
--
Michael

Re: query_id, pg_stat_activity, extended query protocol

От

"Imseih (AWS), Sami"

Дата:

27 сентября, 04:55:37

> Attached is the patch I am finishing with, with some new tests for
> BRIN and btree to force parallel builds with immutable expressions
> through functions.

Sorry about my last reply. Not sure what happened with my email client.
Here it is again.

glad to see the asserts are working as expected ad finding these issues.
I took a look at the patch and tested it. It looks good. My only concern
is the stability of using min_parallel_table_scan_size = 0. Will it always
guarantee parallel workers? Can we print some debugging that proves
a parallel worker was spun up?

Something like this I get with DEBUG1

DEBUG: building index "btree_test_expr_idx" on table "btree_test_expr" with request for 1 parallel workers

What do you think?

Regards,

Sami

Re: query_id, pg_stat_activity, extended query protocol

От

Sami Imseih

Дата:

27 сентября, 10:01:12

> I am not sure. The GUCs pretty much enforce this behavior and I doubt
> that these are going to break moving on. Of course they would, but we
> are usually careful enough about that as long as it is possible to
> grep for them. For example see the BRIN case in pageinspect.

Yes, I see pageinspect does the same thing for the BRIN case.
That is probably OK for this case also.

> The usual method for output that we use to confirm parallelism would
> be EXPLAIN. Perhaps a potential target for CREATE INDEX now that it
> supports more modes? I don't know if that's worth it, just throwing
> one idea in the bucket of ideas.

Not sure about EXPLAIN for CREATE INDEX, since it's not a plannable
statement.

Maybe a CREATE INDEX VERBOSE, just Like  ANALYZE VERBOSE, 
VACUUM VERBOSE, etc. This will show the step that an index 
build is on (CONCURRENTLY has many steps), and can also show 
if parallel workers are launched for the index build.

--

Sami

Re: query_id, pg_stat_activity, extended query protocol

От

Alexander Lakhin

Дата:

02 октября, 01:00:00

Hello Michael,

01.10.2024 05:07, Michael Paquier wrote:
> On Mon, Sep 30, 2024 at 10:07:55AM +0900, Michael Paquier wrote:
> ...
> And done this part.

If I'm not missing something, all the patches discussed here are committed
now, so maybe I've encountered a new anomaly.

Please try the following script:
BEGIN;
PREPARE s AS SELECT 1;
SELECT $1 \bind 1 \g
EXECUTE s;

It produces for me:
TRAP: failed Assert("!IsQueryIdEnabled() || !pgstat_track_activities || !debug_query_string || pgstat_get_my_query_id()

!= 0"), File: "execMain.c", Line: 423, PID: 1296466

Best regards,
Alexander

Re: query_id, pg_stat_activity, extended query protocol

От

Alexander Lakhin

Дата:

02 октября, 09:00:00

Hello Michael,

02.10.2024 03:52, Michael Paquier wrote:
> Alexander, I've thought about a couple of fancy cases for
> ExecutorRun() but I could not break it.  Perhaps you have something in
> your sleeve that would also break this case?
> --

Fortunately, it's still pretty warm here, so I'm wearing T-shirt and my
sleeve isn't long enough for that, but if you gave me 2-3 days, I would
focus on researching this area...

Best regards,
Alexander

Re: query_id, pg_stat_activity, extended query protocol

От

Alexander Lakhin

Дата:

04 октября, 01:00:00

Hello Michael,

02.10.2024 06:29, Michael Paquier wrote:
> On Wed, Oct 02, 2024 at 06:00:00AM +0300, Alexander Lakhin wrote:
>> Fortunately, it's still pretty warm here, so I'm wearing T-shirt and my
>> sleeve isn't long enough for that, but if you gave me 2-3 days, I would
>> focus on researching this area...
> Sure, thanks.  I am also spending a few days thinking about patterns
> around that before doing anything.  The buildfarm is green, so there
> is some flebxibility.

I've managed to falsify the Assert in ExecutorRun() with the following:
SET compute_query_id = 'off';
SET track_activities = 'off';

CREATE PROCEDURE p1()
LANGUAGE plpgsql
AS $$
BEGIN
     PERFORM 1;
END;
$$;

CREATE PROCEDURE p2(x int)
LANGUAGE plpgsql
AS $$
BEGIN
   IF x = 1 THEN
     SET track_activities = 'on';
     SET compute_query_id = 'on';
   END IF;
   CALL p1();
END;
$$;

CALL p2(0);

CALL p2(1);

TRAP: failed Assert("!IsQueryIdEnabled() || !pgstat_track_activities || !debug_query_string || pgstat_get_my_query_id()

!= 0"), File: "execMain.c", Line: 312, PID: 3765791

Best regards,
Alexander

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: query_id, pg_stat_activity, extended query protocol

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения