Обсуждение: {PROPOSAL] add session information column to pg_stat_statements

Поиск
Список
Период
Сортировка

{PROPOSAL] add session information column to pg_stat_statements

От
Sergei Agalakov
Дата:
I have renamed the thread [PROPOSAL] extend the object names to the 
qualified names in pg_stat_statements
started on
https://www.postgresql.org/message-id/9baf5c06-d6ab-c688-010c-843348e3d98c%40gmail.com
and ended on
https://www.postgresql.org/message-id/c93bb5ce-22bd-eb6b-a057-d0666585258f%40gmail.com

Currently pg_stat_statements doesn't have enough information to distinct 
queries executed in the different environment;
for example the queries with the same text from pg_stat_statements.query 
column can be the different queries if they were executed with
the different search path parameter.
The initial proposed solution was to extend names of the objects in the 
query to the qualified names, so for the text
"select * from t1"
from the pg_stat_statements.query column the new proposed column 
pg_stat_statements.query_qn would have the text
"select * from s1.t1"
Based on the discussion this solution has proved to be
a) relatively difficult to implement and slow to execute
b) resolves only the missed schema name problem but other differences in 
the execution environment would require some new extra columns

So the new proposed change addresses these concerns by been
a) faster (probably)
b) extensible
and, of course, it is backward compatible with the existing 
pg_stat_statements view.

We can add a column pg_stat_statements.session_info jsonb.
Its content can be defined by the new configuration parameter
pg_stat_statements.session_info ('current_schemas, current_user, 
session_user') // a subset of the data from the system information functions
or in the initial implementation it can be hardcoded to include at least 
current_schemas and current user.

and it will have data like
{
      "current_schemas" : ["pg_catalog", "s1", "s2", "public"],
      "current_user" : "user1",
      "session_user" : "user1"
}

It will allow the DBA/developer to understand and reproduce a 
performance issue, and will allow the deeper level of granularity for 
the reporting tools.

I don't know how difficult will be to implement something like that.


Thank you,

Sergei




Re: {PROPOSAL] add session information column to pg_stat_statements

От
legrand legrand
Дата:
I'm also very interested by collecting "search_path" information for
statements,
but this information may not be unique for pg_stat_statements key
(dbid,userid,queryid) ...

How would this 1-N relation be handled ?
1/ just catch initial session_info for each (dbid,userid,queryid),
2/ adding a sessid field in pgss key, and a table for distinct
sessid,session_info values,
3/ store session_info in an array,
4/ ...






--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html


Re: {PROPOSAL] add session information column to pg_stat_statements

От
Sergei Agalakov
Дата:
On 12/2/2018 2:22 AM, legrand legrand wrote:
> I'm also very interested by collecting "search_path" information for
> statements,
> but this information may not be unique for pg_stat_statements key
> (dbid,userid,queryid) ...
>
> How would this 1-N relation be handled ?
> 1/ just catch initial session_info for each (dbid,userid,queryid),
> 2/ adding a sessid field in pgss key, and a table for distinct
> sessid,session_info values,
> 3/ store session_info in an array,
> 4/ ...
>
According to documentation queryid is an "Internal hash code, computed 
from the statement's parse tree"
so I expect it to be different for the same query with different 
current_schemas or current_user
outside of the possible hash collisions if the changes current_schemas 
or current_user in ant way affect the execution of the query.
Queryid can be different even for the same combination of 
pg_stat_statements.query + pg_stat_statements.session_info (objects can 
be dropped and recreated  etc.)
I would say that option 1 makes the most sense.

When a user or an application changes search_path or session_user he/she 
expects that it will affect the execution of at least some queries,
that's why these setting were introduced in the first place.
If you know that you execute the same queries on the different schemas 
then you would like to group your statistics by query+current_schemas.
If you know that you change current_user for row level security then you 
would like to group your statistics by query+current_user.
But you need to preserve this session_info for every queryid in 
pg_stat_statements.

Thank you,

Sergei