Обсуждение: {PROPOSAL] add session information column to pg_stat_statements
I have renamed the thread [PROPOSAL] extend the object names to the qualified names in pg_stat_statements started on https://www.postgresql.org/message-id/9baf5c06-d6ab-c688-010c-843348e3d98c%40gmail.com and ended on https://www.postgresql.org/message-id/c93bb5ce-22bd-eb6b-a057-d0666585258f%40gmail.com Currently pg_stat_statements doesn't have enough information to distinct queries executed in the different environment; for example the queries with the same text from pg_stat_statements.query column can be the different queries if they were executed with the different search path parameter. The initial proposed solution was to extend names of the objects in the query to the qualified names, so for the text "select * from t1" from the pg_stat_statements.query column the new proposed column pg_stat_statements.query_qn would have the text "select * from s1.t1" Based on the discussion this solution has proved to be a) relatively difficult to implement and slow to execute b) resolves only the missed schema name problem but other differences in the execution environment would require some new extra columns So the new proposed change addresses these concerns by been a) faster (probably) b) extensible and, of course, it is backward compatible with the existing pg_stat_statements view. We can add a column pg_stat_statements.session_info jsonb. Its content can be defined by the new configuration parameter pg_stat_statements.session_info ('current_schemas, current_user, session_user') // a subset of the data from the system information functions or in the initial implementation it can be hardcoded to include at least current_schemas and current user. and it will have data like { "current_schemas" : ["pg_catalog", "s1", "s2", "public"], "current_user" : "user1", "session_user" : "user1" } It will allow the DBA/developer to understand and reproduce a performance issue, and will allow the deeper level of granularity for the reporting tools. I don't know how difficult will be to implement something like that. Thank you, Sergei
I'm also very interested by collecting "search_path" information for statements, but this information may not be unique for pg_stat_statements key (dbid,userid,queryid) ... How would this 1-N relation be handled ? 1/ just catch initial session_info for each (dbid,userid,queryid), 2/ adding a sessid field in pgss key, and a table for distinct sessid,session_info values, 3/ store session_info in an array, 4/ ... -- Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
On 12/2/2018 2:22 AM, legrand legrand wrote: > I'm also very interested by collecting "search_path" information for > statements, > but this information may not be unique for pg_stat_statements key > (dbid,userid,queryid) ... > > How would this 1-N relation be handled ? > 1/ just catch initial session_info for each (dbid,userid,queryid), > 2/ adding a sessid field in pgss key, and a table for distinct > sessid,session_info values, > 3/ store session_info in an array, > 4/ ... > According to documentation queryid is an "Internal hash code, computed from the statement's parse tree" so I expect it to be different for the same query with different current_schemas or current_user outside of the possible hash collisions if the changes current_schemas or current_user in ant way affect the execution of the query. Queryid can be different even for the same combination of pg_stat_statements.query + pg_stat_statements.session_info (objects can be dropped and recreated etc.) I would say that option 1 makes the most sense. When a user or an application changes search_path or session_user he/she expects that it will affect the execution of at least some queries, that's why these setting were introduced in the first place. If you know that you execute the same queries on the different schemas then you would like to group your statistics by query+current_schemas. If you know that you change current_user for row level security then you would like to group your statistics by query+current_user. But you need to preserve this session_info for every queryid in pg_stat_statements. Thank you, Sergei