pg_stat_statements: faster search by queryid
От | Karina Litskevich |
---|---|
Тема | pg_stat_statements: faster search by queryid |
Дата | |
Msg-id | CACiT8iaL-x=KtE6L1EFfS0jwxdwFQ8=uQjut1Q0o=5xDJdb-uQ@mail.gmail.com обсуждение исходный текст |
Ответы |
Re: pg_stat_statements: faster search by queryid
|
Список | pgsql-hackers |
Hi hackers, Aleksandra Bondar and I are proposing the following patch for pg_stat_statements. The idea. --------- Currently, to get statistics on a specific query, you should execute SELECT * FROM pg_stat_statements WHERE queryid = specific_queryid; This takes a long time because the pg_stat_statements() function forms tuples for all statistics it has first, and then they are filtered by the WHERE clause. If we provide a function like pg_stat_statements_by_queryid(queryid bigint), which would filter statistics by queryid while scanning pgss_hash and return only statistics with the specified queryid, that would be much faster. We can also easily add filtration by userid and dbid, which would lead us to a function like pg_stat_statements_filtered(queryid bigint, userid Oid, dbid Oid). In case some parameter is not specified, its default value is 0, and it means that no filtration is needed on this parameter. Kind of like pg_stat_statements_reset() chooses what statistics should be cleaned. If no parameter is specified, pg_stat_statements_filtered() should return all statistics that pg_stat_statements() would return. This led me to the idea that we should rather extend the pg_stat_statements() function than add a new function. The old way to call pg_stat_statements() will produce the same results, and specifying new parameters will produce filtered results. The patch. ---------- The extended pg_stat_statements() function idea is implemented in the patch attached. I can always rewrite the patch to add a new function and leave pg_stat_statements() as it is, though, if you think it's better to have a separate function for filtering. We've only written the code so far and want to get your opinion on that. If you like the idea, we'll also provide tests and docs. Any suggestions are welcome. Benchmarking. ------------- We prepared a simple test case here to show performance improvement. Download the attached script pg_stat_statements_prepare.sql and run the following in psql. CREATE EXTENSION pg_stat_statements; -- Fill in pg_stat_statements statistics \i /path/to/pgpro_stats_prepare_script.sql -- Get random query ID SELECT queryid AS rand_queryid FROM pg_stat_statements WHERE queryid IS NOT NULL ORDER BY random() LIMIT 1 \gset -- Turn on time measuring \timing -- Get statistics in the old way SELECT * FROM pg_stat_statements WHERE queryid = :rand_queryid; -- Get statistics in the new way SELECT * FROM pg_stat_statements(true, queryid => :rand_queryid); I'm getting that the new way is at least two times faster on my machine. I also compared the time for the old way on master with and without the patch. I get that the difference is within standard deviation. Best regards, Karina Litskevich Postgres Professional: http://postgrespro.com/
Вложения
В списке pgsql-hackers по дате отправления: