Re: Generating code for query jumbling through gen_node_support.pl
From | Michael Paquier
---|---
Subject | Re: Generating code for query jumbling through gen_node_support.pl
Date |
Msg-id | Y9M+C2obzgRnXPNd@paquier.xyz
In reply to | Re: Generating code for query jumbling through gen_node_support.pl (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
Responses | Re: Generating code for query jumbling through gen_node_support.pl (Michael Paquier <michael@paquier.xyz>)
 | Re: Generating code for query jumbling through gen_node_support.pl (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
List | pgsql-hackers
On Thu, Jan 26, 2023 at 09:37:13AM +0100, Peter Eisentraut wrote:
> Ok, the documentation make sense now.  I wonder what the performance impact
> is.  Probably, nobody cares about microoptimizing CREATE TABLE statements.
> But BEGIN/COMMIT could matter.  However, whatever you do in between the
> BEGIN and COMMIT will already be jumbled, so you're already paying the
> overhead.  Hopefully, jumbling such simple commands will have no noticeable
> overhead.
>
> In other words, we should test this and hopefully get rid of the 'string'
> method.

Yep.  I have mentioned a few numbers upthread, and this deserves discussion.

FYI, I have done more micro-benchmarking to compare both methods for utility queries by hijacking JumbleQuery() to run the computation in a tight loop N times (I could not come up with a better idea to avoid the repeated palloc/pfree overhead), as the path to stress is _jumbleNode().  See the attached, which should apply on top of the latest patch set (named as .txt so as not to feed it to the CF bot; a recompile is needed to switch the iteration count).
Using that, I can compile the following results for various cases (-O2 and compute_query_id=on):

          query          |  mode  | iterations | avg_runtime_ns | avg_jumble_ns
-------------------------+--------+------------+----------------+---------------
 begin                   | string |   50000000 |        4.53116 |          4.54
 begin                   | jumble |   50000000 |       30.94578 |         30.94
 commit                  | string |   50000000 |        4.76004 |          4.74
 commit                  | jumble |   50000000 |        31.4791 |         31.48
 create table 1 column   | string |   50000000 |        7.22836 |          7.08
 create table 1 column   | jumble |   50000000 |      152.10852 |        151.96
 create table 5 columns  | string |   50000000 |       12.43412 |         12.28
 create table 5 columns  | jumble |   50000000 |      352.88976 |         349.1
 create table 20 columns | string |    5000000 |         49.591 |          48.2
 create table 20 columns | jumble |    5000000 |      2272.4066 |          2271
 drop table 1 column     | string |   50000000 |        6.70538 |          6.56
 drop table 1 column     | jumble |   50000000 |          50.38 |         50.24
 drop table 5 columns    | string |   50000000 |        6.88256 |          6.74
 drop table 5 columns    | jumble |   50000000 |       50.02898 |          49.9
 SET work_mem            | string |   50000000 |        7.28752 |          7.28
 SET work_mem            | jumble |   50000000 |       91.66588 |         91.64
(16 rows)

avg_runtime_ns is (query runtime / iterations), and avg_jumble_ns is the same, computed from the difference between the start/end logs in the txt patch attached.  The overhead of running the query itself does not matter much when comparing the two.

The time it takes to compute a jumble correlates with the number of nodes to walk for each query, and the gap widens as the node count grows.  A simple "begin" or "commit" query sees its computation time increase from 4ns to 30ns on average, which would be unnoticeable.  The gap is larger for bigger nodes, like SET, though we still only jump from 7ns to 90ns in this case.  DDLs take the biggest hit with this method: a 20-column CREATE TABLE jumps from 50ns to 2us (note that the iteration count is 10 times lower here).  At the end, that would be unnoticeable for the average user, I guess, but here are the numbers I get on my laptop :)
--
Michael
Attachments