Generating code for query jumbling through gen_node_support.pl
От | Michael Paquier |
---|---|
Тема | Generating code for query jumbling through gen_node_support.pl |
Дата | |
Msg-id | Y5BHOUhX3zTH/ig6@paquier.xyz обсуждение исходный текст |
Ответы |
Re: Generating code for query jumbling through gen_node_support.pl
Re: Generating code for query jumbling through gen_node_support.pl |
Список | pgsql-hackers |
Hi all, This thread is a follow-up of the recent discussion about query jumbling with DDL statements, where the conclusion was that we'd want to generate all this code automatically for all the nodes: https://www.postgresql.org/message-id/36e5bffe-e989-194f-85c8-06e7bc88e6f7@amazon.com What this patch allows to do it to compute the same query ID for utility statements using their parsed Node state instead of their string, meaning that things like "BEGIN", "bEGIN" or "begin" would be treated the same, for example. But the main idea is not only that. I have implemented that as of the attached, where the following things are done: - queryjumble.c is moved to src/backend/nodes/, to stick with the other things for node equal/read/write/copy, renamed to jumblefuncs.c. - gen_node_support.c is extended to generate the functions and the switch for the jumbling. There are a few exceptions, as of the Lists and RangeTblEntry to do the jumbling consistently. - Two pg_node_attr() are added in consistency with the existing ones: no_jumble to discard completely a node from the the query jumbling and jumble_ignore to discard one field from the jumble. The patch is in a rather good shape, passes check-world and the CI, but there are a few things that need to be discussed IMO. Things could be perhaps divided in more patches, now the areas touched are quite different so it did not look like a big deal to me as the changes touch different areas and are straight-forward. The location of the Nodes is quite invasive because we only care about that for T_Const now in the query jumbling, and this could be compensated with a third pg_node_attr() that tracks for the "int location" of a Node whether it should participate in the jumbling or not. There is also an argument where we would want to not include by default new fields added to a Node, but that would require added more pg_node_attr() than what's done here. Note that the plan is to extend the normalization to some other parts of the Nodes, like CALL and SET as mentioned on the other thread. I have done nothing about that yet but doing so can be done in a few lines with the facility presented here (aka just add a location field). Hence, the normalization is consistent with the existing queryjumble.c for the fields and the nodes processed. In this patch, things are done so as the query ID is not computed anymore from the query string but from the Query. I still need to study the performance impact of that with short queries. If things prove to be noticeable in some cases, this stuff could be switched to use a new GUC where we could have a code path for the computation of utilityStmt using its string as a fallback. I am not sure that I want to enter in this level of complications, though, to keep things simple, but that's yet to be done. A bit more could be cut but pg_ident got in the way.. There are also a few things for pg_stat_statements where a query ID of 0 can be implied for utility statements in some cases. Generating this code leads to an overall removal of code as what queryjumble.c is generated automatically: 13 files changed, 901 insertions(+), 1113 deletions(-) I am adding that to the next commit fest. Thoughts? -- Michael
Вложения
В списке pgsql-hackers по дате отправления: