automating RangeTblEntry node support
От | Peter Eisentraut |
---|---|
Тема | automating RangeTblEntry node support |
Дата | |
Msg-id | 4b27fc50-8cd6-46f5-ab20-88dbaadca645@eisentraut.org обсуждение исходный текст |
Ответы |
Re: automating RangeTblEntry node support
Re: automating RangeTblEntry node support |
Список | pgsql-hackers |
I have been looking into what it would take to get rid of the custom_read_write and custom_query_jumble for the RangeTblEntry node type. This is one of the larger and more complex exceptions left. (Similar considerations would also apply to the Constraint node type.) Allegedly, only certain fields of RangeTblEntry are valid based on rtekind. But exactly which ones seems to be documented and handled inconsistently. It seems that over time, new RTE kinds have "borrowed" fields that notionally belong to other RTE kinds, which is technically not a problem but creates a bit of a mess when trying to understand all this. I have some WIP patches to accompany this discussion. Let's start with the jumble function. I suppose that this was just carried over from the pg_stat_statements-specific code without any detailed review. For example, the "inh" field is documented to be valid in all RTEs, but it's only jumbled for RTE_RELATION. The "lateral" field isn't looked at at all. I wouldn't be surprised if there are more cases like this. In the first attached patch, I remove _jumbleRangeTblEntry() and instead add per-field query_jumble_ignore annotations to approximately match the behavior of the previous custom code. The pg_stat_statements test suite has some coverage of this. I get rid of switch on rtekind; this should be technically correct, since we do the equal and copy functions like this also. So for example the "inh" field is now considered in each case. But I left "lateral" alone. I suspect several of these new query_jumble_ignore should actually be dropped because the code was wrong before. In the second patch, I'm removing the switch on rtekind from range_table_mutator_impl(). This should be fine because all the subroutines can handle unset/NULL fields. And it removes one more place that needs to track knowledge about which fields are valid when. In the third patch, I'm removing the custom read/write functions for RangeTblEntry. Those functions wanted to have a few fields at the front to make the dump more legible; I'm doing that now by moving the fields up in the actual struct. Not done here, but something we should do is restructure the documentation of RangeTblEntry itself. I'm still thinking about the best way to structure this, but I'm thinking more like noting for each field when it's used, instead by block like it is now, which makes it awkward if a new RTE wants to borrow some fields. Now one could probably rightfully complain that having all these unused fields dumped would make the RangeTblEntry serialization bigger. I'm not sure who big of a problem that actually is, considering how many often-unset fields other node types have. But it deserves some consideration. I think the best way to work around that would be to have a mode that omits fields that have their default value (zero). This would be more generally useful; for example Query also has a bunch of fields that are not often set. I think this would be pretty easy to implement, for example like #define WRITE_INT_FIELD(fldname) \ if (full_mode || node->fldname) \ appendStringInfo(str, " :" CppAsString(fldname) " %d", node->fldname) There is also the discussion over at [0] about larger redesigns of the node serialization format. I'm also interested in that, but here I'm mainly trying to remove more special cases to make that kind of work easier in the future. Any thoughts about the direction? [0]: https://www.postgresql.org/message-id/flat/CACxu%3DvL_SD%3DWJiFSJyyBuZAp_2v_XBqb1x9JBiqz52a_g9z3jA%40mail.gmail.com
Вложения
В списке pgsql-hackers по дате отправления: