Better management of mergejoinable operators
От | Tom Lane |
---|---|
Тема | Better management of mergejoinable operators |
Дата | |
Msg-id | 27157.1165964196@sss.pgh.pa.us обсуждение исходный текст |
Ответы |
Re: Better management of mergejoinable operators
|
Список | pgsql-hackers |
I noticed today that process_implied_equality() still contains an ugly hack that should have been got rid of awhile ago: it assumes that mergejoinable operators are named "=". This has been a bogus assumption for several releases, as illustrated by this failure: regression=# select * from text_tbl a,text_tbl b,text_tbl c where a.f1 ~=~ b.f1 and b.f1 ~=~ c.f1; ERROR: equality operator for types text and text should be merge-joinable, but isn't It can also be fooled by schema-search-path issues, if the needed operator exists but isn't in the path. Since we've not heard complaints from the field about this, I'm not feeling urgent about having a back-patchable solution, but I want to find one going forward. What is actually needed in this function is to be able to find a mergejoinable equality operator whose oprlsortop and oprrsortop are the two sortops already known for the input pathkey columns. We have a couple of problems doing that though: first, with the present system catalog layout there seems no way to do that short of a seqscan through all of pg_operator; and second, what if there's not a unique answer, ie, multiple equality operators alleging the same lsortop/rsortop? Right offhand I cannot see a reason why there should be different equality operators with the same sortops. (If anyone can come up with a plausible scenario for that, stop me here...) So what I'm thinking about is a unique index on oprlsortop/oprrsortop; that would both allow efficient search, and prevent multiple answers. Now we can't do that directly because most of the entries in pg_operator in fact contain zeroes in these columns, and would cause uniqueness failures. Probably the cleanest answer would be to allow these two columns to be NULL, not zero, when not meaningful; but that would be a bit of a mess to implement because of the code's assumption of fixed layout for pg_operator tuples. What I'm considering doing is moving the oprlsortop/oprrsortop/ oprltcmpop/oprgtcmpop fields out of pg_operator and into a new auxiliary catalog, named say pg_mergejoinop, that would have entries only for mergejoinable equality operators. This would have the same kind of relationship to pg_operator that pg_aggregate has to pg_proc: if a pg_operator entry has "oprcanmerge" true, then there's an extension row for it in pg_mergejoinop. The catalog would be fairly small and cheap to search (48 entries in a default install, as of CVS head), and could support a unique index to constrain the oprlsortop/oprrsortop columns. Comments? regards, tom lane
В списке pgsql-hackers по дате отправления: