Re: POC, WIP: OR-clause support for indexes
From | Alena Rybakina |
---|---|
Subject | Re: POC, WIP: OR-clause support for indexes |
Date | |
Msg-id | 16a7e3aa-24c0-4986-8820-ea2857bd7b6b@postgrespro.ru |
In reply to | Re: POC, WIP: OR-clause support for indexes (Alexander Korotkov <aekorotkov@gmail.com>) |
Replies | Re: POC, WIP: OR-clause support for indexes |
List | pgsql-hackers |
On 30.10.2023 17:06, Alexander Korotkov wrote:
> On Mon, Oct 30, 2023 at 3:40 PM Robert Haas <robertmhaas@gmail.com> wrote:
>> On Thu, Oct 26, 2023 at 5:05 PM Peter Geoghegan <pg@bowt.ie> wrote:
>>> On Thu, Oct 26, 2023 at 12:59 PM Robert Haas <robertmhaas@gmail.com> wrote:
>>>> Alexander's example seems to show that it's not that simple. If I'm
>>>> reading his example correctly, with things like aid = 1, the
>>>> transformation usually wins even if the number of things in the OR
>>>> expression is large, but with things like aid + 1 * bid = 1, the
>>>> transformation seems to lose at least with larger numbers of items. So
>>>> it's not JUST the number of OR elements but also what they contain,
>>>> unless I'm misunderstanding his point.
>>> Alexander said "Generally, I don't see why ANY could be executed
>>> slower than the equivalent OR clause". I understood that this was his
>>> way of expressing the following idea:
>>>
>>> "In principle, there is no reason to expect execution of ANY() to be
>>> slower than execution of an equivalent OR clause (except for
>>> noise-level differences). While it might not actually look that way
>>> for every single type of plan you can imagine right now, that doesn't
>>> argue for making a cost-based decision. It actually argues for fixing
>>> the underlying issue, which can't possibly be due to some kind of
>>> fundamental advantage enjoyed by expression evaluation with ORs".
>>>
>>> This is also what I think of all this.
>> I agree with that, with some caveats, mainly that the reverse is to
>> some extent also true. Maybe not completely, because arguably the
>> ANY() formulation should just be straight-up easier to deal with, but
>> in principle, the two are equivalent and it shouldn't matter which
>> representation we pick.
>>
>> But practically, it may, and we need to be sure that we don't put in
>> place a translation that is theoretically a win but in practice leads
>> to large regressions. Avoiding regressions here is more important than
>> capturing all the possible gains. A patch that wins in some scenarios
>> and does nothing in others can be committed; a patch that wins in even
>> more scenarios but causes serious regressions in some cases probably
>> can't.
> +1
> Sure, I've identified two cases where patch shows regression [1]. The
> first one (quadratic complexity of expression processing) should be
> already addressed by usage of hash. The second one (planning
> regression with Bitmap OR) is not yet addressed.
>
> Links
> 1. https://www.postgresql.org/message-id/CAPpHfduJtO0s9E%3DSHUTzrCD88BH0eik0UNog1_q3XBF2wLmH6g%40mail.gmail.com
>

I also support this approach. I have almost finished a patch that fixes
the first problem, the quadratic complexity of expression processing, by
adding a hash table. I also added a check: if the number of groups equals
the number of OR expressions, we assume that no expressions need to be
converted and stop further processing.

Now I am trying to fix the last remaining problem in this patch: three
tests point to a problem with incorrect conversion. I don't think it is
serious, but I haven't figured out where the mistake is yet. The log
shows an error like this:

ERROR: unrecognized node type: 0

-- 
Regards,
Alena Rybakina
Postgres Professional
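
The hash-table grouping and the early-exit check described in the message can be illustrated with a minimal sketch. This is not the patch's code (the patch is written in C inside the PostgreSQL planner); the clause model, the function name transform_or_clauses, and the output strings are hypothetical and exist only for illustration. It assumes every OR argument has the form "expr = const".

```python
from typing import Dict, Hashable, List, Optional, Tuple

# Hypothetical clause model: (expression text, constant), meaning "expr = const".
Clause = Tuple[str, Hashable]


def transform_or_clauses(clauses: List[Clause]) -> Optional[List[str]]:
    """Group OR arguments by their left-hand expression using a hash table.

    Returns the rewritten OR arguments, or None when every clause ends up
    in its own group, i.e. there is nothing to convert.
    """
    # One dict lookup per clause instead of comparing each clause with
    # every other clause, which is what makes the naive approach quadratic.
    groups: Dict[str, List[Hashable]] = {}
    for expr, const in clauses:
        groups.setdefault(expr, []).append(const)

    # Early exit: as many groups as OR arguments means no two clauses
    # share an expression, so the original OR list is left untouched.
    if len(groups) == len(clauses):
        return None

    rewritten: List[str] = []
    for expr, consts in groups.items():
        if len(consts) == 1:
            # A group of one stays a plain equality.
            rewritten.append(f"{expr} = {consts[0]!r}")
        else:
            values = ", ".join(repr(c) for c in consts)
            rewritten.append(f"{expr} = ANY (ARRAY[{values}])")
    return rewritten


if __name__ == "__main__":
    # aid = 1 OR aid = 2 OR bid = 7
    print(transform_or_clauses([("aid", 1), ("aid", 2), ("bid", 7)]))
    # aid = 1 OR bid = 2: nothing to group
    print(transform_or_clauses([("aid", 1), ("bid", 2)]))
```

In this sketch the group count is compared with the clause count only after a single pass over the OR arguments, so the "nothing to convert" case costs one linear scan and no rewriting.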