Re: [PATCH] Equivalence Class Filters
From | Tom Lane |
---|---|
Subject | Re: [PATCH] Equivalence Class Filters |
Date | |
Msg-id | 30810.1449335261@sss.pgh.pa.us |
In reply to | [PATCH] Equivalence Class Filters (David Rowley <david.rowley@2ndquadrant.com>) |
Responses | Re: [PATCH] Equivalence Class Filters |
List | pgsql-hackers |
David Rowley <david.rowley@2ndquadrant.com> writes:
> As of today these Equivalence Classes only incorporate expressions which
> use the equality operator, but what if the above query had been written as:

> select * from t1 inner join t2 on t1.id = t2.id where t1.id <= 10;

> Should we not be able to assume that t2.id is also <= 10?

This sort of thing has been suggested multiple times before, and I've rejected it every time on the grounds that it would likely add far more planner cycles than it'd be worth, eg, time would be spent on searches for matching subexpressions whether or not anything was learned (and often nothing would be learned). While I've only read your description of the patch not the patch itself, the search methodology you propose seems pretty brute-force and unlikely to solve that issue. It's particularly important to avoid O(N^2) behaviors when there are N expressions ...

Another issue that would need consideration is how to avoid skewing planner selectivity estimates with derived clauses that are fully redundant with other clauses. The existing EC machinery is mostly able to dodge that problem by generating just a minimal set of equality clauses from an EC, but I don't see how we'd make that work here.

I'm also wondering why you want to limit it to comparisons to constants; that seems rather arbitrary.

Lastly, in most cases knowing that t2.id <= 10 is just not worth all that much; it's certainly far less useful than an equality condition. It's not difficult to imagine that this would often be a net drag on runtime performance (never mind planner performance) by doing nothing except creating additional filter conditions the executor has to check. Certainly it would be valuable to know this if it let us exclude some partition of a table, but that's only going to happen in a small minority of queries.

I'm not necessarily opposed to doing anything in this area, but so far I've not seen how to do it in a way that is attractive when planner complexity, performance, and end results are all considered.

			regards, tom lane
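
For readers who want to see the derived clause from the quoted example in concrete form, here is a minimal sketch. It is not taken from the patch or from PostgreSQL: it uses Python's built-in sqlite3 module as a stand-in database, and the table contents and the helper name `run_both` are made up for illustration. It only checks that adding `t2.id <= 10` by hand does not change the result of the join. It says nothing about planner cost, selectivity estimates, or executor overhead, which are the concerns raised above.

```python
import sqlite3


def run_both(n_rows=50, bound=10):
    """Run the quoted query with and without the hand-written derived clause.

    Hypothetical helper: table contents are invented, and SQLite stands in
    for PostgreSQL purely so the example is self-contained.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t1 (id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE t2 (id INTEGER PRIMARY KEY)")
    rows = [(i,) for i in range(1, n_rows + 1)]
    conn.executemany("INSERT INTO t1 (id) VALUES (?)", rows)
    conn.executemany("INSERT INTO t2 (id) VALUES (?)", rows)

    # The query as written in the quoted message.
    original = conn.execute(
        "SELECT t1.id, t2.id FROM t1 INNER JOIN t2 ON t1.id = t2.id "
        "WHERE t1.id <= ? ORDER BY t1.id",
        (bound,),
    ).fetchall()

    # Same query plus the clause implied by t1.id = t2.id and t1.id <= bound.
    with_derived = conn.execute(
        "SELECT t1.id, t2.id FROM t1 INNER JOIN t2 ON t1.id = t2.id "
        "WHERE t1.id <= ? AND t2.id <= ? ORDER BY t1.id",
        (bound, bound),
    ).fetchall()

    conn.close()
    return original, with_derived


original, with_derived = run_both()
print(original == with_derived)  # True: the extra clause filters nothing new
print(len(original))             # 10
```

The extra clause is redundant for the result, which is exactly why it can skew a selectivity estimate if it is counted as an independent restriction, and why it costs the executor an additional check per row unless it enables something like skipping a partition.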