Re: Removing unneeded self joins
От | Alexander Kuzmenkov |
---|---|
Тема | Re: Removing unneeded self joins |
Дата | |
Msg-id | 31c2ee33-5db1-a43c-b964-8e2624020ae9@postgrespro.ru обсуждение исходный текст |
Ответ на | Re: Removing unneeded self joins (David Rowley <david.rowley@2ndquadrant.com>) |
Ответы |
Re: Removing unneeded self joins
|
Список | pgsql-hackers |
On 3/21/19 01:54, David Rowley wrote: > I really just don't think checking the unique indexes match is going > to cut it. You should be looking for a unique index where the join > clauses match on either side of the join, not looking independently > and then checking the indexes are the same ones. The bug you mention later is an implementation bug that can be fixed (I will expand on that below). Besides this, do you think current self-join detection algorithm has fundamental correctness problems? I am not aware of such problems and this algorithm reflects my understanding of what constitutes a removable self-join, so it would be helpful if you could explain what exactly is wrong with it. Your alternative idea sounds plausible, but I won't know it's correct for sure until I implement it, which is a nontrivial amount of work. I am also concerned that we will have to redo calculations similar to innerrel_is_unique(), because having near-zero overhead is a hard prerequisite for this patch, as was discussed upthread. In short, I am reluctant to implement a new approach to detection until I understand why the current one is fundamentally broken. > Here's an example of what can go wrong with your current code: This is a bug indeed. Unique index search is not exhaustive, so if many indexes match the join quals, we might not find the same index for both sides. I think this can be overcome if we switch to exhaustive search in relation_has_unique_index_for, and then try to find matching indexes in is_unique_self_join. Please see the new patch for the fix. > I also think this way would give you the subquery GROUP BY / DISTINCT > self join removal for just about free. Could you explain how exactly we can generalize join removal to the DISTINCT/GROUP BY case? I understand the removable self-joins as having the same row on both sides of join, as identified by ctid, but I'm not sure how to apply this to subqueries. Your earlier DISTINCT example looked like it needed a different kind of join removal, with different conditions for when it applies (please see my previous letter for details). Another thing I'd like to mention is that this patch splits in two independent parts, the detection of self-joins and their removal. While we are looking for agreement on the detection part, could you also review the removal part? I'm sure it has its own share of problems. -- Alexander Kuzmenkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Вложения
В списке pgsql-hackers по дате отправления: