Re: Terrible plan for join to nested union
От | Nate Allan |
---|---|
Тема | Re: Terrible plan for join to nested union |
Дата | |
Msg-id | 9B2D6747F4AB8A47BE45216B06DEDAF92ABE6739@PREXMB01.myfamily.int обсуждение исходный текст |
Ответ на | Re: Terrible plan for join to nested union (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Terrible plan for join to nested union
|
Список | pgsql-performance |
>>Nate Allan <nallan@ancestry.com> writes: >> It seems to me that the join condition (and hence the restriction) should be pushed down into both sides of the unionto bring the cardinality limit from millions to 1. I'm imagining a rewrite like this: >> R(a) J (b U c) -> (b J R(a)) U (c J R(a)) ...where R = Restrict, J >> = Join, U = Union >[ eyes that suspiciously ... ] I'm not convinced that such a transformation is either correct in general (you seem to beassuming at least that A's join column is unique, and >what is the UNION operator supposed to do with A's other columns?)or likely to lead to a performance improvement in general. If there are more columns, you are correct that you might have to project off any additional columns within the union, andleave the join outside of the union intact to bring in the extra columns. Those are essentially the same considerationsas when making other rewrites though. As for this optimization making unions faster in general, I would arguethat it is rather easy to produce a plan superior to complete materialization of the union. >We possibly could push down a join condition on the inner side of a nestloop, similarly to what's done in the UNION ALLcase ... but that would require a complete >refactoring of what the planner does with UNIONs. By and large, very littleoptimization effort has been put into non-ALL UNION (or INTERSECT or EXCEPT). You should >not expect that to changeon a time scale of less than years. I hate to come across as contrary, but I'm pretty shocked by this answer for a couple reasons: 1) This is a clear-cut case of an untenable execution plan, essentially a bug in the planner. This response contradictsthe widely broadcast assertion that the PG community fixes planner bugs quickly and will not introduce hints becausethey would rather address these kinds of issues "correctly". 2) Why would more effort go into Union All rather than Union? Are people using Union All more than Union, and if so is thisbecause they actually want duplicates or is it because they've been trained to due to the performance problems with Union? Union All, in many people's opinions, shouldn't even exist in a true relational sense. Again, sorry if I'm coming off as abrasive, I've spent political capital pushing to get PG in on this project, and now I'ma little worried about whether it is going to work for this kind of scale and complexity, so I'm a little stressed. Ido appreciate your responses. Best, -Nate
В списке pgsql-performance по дате отправления: