Re: WIP: Upper planner pathification

From: Tom Lane
Subject: Re: WIP: Upper planner pathification
Date:
Msg-id: 10131.1456844527@sss.pgh.pa.us
In reply to: Re: WIP: Upper planner pathification  (Greg Stark <stark@mit.edu>)
Responses: Re: WIP: Upper planner pathification  (Greg Stark <stark@mit.edu>)
List: pgsql-hackers
Greg Stark <stark@mit.edu> writes:
> On Tue, Mar 1, 2016 at 2:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> There are a couple of
>> regression test cases that change plans for the better, but it's sort of
>> accidental.  Those cases look like
>> 
>> select d.* from d left join (select * from b group by b.id, b.c_id) s
>> on d.a = s.id;
>> 
>> and what happens in HEAD is that the subquery chooses a hashagg plan
>> and then the upper query decides a mergejoin would be a good idea ...
>> so it has to sort the output of the hashagg.  With the patch, what
>> comes back from the subquery is a Path for the hashagg and a Path
>> for doing the GROUP BY with Sort/Uniq.  The second path is more expensive,
>> but it survives the add_path tournament because it can produce sorted
>> output.  Then the outer level discovers that it can use that to do its
>> mergejoin without a separate sort step, and that way is cheaper overall.

> This doesn't sound accidental at all. It sounds like a perfect example
> of exactly the benefits of this approach.

Well, my point is that no such path would have been generated if the
subquery hadn't had an internal reason to consider sorting on b.id.
The "accidental" part of this is that the subquery's GROUP BY key
matches what the outer query needs as a mergejoin key.
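
To make the add_path behavior under discussion concrete, here is a toy
model in Python.  It is only a sketch under stated assumptions, not the
planner's actual code: the Path class, the cost numbers, and the
cheapest_sorted_input() helper are all hypothetical, and the real add_path()
weighs more than this (startup cost and parameterization, among other
things).  The point it illustrates is that a costlier path is kept when it
offers a sort order the cheaper one lacks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    name: str
    cost: float            # made-up total cost, for illustration only
    pathkeys: tuple = ()   # output sort order; () means unsorted


def keys_at_least_as_good(a, b):
    """True if a's ordering provides everything b's does (b's keys prefix a's)."""
    return a.pathkeys[:len(b.pathkeys)] == b.pathkeys


def dominates(a, b):
    """a beats b if it is no more expensive and no less well ordered."""
    return a.cost <= b.cost and keys_at_least_as_good(a, b)


def add_path(pathlist, new):
    """Keep 'new' unless an existing path dominates it; drop paths it dominates."""
    if any(dominates(old, new) for old in pathlist):
        return pathlist
    return [old for old in pathlist if not dominates(new, old)] + [new]


def cheapest_sorted_input(pathlist, required, sort_cost):
    """Pick the cheapest way to get input sorted by 'required' (hypothetical helper)."""
    def total(p):
        already_sorted = p.pathkeys[:len(required)] == required
        return p.cost + (0.0 if already_sorted else sort_cost)
    return min(pathlist, key=total)


# The subquery offers both ways of doing the GROUP BY.
paths = []
for p in (Path("HashAggregate", 100.0),
          Path("Sort + Unique", 140.0, ("b.id", "b.c_id"))):
    paths = add_path(paths, p)

# The outer mergejoin on d.a = s.id wants its input sorted by b.id.
best = cheapest_sorted_input(paths, ("b.id",), sort_cost=60.0)
```

With these invented numbers both paths survive the tournament, and the
outer level prefers the presorted one because sorting the hashagg output
would cost more than the difference between the two.  Note that the sorted
path exists only because the GROUP BY itself had a use for that ordering,
which is the "accidental" part.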


> (Actually the first hunk in the patch kind of surprised me. Do we dump
> node trees with -> notation currently? I thought they normally all
> looked like s-expressions.)

I chose in 19a541143 to not make PathTarget be a subclass of Node,
so that's kind of forced --- we can't print it by recursing to
_outNode().  We could change that but I'm not sure it would be an
improvement.  The reltarget fields are embedded in RelOptInfo, not
sub-nodes of it, so pretending that they're independent nodes seems
a bit phony in its own way.  I'm not wedded to that reasoning though;
if people are more concerned about what pprint() output looks like,
we can change it.  Or we could make reltarget actually be a subnode,
at the cost of one more palloc per RelOptInfo.

        regards, tom lane


