Properly pathify the union planner

Поиск
Список
Период
Сортировка
От David Rowley
Тема Properly pathify the union planner
Дата
Msg-id CAApHDvpb_63XQodmxKUF8vb9M7CxyUyT4sWvEgqeQU-GB7QFoQ@mail.gmail.com
обсуждение исходный текст
Ответы Re: Properly pathify the union planner  (David Rowley <dgrowleyml@gmail.com>)
Список pgsql-hackers
The upper planner was pathified many years ago now.  That was a large
chunk of work and because of that, the union planner was not properly
pathified in that effort.  A small note was left in
recurse_set_operations() to mention about future work.

You can see this lack of pathification in make_union_unique() and
choose_hashed_setop(). There are heuristics in there to decide the
method to use instead of creating paths and letting add_path() decide
what's faster.

I've been working on improving this for the past few weeks and I'm not
quite as far along as I'd hoped, but what I have is perhaps worthy of
sharing.  For now, I've only improved UNIONs.

A UNION plan can now look like:

# explain (costs off) select * from a union select * from a;
                    QUERY PLAN
---------------------------------------------------
 Unique
   ->  Merge Append
         Sort Key: a.a
         ->  Index Only Scan using a_pkey on a
         ->  Index Only Scan using a_pkey on a a_1

Previously we'd have only considered Append -> Hash Aggregate or via
Append -> Sort -> Unique

To make this work, the query_planner() needs to know about setops, so
I've passed those down via the standard_qp_extra struct so that we can
choose pathkeys for the setops.

One part that still needs work is the EquivalanceClass building.
Because we only build the final targetlist for the Append after
planning all the append child queries, I ended up having to populate
the EquivalanceClasses backwards, i.e children first. add_eq_member()
determines if you're passing a child member by checking if parent !=
NULL.  Since I don't have a parent EquivalenceMember to pass,
em_is_child gets set wrongly, and that causes problems because
ec_has_const can get set to true when it shouldn't. This is a problem
as it can make a PathKey redundant when it isn't.  I wonder if I'll
need to change the signature of add_eq_member() and add an "is_child"
bool to force the EM to be a child em... Needs more thought...

I've not worked on the creation of Incremental Sort paths yet, or done
any path plumbing work to have UNION consider Gather Merge -> Unique
on already sorted paths.  I think to make similar improvements to
EXCEPT and INTERSECT we'd need a node executor node. Perhaps
nodeMergeAppendSetops.c which can be configured to do EXCEPT or
INTERSECT.  It could also perhaps handle UNION too then we can use
that instead of a Merge Append -> Unique.  That might save doing some
slot copying and improve performance further. I'm not planning on
doing that for the first stage. I only intend to improve UNION for
that and we have all the executor nodes to make that work already.

Anyway, I've attached my WIP patch for this.

David

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bruce Momjian
Дата:
Сообщение: Re: document deviation from standard on REVOKE ROLE
Следующее
От: Xing Guo
Дата:
Сообщение: Re: Don't pass NULL pointer to strcmp().