Re: Parallel Aggregate
From | Robert Haas |
---|---|
Subject | Re: Parallel Aggregate |
Date | |
Msg-id | CA+TgmoaYJvYrnDjnFaaVHuXh5BYTPwnP-5jiGsqCtKXK8nrAfw@mail.gmail.com |
In reply to | Re: Parallel Aggregate (David Rowley <david.rowley@2ndquadrant.com>) |
Responses | Re: Parallel Aggregate (David Rowley <david.rowley@2ndquadrant.com>) |
List | pgsql-hackers |
On Thu, Mar 3, 2016 at 11:00 PM, David Rowley <david.rowley@2ndquadrant.com> wrote:
> On 17 February 2016 at 17:50, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
>> Here I attached a draft patch based on previous discussions. It still needs
>> better comments and optimization.
>
> Over in [1] Tom posted a large change to the grouping planner which
> causes large conflict with the parallel aggregation patch. I've been
> looking over Tom's patch and reading the related thread and I've
> observed 3 things:
>
> 1. Parallel Aggregate will be much easier to write and less code to
> base it up top of Tom's upper planner changes. The latest patch does
> add a bit of cruft (e.g create_gather_plan_from_subplan()) which won't
> be required after Tom pushes the changes to the upper planner.
> 2. If we apply parallel aggregate before Tom's upper planner changes
> go in, then Tom needs to reinvent it again when rebasing his patch.
> This seems senseless, so this is why I did this work.
> 3. Based on the thread, most people are leaning towards getting Tom's
> changes in early to allow a bit more settle time before beta, and
> perhaps also to allow other patches to go in after (e.g this)
>
> So, I've done a bit of work and I've rewritten the parallel aggregate
> code to base it on top of Tom's patch posted in [1].

Great!

> 3. The code never attempts to mix and match Grouping Agg and Hash Agg
> plans. e.g it could be an idea to perform Partial Hash Aggregate ->
> Gather -> Sort -> Finalize Group Aggregate, or hash as in the Finalize
> stage. I just thought doing this is more complex than what's really
> needed, but if someone can think of a case where this would be a great
> win then I'll listen, but you have to remember we don't have any
> pre-sorted partial paths at this stage, so an explicit sort is
> required *always*. This might change if someone invented partial btree
> index scans... but until then...

Actually, Rahila Syed is working on that. But it's not done yet, so
presumably will not go into 9.6.

I don't really see the logic of this, though. Currently, Gather
destroys the input ordering, so it seems preferable for the
finalize-aggregates stage to use a hash aggregate whenever possible,
whatever the partial-aggregate stage did. Otherwise, we need an
explicit sort. Anyway, it seems like the two stages should be costed
and decided on their own merits - there's no reason to chain the two
decisions together.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
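
To make the "decide each stage on its own merits" argument concrete, here is a toy sketch. It is not PostgreSQL's planner or cost model: the names (AggStrategy, stage_cost, choose_stages), the per-row constants, and the n*log2(n) sort cost are all invented for illustration. The only things taken from the discussion above are that no pre-sorted partial paths exist, that Gather destroys ordering, and that a group aggregate over unsorted input needs an explicit sort.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AggStrategy:
    name: str
    needs_sorted_input: bool
    per_row_cost: float  # made-up constant, not a real planner parameter


# Hypothetical strategies with toy per-row costs.
HASH = AggStrategy("hash", needs_sorted_input=False, per_row_cost=1.5)
GROUP = AggStrategy("group", needs_sorted_input=True, per_row_cost=1.0)


def sort_cost(rows):
    """Toy n*log2(n) cost of an explicit sort."""
    return rows * math.log2(max(rows, 2))


def stage_cost(strategy, rows, input_sorted):
    """Cost of one aggregate stage, adding a sort if the strategy needs one."""
    cost = rows * strategy.per_row_cost
    if strategy.needs_sorted_input and not input_sorted:
        cost += sort_cost(rows)
    return cost


def cheapest(rows, input_sorted, hash_allowed=True):
    """Pick the cheapest strategy for a single stage, looking only at that stage."""
    candidates = [GROUP] + ([HASH] if hash_allowed else [])
    return min(candidates, key=lambda s: stage_cost(s, rows, input_sorted))


def choose_stages(rows_per_worker, rows_after_gather, hash_allowed=True):
    """Choose the partial and finalize strategies independently."""
    # No pre-sorted partial paths exist, so the partial stage sees unsorted input.
    partial = cheapest(rows_per_worker, input_sorted=False, hash_allowed=hash_allowed)
    # Gather destroys ordering, so the finalize stage sees unsorted input
    # regardless of what the partial stage chose.
    final = cheapest(rows_after_gather, input_sorted=False, hash_allowed=hash_allowed)
    return partial.name, final.name
```

With these toy numbers, choose_stages(1_000_000, 400) picks hash for both stages, because neither stage has sorted input and the explicit sort dominates the group-aggregate cost. Group aggregation wins only when its input is already sorted or when hashing is ruled out (hash_allowed=False), and neither stage's choice depends on the other's.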