Re: Parallel Aggregate
From | Robert Haas |
---|---|
Subject | Re: Parallel Aggregate |
Date | |
Msg-id | CA+TgmoY=yiy-VXtiGDpV70dp3vwtAMnhkm1BSqisqJB5+gBm-Q@mail.gmail.com |
In response to | Parallel Aggregate (Haribabu Kommi <kommi.haribabu@gmail.com>) |
Responses |
Re: Parallel Aggregate
Re: Parallel Aggregate |
List | pgsql-hackers |
On Sun, Oct 11, 2015 at 10:07 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> Parallel aggregate is the feature doing the aggregation job parallel
> with the help of Gather and
> partial seq scan nodes. The following is the basic overview of the
> parallel aggregate changes.
>
> Decision phase:
>
> Based on the following conditions, the parallel aggregate plan is generated.
>
> - check whether the below plan node is Gather + partial seq scan only.
>
> This is because to check whether the plan nodes that are present are
> aware of parallelism or not?

This is really not the right way of doing this. We should do something more general. Most likely, parallel aggregate should wait for Tom's work refactoring the upper planner to use paths. But either way, it's not a good idea to limit ourselves to parallel aggregation only in the case where there is exactly one base table.

One of the things I want to do pretty early on, perhaps in time for 9.6, is create a general notion of partial paths. A Partial Seq Scan node creates a partial path. A Gather node turns a partial path into a complete path. A join between a partial path and a complete path creates a new partial path. This concept lets us consider, essentially, pushing joins below Gather nodes. That's quite powerful and could make Partial Seq Scan applicable to a much broader variety of use cases. If there are worthwhile partial paths for the final joinrel, and aggregation of that joinrel is needed, we can consider parallel aggregation using that partial path as an alternative to sticking a Gather node on there and then aggregating.

> - Set the single_copy mode as true, in case if the below node of
> Gather is a parallel aggregate.

That sounds wrong. Single-copy mode is for when we need to be certain of running exactly one copy of the plan. If you're trying to have several workers aggregate in parallel, that's exactly what you don't want.

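The partial-path rules described above can be sketched as a toy model. This is only an illustration in Python, not planner code: the `Path` class and the helper names `partial_seq_scan`, `seq_scan`, `gather`, and `join` are hypothetical, and the sketch models just the three rules stated in the message. Rejecting a join against a partial inner side is an assumption of the sketch, since the message does not discuss that case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """Toy stand-in for a planner path (hypothetical, not a PostgreSQL struct)."""
    desc: str
    partial: bool  # True: each worker produces only a share of the rows


def partial_seq_scan(table: str) -> Path:
    # Rule 1: a Partial Seq Scan creates a partial path.
    return Path(f"PartialSeqScan({table})", partial=True)


def seq_scan(table: str) -> Path:
    # An ordinary scan is a complete path.
    return Path(f"SeqScan({table})", partial=False)


def gather(child: Path) -> Path:
    # Rule 2: Gather turns a partial path into a complete path.
    if not child.partial:
        raise ValueError("Gather expects a partial path")
    return Path(f"Gather({child.desc})", partial=False)


def join(outer: Path, inner: Path) -> Path:
    # Rule 3: partial outer + complete inner gives a new partial path.
    # Assumption of this sketch: a partial inner side is not modeled.
    if inner.partial:
        raise ValueError("partial inner side is outside this sketch")
    return Path(f"Join({outer.desc}, {inner.desc})", partial=outer.partial)


# The join sits below Gather, which is the "pushing joins below Gather" idea.
plan = gather(join(partial_seq_scan("a"), seq_scan("b")))
print(plan.desc)
```

Under these assumptions, the join result stays partial until a Gather is placed on top, so anything that can consume a partial path (such as a partial aggregate) may be placed between the join and the Gather.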

Also, I think the path for parallel aggregation should probably be something like FinalizeAgg -> Gather -> PartialAgg -> some partial path here. I'm not clear whether that is what you are thinking or not.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
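
To illustrate the FinalizeAgg -> Gather -> PartialAgg shape suggested in the message, here is a minimal Python sketch of a two-phase average. It is not PostgreSQL code: the function names, the `(sum, count)` transition state, and the round-robin split standing in for "some partial path" are all assumptions made for the example. Each simulated worker runs its own copy of the partial aggregate over its share of the rows, which is why single-copy mode would defeat the purpose.

```python
from typing import Iterable, List, Optional, Sequence, Tuple

State = Tuple[float, int]  # hypothetical transition state for avg: (sum, count)


def partial_agg(rows: Iterable[float]) -> State:
    """PartialAgg: one worker folds its share of rows into a state."""
    total, count = 0.0, 0
    for value in rows:
        total += value
        count += 1
    return (total, count)


def gather(states: Iterable[State]) -> List[State]:
    """Gather: collect one partial state per worker into the leader."""
    return list(states)


def finalize_agg(states: Sequence[State]) -> Optional[float]:
    """FinalizeAgg: combine the partial states and produce the final value."""
    total = sum(s for s, _ in states)
    count = sum(c for _, c in states)
    return total / count if count else None


def parallel_avg(rows: Sequence[float], n_workers: int = 3) -> Optional[float]:
    # Round-robin slices stand in for the partial path feeding each worker.
    shares = [rows[i::n_workers] for i in range(n_workers)]
    return finalize_agg(gather(partial_agg(share) for share in shares))


print(parallel_avg([1, 2, 3, 4, 5, 6]))
```

The point of the split is that only one small state per worker crosses the Gather boundary, rather than every input row.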