Re: Parallel Aggregate

Поиск
Список
Период
Сортировка
От Amit Kapila
Тема Re: Parallel Aggregate
Дата
Msg-id CAA4eK1JOs02Z8+KuuXUECdye2Keae7ofZHHdT=o6RMTEi58Smw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Parallel Aggregate  (David Rowley <david.rowley@2ndquadrant.com>)
Ответы Re: Parallel Aggregate  (David Rowley <david.rowley@2ndquadrant.com>)
Список pgsql-hackers
On Thu, Mar 17, 2016 at 10:35 AM, David Rowley <david.rowley@2ndquadrant.com> wrote:
>
> On 17 March 2016 at 01:19, Amit Kapila <amit.kapila16@gmail.com> wrote:
> > Few assorted comments:
> >
> > 2.
> >  AggPath *
> >  create_agg_path(PlannerInfo *root,
> > @@ -2397,9 +2399,11 @@ create_agg_path(PlannerInfo *root,
> >
> > List *groupClause,
> >   List *qual,
> >   const AggClauseCosts
> > *aggcosts,
> > - double numGroups)
> > + double numGroups,
> > +
> > bool combineStates,
> > + bool finalizeAggs)
> >
> > Don't you need to set parallel_aware flag in this function as we do for
> > create_seqscan_path()?
>
> I don't really know the answer to that... I mean there's nothing
> special done in nodeAgg.c if the node is running in a worker or in the
> main process.
>

On again thinking about it, I think it is okay to set parallel_aware flag as false.  This flag means whether that particular node has any parallelism behaviour which is true for seqscan, but I think not for partial aggregate node.

Few other comments on latest patch:

1.

+ /*
+ * XXX does this need estimated for each partial path, or are they
+ * all 
going to be the same anyway?
+ */
+ dNumPartialGroups = get_number_of_groups(root,
+
clamp_row_est(partial_aggregate_path->rows),
+
rollup_lists,
+
rollup_groupclauses);

For considering partial groups, do we need to rollup related lists?

2.
+ hashaggtablesize = estimate_hashagg_tablesize(partial_aggregate_path,
+
 &agg_costs,
+
 dNumPartialGroups);
+
+ /*
+ * Generate a 
hashagg Path, if we can, but we'll skip this if the hash
+ * table looks like it'll exceed work_mem.
+
*/
+ if (can_hash && hashaggtablesize < work_mem * 1024L)


hash table size should be estimated only if can_hash is true.

3.
+ foreach(lc, grouped_rel->partial_pathlist)
+ {
+ Path   *path = 
(Path *) lfirst(lc);
+ double total_groups;
+
+ total_groups = path-
>parallel_degree * path->rows;
+
+ path = (Path *) create_gather_path(root, grouped_rel, path, 
NULL,
+   &total_groups);

Do you need to perform it foreach partial path or just do it for firstpartial path?



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Re: Parallel Aggregate
Следующее
От: Tomas Vondra
Дата:
Сообщение: Re: Proposal: Generic WAL logical messages