Re: Parallelized polymorphic aggs, and aggtype vs aggoutputtype

Поиск

Список

Период

Сортировка

От	Robert Haas
Тема	Re: Parallelized polymorphic aggs, and aggtype vs aggoutputtype
Дата	17 июня 2016 г. 02:03:40
Msg-id	CA+TgmoaAjwaCRBe8w2UbpZD45uQ+TYwbLW7Dzf-e=VAq4m8q2w@mail.gmail.com обсуждение исходный текст
Ответ на	Re: Parallelized polymorphic aggs, and aggtype vs aggoutputtype (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы	Re: Parallelized polymorphic aggs, and aggtype vs aggoutputtype
Список	pgsql-hackers

Дерево обсуждения

On Thu, Jun 16, 2016 at 6:40 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Meh.  I think this is probably telling us that trying to repurpose Aggref
> as the representation of a partial aggregate may have been mistaken.
> Or maybe just that fix_combine_agg_expr was a bad idea.  It seems likely
> to me that that could have been dispensed with altogether in return for
> slightly more work in create_grouping_paths, for instance transforming
> the upper-level Aggrefs into something that looks more like
>         Aggref(PartialAggref(original-arguments))
> whereupon the pre-existing match logic should be sufficient to replace
> the "PartialAggref(original-arguments)" subtree with a suitable Var
> in the upper aggregation plan node.

I don't mind if you feel the need to refactor this.  In David's
original patch, he had a PartialAggref node that basically acted as a
wrapper for an Aggref node, effectively behaving like a flag on the
underlying Aggref.  I thought that was ugly and argued for including
the required information in the Aggref itself.  Now what you are
proposing here is a bit different and it may work better, but there
are tricky things like how it all deparses in the EXPLAIN output - you
may recall that I fixed a problem in that area a while back.  We're
trying to represent a concept that doesn't exist at the SQL level, and
that makes things a bit complicated: if the Aggref's input is a
PartialAggref, will the deparse code be able to match that up to the
correct catalog entry?  But if you've got a good idea, have at it.

>> aggtype is one of the fields
>> that gets compared as part of that process, and it won't be the same
>> if you make aggtype mean the result of that particular node rather
>> than the result of the aggregate.  nodeAgg.c also uses aggtype for
>> matching purposes; not sure if there is a similar problem there or
>> not.
>
> AFAICS, nodeAgg.c's check on that is not logically necessary: if you have
> identical inputs and the same aggregate function then you should certainly
> expect the same output type.  The only reason we're making it is that the
> code originally used equal() to detect identical aggregates, and somebody
> slavishly copied all the comparisons when splitting up that test so as to
> distinguish "identical inputs" from "identical aggregates".  I'll reserve
> judgement on whether search_indexed_tlist_for_partial_aggref has any idea
> what it's doing in this regard.

Sure, it's all cargo-cult programming.  I don't want to presume to
read David's mind, but I think we were both thinking that minimizing
the amount of tinkering that we did would cut down on the likelihood
of bugs.  Now, you've found a bug that got through, so we can either
double down on the design we've already picked, or we can change it.
I'm not set on one approach or the other; I thought what we ended up
was fairly reasonable given the bloated mess that is struct Aggref,
but I've got no deep attachment to it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Parallelized polymorphic aggs, and aggtype vs aggoutputtype