Re: Combining Aggregates

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: Combining Aggregates
Дата
Msg-id 46e012ac-424a-c90f-8d0b-86097a79fef1@2ndquadrant.com
обсуждение исходный текст
Ответ на Re: Combining Aggregates  (David Rowley <david.rowley@2ndquadrant.com>)
Ответы Re: Combining Aggregates  (David Rowley <david.rowley@2ndquadrant.com>)
Список pgsql-hackers
Hi,

On 03/14/2016 05:45 AM, David Rowley wrote:
> On 14 March 2016 at 15:20, David Rowley <david.rowley@2ndquadrant.com> wrote:
>> Current patch:
>> I've now updated the patch to base it on top of the parallel aggregate
>> patch in [2]. To apply the attached, you must apply [2] first!
>
> ...
>> [2] http://www.postgresql.org/message-id/CAKJS1f9Tr-+9aKWZ1XuHUnEJZ7GKKfo45b+4fCNj8DkrDZYK4g@mail.gmail.com
>
> The attached fixes a small thinko in apply_partialaggref_nodes().

After looking at the parallel aggregate patch, I also looked at this 
one, as it's naturally related. Sadly I haven't found any issue that I 
could nag about ;-) The patch seems well baked, as it was in the oven 
for quite a long time already.

The one concern I do have is that it only adds (de)serialize functions 
for SUM(numeric) and AVG(numeric). I think it's reasonable not to 
include that into the patch, but it will look a bit silly if that's all 
that gets into 9.6.

It's true plenty of aggregates is supported out of the box, as they use 
fixed-length data types for the aggregate state. But imagine users happy 
that their aggregate queries get parallelized, only to find out that 
sum(int8) disables that :-/

So I think we should try to support as many of the aggregates using 
internal as possible - it seems quite straight-forward to do that at 
least for the aggregates using the same internal representation (avg, 
sum, var_samp, var_pop, variance, ...). That's 26 aggregates out of the 
45 with aggtranstype=INTERNAL.

For the remaining aggregates (jsonb_*. json_*, string_agg, array_agg, 
percentile_) it's probably more complicated to serialize the state, and 
maybe even impossible to do that correctly (e.g. string_agg accumulates 
the strings in the order as received, but partial paths would build 
private lists - not sure if this can actually happen in practice).

I think it would be good to actually document which aggregates support 
parallel execution, and which don't. Currently the docs don't mention 
this at all, and it's tricky for regular users to deduce that as it's 
related to the type of the internal state and not to the input types. An 
example of that is the SUM(bigint) example mentioned above.

That's actually the one thing I think the patch is missing.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Steele
Дата:
Сообщение: Re: Password identifiers, protocol aging and SCRAM protocol
Следующее
От: Robert Haas
Дата:
Сообщение: Re: IF (NOT) EXISTS in psql-completion