Обсуждение: Missing Group Key in grouped aggregate

Поиск
Список
Период
Сортировка

Missing Group Key in grouped aggregate

От
Erik Nordström
Дата:
Hello,

I noticed that, beginning with PG16, grouped aggregates are missing the "Group Key" in the EXPLAIN output.

It seems the Agg node has numCols (number of grouping cols) set to zero in queries like

SELECT foo, count(*) FROM bar WHERE foo=1 GROUP BY foo;

In PG15, the "Group Key" is shown and the Agg node has numCols set as expected.

Is this intentional or a bug?

Best regards,

Erik

--
Database Architect, Timescale

Re: Missing Group Key in grouped aggregate

От
Tom Lane
Дата:
=?UTF-8?Q?Erik_Nordstr=C3=B6m?= <erik@timescale.com> writes:
> I noticed that, beginning with PG16, grouped aggregates are missing the
> "Group Key" in the EXPLAIN output.

> It seems the Agg node has numCols (number of grouping cols) set to zero in
> queries like

> SELECT foo, count(*) FROM bar WHERE foo=1 GROUP BY foo;

> In PG15, the "Group Key" is shown and the Agg node has numCols set as
> expected.

Looks sane to me: the planner now notices that there can only
be one group so it doesn't tell the GroupAgg node to worry about
making groups.  If it were missing in a case where there could be
multiple output groups, yes that'd be a bug.

If you want to run it to ground you could bisect to see where the
behavior changed, but you'd probably just find it was intentional.

            regards, tom lane



Re: Missing Group Key in grouped aggregate

От
Tomas Vondra
Дата:
On 2/20/24 16:53, Tom Lane wrote:
> =?UTF-8?Q?Erik_Nordstr=C3=B6m?= <erik@timescale.com> writes:
>> I noticed that, beginning with PG16, grouped aggregates are missing the
>> "Group Key" in the EXPLAIN output.
> 
>> It seems the Agg node has numCols (number of grouping cols) set to zero in
>> queries like
> 
>> SELECT foo, count(*) FROM bar WHERE foo=1 GROUP BY foo;
> 
>> In PG15, the "Group Key" is shown and the Agg node has numCols set as
>> expected.
> 
> Looks sane to me: the planner now notices that there can only
> be one group so it doesn't tell the GroupAgg node to worry about
> making groups.  If it were missing in a case where there could be
> multiple output groups, yes that'd be a bug.
> 
> If you want to run it to ground you could bisect to see where the
> behavior changed, but you'd probably just find it was intentional.
> 

I believe this changed in:

commit 8d83a5d0a2673174dc478e707de1f502935391a5
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date:   Wed Jan 18 12:37:57 2023 -0500

    Remove redundant grouping and DISTINCT columns.

    Avoid explicitly grouping by columns that we know are redundant
    for sorting, for example we need group by only one of x and y in
            SELECT ... WHERE x = y GROUP BY x, y
    This comes up more often than you might think, as shown by the
    changes in the regression tests.  It's nearly free to detect too,
    since we are just piggybacking on the existing logic that detects
    redundant pathkeys.  (In some of the existing plans that change,
    it's visible that a sort step preceding the grouping step already
    didn't bother to sort by the redundant column, making the old plan
    a bit silly-looking.)

    ...

It's not quite obvious from the commit message, but that's where git
bisect says the behavior changed.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company