Обсуждение: [PATCH] GROUP BY ALL

Поиск
Список
Период
Сортировка

[PATCH] GROUP BY ALL

От
David Christensen
Дата:
I see that there'd been some chatter but not a lot of discussion about
a GROUP BY ALL feature/functionality.  There certainly is utility in
such a construct IMHO.

The grammar is unambiguous, so can support this construct in lieu of
the traditional GROUP BY clause.  Enclosed is a patch which adds this
via just scanning the TargetEntry list and adding anything that is not
an aggregate function call to the groupList.

Still need some docs; just throwing this out there and getting some feedback.

Thanks,

David

Вложения

Re: [PATCH] GROUP BY ALL

От
"David G. Johnston"
Дата:
On Mon, Jul 22, 2024 at 1:55 PM David Christensen <david@pgguru.net> wrote:
I see that there'd been some chatter but not a lot of discussion about
a GROUP BY ALL feature/functionality.  There certainly is utility in
such a construct IMHO.

Still need some docs; just throwing this out there and getting some feedback.


I strongly dislike adding this feature.  I'd only consider supporting it if it was part of the SQL standard.

Code is written once and read many times.  This feature caters to the writer, not the reader.  And furthermore usage of this is prone to be to the writer's detriment as well.

David J.

Re: [PATCH] GROUP BY ALL

От
Isaac Morland
Дата:
On Mon, 22 Jul 2024 at 17:34, David G. Johnston <david.g.johnston@gmail.com> wrote:
On Mon, Jul 22, 2024 at 1:55 PM David Christensen <david@pgguru.net> wrote:
I see that there'd been some chatter but not a lot of discussion about
a GROUP BY ALL feature/functionality.  There certainly is utility in
such a construct IMHO.

Still need some docs; just throwing this out there and getting some feedback.


I strongly dislike adding this feature.  I'd only consider supporting it if it was part of the SQL standard.

Code is written once and read many times.  This feature caters to the writer, not the reader.  And furthermore usage of this is prone to be to the writer's detriment as well.

And for when this might be useful, the syntax for it already exists, although a spurious error message is generated:

odyssey=> select (uw_term).*, count(*) from uw_term group by uw_term;
ERROR:  column "uw_term.term_id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select (uw_term).*, count(*) from uw_term group by uw_term;
                ^

I'm not sure exactly what's going on here — it's like it's still seeing the table name in the field list as only a table name and not the value corresponding to the whole table as a row value (But in general I'm not happy with the system's ability to figure out that a column's value has only one possibility given the grouping columns). You can work around:

odyssey=> with t as (select uw_term, count(*) from uw_term group by uw_term) select (uw_term).*, count from t;

This query works.

Re: [PATCH] GROUP BY ALL

От
Tom Lane
Дата:
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Mon, Jul 22, 2024 at 1:55 PM David Christensen <david@pgguru.net> wrote:
>> I see that there'd been some chatter but not a lot of discussion about
>> a GROUP BY ALL feature/functionality.  There certainly is utility in
>> such a construct IMHO.

> I strongly dislike adding this feature.  I'd only consider supporting it if
> it was part of the SQL standard.

Yeah ... my recollection is that we already rejected this idea.
If you want to re-litigate that, "throwing this out there" is
not a sufficient argument.

(Personally, I'd wonder exactly what ALL is quantified over: the
whole output of the FROM clause, or only columns mentioned in the
SELECT tlist, or what? And why that choice rather than another?)

            regards, tom lane



Re: [PATCH] GROUP BY ALL

От
Tom Lane
Дата:
Isaac Morland <isaac.morland@gmail.com> writes:
> And for when this might be useful, the syntax for it already exists,
> although a spurious error message is generated:

> odyssey=> select (uw_term).*, count(*) from uw_term group by uw_term;
> ERROR:  column "uw_term.term_id" must appear in the GROUP BY clause or be
> used in an aggregate function
> LINE 1: select (uw_term).*, count(*) from uw_term group by uw_term;
>                 ^

> I'm not sure exactly what's going on here

The SELECT entry is expanded into "uw_term.col1, uw_term.col2,
uw_term.col3, ...", and those single-column Vars don't match the
whole-row Var appearing in the GROUP BY list.  I guess if we
think this is important, we could add a proof rule saying that
a per-column Var is functionally dependent on a whole-row Var
of the same relation.  Odd that the point hasn't come up before
(though I guess that suggests that few people try this).

            regards, tom lane



Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:
On Mon, Jul 22, 2024 at 4:34 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
>
> On Mon, Jul 22, 2024 at 1:55 PM David Christensen <david@pgguru.net> wrote:
>>
>> I see that there'd been some chatter but not a lot of discussion about
>> a GROUP BY ALL feature/functionality.  There certainly is utility in
>> such a construct IMHO.
>>
>> Still need some docs; just throwing this out there and getting some feedback.
>>
>
> I strongly dislike adding this feature.  I'd only consider supporting it if it was part of the SQL standard.
>
> Code is written once and read many times.  This feature caters to the writer, not the reader.  And furthermore usage
ofthis is prone to be to the writer's detriment as well. 

I'd say this feature (at least for me) caters to the investigator;
someone who is interactively looking at data hence why it would cater
to the writer.  Consider acquainting yourself with a large table that
has a large number of annoying-named fields where you want to look at
how different data is correlated or broken-down.  Something along the
lines of:

SELECT last_name, substring(first_name,1,1) as first_initial,
income_range, count(*) FROM census_data GROUP BY ALL;

If you are iteratively looking at things, adding or removing fields
from your breakdown, you only need to change it in a single place, the
tlist.  Additionally, expressions can be used transparently without
needing to repeat them.  (Yes, in practice, I'd often use GROUP BY 1,
2, say, but if you add more fields to this you need to edit in
multiple places.)

David



Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:
On Mon, Jul 22, 2024 at 5:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> "David G. Johnston" <david.g.johnston@gmail.com> writes:
> > On Mon, Jul 22, 2024 at 1:55 PM David Christensen <david@pgguru.net> wrote:
> >> I see that there'd been some chatter but not a lot of discussion about
> >> a GROUP BY ALL feature/functionality.  There certainly is utility in
> >> such a construct IMHO.
>
> > I strongly dislike adding this feature.  I'd only consider supporting it if
> > it was part of the SQL standard.
>
> Yeah ... my recollection is that we already rejected this idea.
> If you want to re-litigate that, "throwing this out there" is
> not a sufficient argument.

Heh, fair enough.  I actually wrote the patch after encountering the
syntax in DuckDB and it seemed easy enough to add to Postgres while
providing some utility, then ended up seeing a thread about it later.
I did not get the sense that this had been globally rejected; though
there were definitely voices against it, it seemed to trail off rather
than coming to a conclusion.

> (Personally, I'd wonder exactly what ALL is quantified over: the
> whole output of the FROM clause, or only columns mentioned in the
> SELECT tlist, or what? And why that choice rather than another?)

My intention here was to basically be a shorthand for "group by
specified non-aggregate fields in the select list".  Perhaps I'm not
being creative enough, but what is the interpretation/use case for
anything else? :-)

While there are other ways to accomplish these things, making an easy
way to GROUP BY with aggregate queries would be useful in the field,
particularly when doing iterative discovery work would save a lot of
time with a situation that is both detectable and hits users with
errors all the time.

I'm not married to the exact syntax of this feature; anything else
short and consistent could work if `ALL` is considered to potentially
gain a different interpretation in the future.

David



Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:
On Tue, Jul 23, 2024 at 8:21 AM Andrei Borodin <x4mmm@yandex-team.ru> wrote:
>
> On 23 Jul 2024, at 00:40, Isaac Morland <isaac.morland@gmail.com> wrote:
>
> odyssey=> select (uw_term).*, count(*) from uw_term group by uw_term;
> ERROR:  column "uw_term.term_id" must appear in the GROUP BY clause or be used in an aggregate function
> LINE 1: select (uw_term).*, count(*) from uw_term group by uw_term;
>
>
> AFAIR this problem was solved in my implementation [0]
>
> On 23 Jul 2024, at 01:29, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> (Personally, I'd wonder exactly what ALL is quantified over: the
> whole output of the FROM clause, or only columns mentioned in the
> SELECT tlist, or what? And why that choice rather than another?)
>
>
> I'd like to have GROUP BY AUTO (I also proposed version GROUP BY SURPRISE ME). But I wouldn't like to open pandora
boxof syntax sugar extensions which may will be incompatible with future standards. 
> If we could have extensible grammar - I'd be happy to have a lot of such enhancements. My top 2 are FROM table SELECT
columnand better GROUP BY. 

GROUP BY AUTO also seems fine here to me; I understand the desire to
avoid major incompatible syntax changes; GROUP BY ALL does exist in
multiple products so it's not unprecedented.

I wrote my patch before seeing your thread, sorry I missed that. :-)

David



Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:
On Mon, Jul 22, 2024 at 4:41 PM Isaac Morland <isaac.morland@gmail.com> wrote:

> And for when this might be useful, the syntax for it already exists, although a spurious error message is generated:
>
> odyssey=> select (uw_term).*, count(*) from uw_term group by uw_term;
> ERROR:  column "uw_term.term_id" must appear in the GROUP BY clause or be used in an aggregate function
> LINE 1: select (uw_term).*, count(*) from uw_term group by uw_term;
>                 ^

This is with my patch, or existing postgres?  Grouping by record is
not actually what this patch is trying to do, so perhaps there is some
ambiguity; this is intended to GROUP BY any select item that is not an
aggregate function.



Re: [PATCH] GROUP BY ALL

От
Laurenz Albe
Дата:
On Tue, 2024-07-23 at 08:37 -0500, David Christensen wrote:
> My intention here was to basically be a shorthand for "group by
> specified non-aggregate fields in the select list".  Perhaps I'm not
> being creative enough, but what is the interpretation/use case for
> anything else? :-)

I am somewhat against this feature.
It is too much magic for my taste.

It might be handy for interactive use, but I would frown at an application
that uses code like that, much like I'd frown at "SELECT *" in application code.

Yours,
Laurenz Albe



Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:
On Tue, Jul 23, 2024 at 10:57 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> On Tue, 2024-07-23 at 08:37 -0500, David Christensen wrote:
> > My intention here was to basically be a shorthand for "group by
> > specified non-aggregate fields in the select list".  Perhaps I'm not
> > being creative enough, but what is the interpretation/use case for
> > anything else? :-)
>
> I am somewhat against this feature.
> It is too much magic for my taste.
>
> It might be handy for interactive use, but I would frown at an application
> that uses code like that, much like I'd frown at "SELECT *" in application code.

Sure, not everything that makes things easier is strictly necessary;
we could require `CAST(field AS text)` instead of `::text`, make
subqueries required for transforming oids into specific system tables
instead of `::regfoo` casts,  any number of other choices, remove
`SELECT *` as a parse option, but making it easier to do common things
interactively as a DBA has value as well.

David



Re: [PATCH] GROUP BY ALL

От
"David G. Johnston"
Дата:
On Tue, Jul 23, 2024 at 9:48 AM David Christensen <david@pgguru.net> wrote:

Sure, not everything that makes things easier is strictly necessary;
we could require `CAST(field AS text)` instead of `::text`,

Probably should have...being standard and all.  Though syntactic sugar is quite different from new behavior - transforming :: to CAST is straight-forward.

make
subqueries required for transforming oids into specific system tables
instead of `::regfoo` casts,

Since OID is non-standard this falls within our purview.

  any number of other choices, remove
`SELECT *` as a parse option,

Again, standard dictated.

but making it easier to do common things
interactively as a DBA has value as well.


Agreed, but this isn't a clear-cut win, and doesn't have standard conformance to tip the scale over fully.

Also, there are so many better tools for data exploration.  Removing this quirk only marginally closes that gap.

David J.

Re: [PATCH] GROUP BY ALL

От
Paul Jungwirth
Дата:
On 7/22/24 15:43, Tom Lane wrote:
> Isaac Morland <isaac.morland@gmail.com> writes:
>> And for when this might be useful, the syntax for it already exists,
>> although a spurious error message is generated:
> 
>> odyssey=> select (uw_term).*, count(*) from uw_term group by uw_term;
>> ERROR:  column "uw_term.term_id" must appear in the GROUP BY clause or be
>> used in an aggregate function
>> LINE 1: select (uw_term).*, count(*) from uw_term group by uw_term;
>>                  ^
> 
>> I'm not sure exactly what's going on here
> 
> The SELECT entry is expanded into "uw_term.col1, uw_term.col2,
> uw_term.col3, ...", and those single-column Vars don't match the
> whole-row Var appearing in the GROUP BY list.  I guess if we
> think this is important, we could add a proof rule saying that
> a per-column Var is functionally dependent on a whole-row Var
> of the same relation.  Odd that the point hasn't come up before
> (though I guess that suggests that few people try this).

I was just using this group-by-row feature last week to implement a temporal outer join in a way 
that would work for arbitrary tables. Here is some example SQL:

https://github.com/pjungwir/temporal_ops/blob/b10d65323749faa6c47956db2e8f95441e508fce/sql/outer_join.sql#L48-L66

That does `GROUP BY a` then `SELECT (x.a).*`.[1]

It is very useful for writing queries that don't want to know about the structure of the row.

I noticed the same error as Isaac. I worked around the problem by wrapping it in a subquery and 
decomposing the row outside. It's already an obscure feature, and an easy workaround might be why 
you haven't heard complaints before. I wouldn't mind writing a patch for that rule when I get a 
chance (if no one else gets to it first.)

[1] Actually I see it does `GROUP BY a, a.valid_at`, but that is surely more than I need. I think 
that `a.valid_at` is leftover from a previous version of the query.

Yours,

-- 
Paul              ~{:-)
pj@illuminatedcomputing.com



Re: [PATCH] GROUP BY ALL

От
Peter Eisentraut
Дата:
On 23.07.24 00:29, Tom Lane wrote:
> (Personally, I'd wonder exactly what ALL is quantified over: the
> whole output of the FROM clause, or only columns mentioned in the
> SELECT tlist, or what? And why that choice rather than another?)

Looks like the main existing implementations take it to mean all entries 
in the SELECT list that are not aggregate functions.

https://duckdb.org/docs/sql/query_syntax/groupby.html#group-by-all
https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-qry-select-groupby.html#parameters
https://docs.snowflake.com/en/sql-reference/constructs/group-by#parameters




Re: [PATCH] GROUP BY ALL

От
Jelte Fennema-Nio
Дата:
On Mon, 22 Jul 2024 at 22:55, David Christensen <david@pgguru.net> wrote:
> I see that there'd been some chatter but not a lot of discussion about
> a GROUP BY ALL feature/functionality.  There certainly is utility in
> such a construct IMHO.

+1 from me. When exploring data, this is extremely useful because you
don't have to update the GROUP BY clause every time

Regarding the arguments against this:
1. I don't think this is any more unreadable than being able to GROUP
BY 1, 2, 3. Or being able to use column aliases from the SELECT in the
GROUP BY clause. Again this is already allowed. Personally I actually
think it's more readable than specifying e.g. 5 columns in the group
by, because then I have to cross-reference with columns in the SELECT
clause to find out if they are the same. With ALL I instantly know
it's grouped by all
2. This is indeed not part of the standard. But we have many things
that are not part of the standard. I think as long as we use the same
syntax as snowflake, databricks and duckdb I personally don't see a
big problem. Then we could try and make this be part of the standard
in the next version of the standard.



Re: [PATCH] GROUP BY ALL

От
Jelte Fennema-Nio
Дата:
On Tue, 23 Jul 2024 at 15:22, Andrei Borodin <x4mmm@yandex-team.ru> wrote:
> I'd like to have GROUP BY AUTO (I also proposed version GROUP BY SURPRISE ME). But I wouldn't like to open pandora
boxof syntax sugar extensions which may will be incompatible with future standards.
 
> If we could have extensible grammar - I'd be happy to have a lot of such enhancements. My top 2 are FROM table SELECT
columnand better GROUP BY.
 

Personally my number one enhancement would be allowing a trailing
comma after the last column in the SELECT clause.



Re: [PATCH] GROUP BY ALL

От
Pavel Stehule
Дата:
Hi

st 24. 7. 2024 v 10:57 odesílatel Jelte Fennema-Nio <postgres@jeltef.nl> napsal:
On Mon, 22 Jul 2024 at 22:55, David Christensen <david@pgguru.net> wrote:
> I see that there'd been some chatter but not a lot of discussion about
> a GROUP BY ALL feature/functionality.  There certainly is utility in
> such a construct IMHO.

+1 from me. When exploring data, this is extremely useful because you
don't have to update the GROUP BY clause every time

Regarding the arguments against this:
1. I don't think this is any more unreadable than being able to GROUP
BY 1, 2, 3. Or being able to use column aliases from the SELECT in the
GROUP BY clause. Again this is already allowed. Personally I actually
think it's more readable than specifying e.g. 5 columns in the group
by, because then I have to cross-reference with columns in the SELECT
clause to find out if they are the same. With ALL I instantly know
it's grouped by all
2. This is indeed not part of the standard. But we have many things
that are not part of the standard. I think as long as we use the same
syntax as snowflake, databricks and duckdb I personally don't see a
big problem. Then we could try and make this be part of the standard
in the next version of the standard.

 Aggregation against more columns is pretty slow and memory expensive in Postgres.

DuckDB is an analytic database with different storage, different executors. I like it very much, but I am not sure if I want to see these features in Postgres.

Lot of developers are not very smart, and with proposed feature, then instead to write correct and effective query, then they use just GROUP BY ALL. Slow query should look like a slow query :-)

Regards

Pavel


Re: [PATCH] GROUP BY ALL

От
"Andrey M. Borodin"
Дата:

> On 24 Jul 2024, at 13:58, Jelte Fennema-Nio <postgres@jeltef.nl> wrote:
>
> On Tue, 23 Jul 2024 at 15:22, Andrei Borodin <x4mmm@yandex-team.ru> wrote:
>> I'd like to have GROUP BY AUTO (I also proposed version GROUP BY SURPRISE ME). But I wouldn't like to open pandora
boxof syntax sugar extensions which may will be incompatible with future standards. 
>> If we could have extensible grammar - I'd be happy to have a lot of such enhancements. My top 2 are FROM table
SELECTcolumn and better GROUP BY. 
>
> Personally my number one enhancement would be allowing a trailing
> comma after the last column in the SELECT clause.

Yes, trailing comma sounds great too.

One more similar syntax sugar I can think of. I see lots of queries like
SELECT somtheing
FROM table1
WHERE 1=1
and id = x
--and col1 = val1
and col2 = val2

I was wondering where does that "1=1" comes from. It's because developer comment condition one by one like "--, col1 =
val1".And they do not want to cope with and\or continuation. 


Best regards, Andrey Borodin.

PS. Seems like Mail.App mangled my previous message despite using plain text. It's totally broken in archives...


Re: [PATCH] GROUP BY ALL

От
Ashutosh Bapat
Дата:
On Tue, Jul 23, 2024 at 6:53 PM David Christensen <david@pgguru.net> wrote:
>
> On Mon, Jul 22, 2024 at 4:34 PM David G. Johnston
> <david.g.johnston@gmail.com> wrote:
> >
> > On Mon, Jul 22, 2024 at 1:55 PM David Christensen <david@pgguru.net> wrote:
> >>
> >> I see that there'd been some chatter but not a lot of discussion about
> >> a GROUP BY ALL feature/functionality.  There certainly is utility in
> >> such a construct IMHO.
> >>
> >> Still need some docs; just throwing this out there and getting some feedback.
> >>
> >
> > I strongly dislike adding this feature.  I'd only consider supporting it if it was part of the SQL standard.
> >
> > Code is written once and read many times.  This feature caters to the writer, not the reader.  And furthermore
usageof this is prone to be to the writer's detriment as well. 
>
> I'd say this feature (at least for me) caters to the investigator;
> someone who is interactively looking at data hence why it would cater
> to the writer.  Consider acquainting yourself with a large table that
> has a large number of annoying-named fields where you want to look at
> how different data is correlated or broken-down.  Something along the
> lines of:

To me this looks like a feature that a data exploration tool may
implement instead of being part of the server. It would then provide
more statistics about each correlation/column set etc.

--
Best Wishes,
Ashutosh Bapat



Re: [PATCH] GROUP BY ALL

От
Greg Sabino Mullane
Дата:
On Tue, Jul 23, 2024 at 9:37 AM David Christensen <david@pgguru.net> wrote:
I'm not married to the exact syntax of this feature; anything else short and consistent could work if `ALL` is considered to potentially
gain a different interpretation in the future.

GROUP BY *

Just kidding. But a big +1 to the whole concept. It would have been extraordinarily useful over the years.

Cheers,
Greg

Re: [PATCH] GROUP BY ALL

От
Jelte Fennema-Nio
Дата:
On Tue, 23 Jul 2024 at 22:02, Peter Eisentraut <peter@eisentraut.org> wrote:
> Looks like the main existing implementations take it to mean all entries
> in the SELECT list that are not aggregate functions.
>
> https://duckdb.org/docs/sql/query_syntax/groupby.html#group-by-all
> https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-qry-select-groupby.html#parameters
> https://docs.snowflake.com/en/sql-reference/constructs/group-by#parameters

Oracle added support for GROUP BY ALL too now:
https://danischnider.wordpress.com/2025/08/05/oracle-23-9-supports-group-by-all/



Re: [PATCH] GROUP BY ALL

От
Peter Eisentraut
Дата:
On 17.08.25 19:12, Jelte Fennema-Nio wrote:
> On Tue, 23 Jul 2024 at 22:02, Peter Eisentraut <peter@eisentraut.org> wrote:
>> Looks like the main existing implementations take it to mean all entries
>> in the SELECT list that are not aggregate functions.
>>
>> https://duckdb.org/docs/sql/query_syntax/groupby.html#group-by-all
>> https://docs.databricks.com/en/sql/language-manual/sql-ref-syntax-qry-select-groupby.html#parameters
>> https://docs.snowflake.com/en/sql-reference/constructs/group-by#parameters
> 
> Oracle added support for GROUP BY ALL too now:
> https://danischnider.wordpress.com/2025/08/05/oracle-23-9-supports-group-by-all/

The proposal for GROUP BY ALL was accepted into the SQL standard draft 
yesterday.  So maybe someone wants to take this up again.

The initially proposed patch appears to have the right idea overall. 
But it does not handle more complex cases like

     SELECT a, SUM(b)+a FROM t1 GROUP BY ALL;

correctly.  The piece of code that does

     if (!IsA(n->expr,Aggref))

should be generalized to check for aggregates not only at the top level.

(For explanation:  GROUP BY ALL expands to all select list entries that 
do not contain aggregates.  So the above would expand to

     SELECT a, SUM(b)+a FROM t1 GROUP BY a;

which should then be rejected based on the existing rules.)



Re: [PATCH] GROUP BY ALL

От
Andrey Borodin
Дата:

> On 26 Sep 2025, at 12:02, Peter Eisentraut <peter@eisentraut.org> wrote:
>
> maybe someone wants to take this up again.

I've compared my old patch with David's, and his version looks better.
If David is not going to work on this in nearest future - I'l like to pick up the work.
Either way I'll join as reviewer.

Thanks!


Best regards, Andrey Borodin.


Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:

> On Sep 26, 2025, at 3:45 AM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
> 
>
>> On 26 Sep 2025, at 12:02, Peter Eisentraut <peter@eisentraut.org> wrote:
>>
>> maybe someone wants to take this up again.
>
> I've compared my old patch with David's, and his version looks better.
> If David is not going to work on this in nearest future - I'l like to pick up the work.
> Either way I'll join as reviewer.
>
> Thanks!

I’m interested in picking it up again but would appreciate the review.

Thanks,

David


Re: [PATCH] GROUP BY ALL

От
Tom Lane
Дата:
Peter Eisentraut <peter@eisentraut.org> writes:
> The initially proposed patch appears to have the right idea overall. 
> But it does not handle more complex cases like
>      SELECT a, SUM(b)+a FROM t1 GROUP BY ALL;

> (For explanation:  GROUP BY ALL expands to all select list entries that 
> do not contain aggregates.  So the above would expand to
>      SELECT a, SUM(b)+a FROM t1 GROUP BY a;
> which should then be rejected based on the existing rules.)

I thought I understood this definition, up till your last
comment.  What's invalid about that expanded query?

regression=# create table t1 (a int, b int);
CREATE TABLE
regression=# SELECT a, SUM(b)+a FROM t1 GROUP BY a;
 a | ?column? 
---+----------
(0 rows)


            regards, tom lane



Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:
On Fri, Sep 26, 2025 at 6:16 AM David Christensen <david@pgguru.net> wrote:
>
>
>
> > On Sep 26, 2025, at 3:45 AM, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
> >
> > 
> >
> >> On 26 Sep 2025, at 12:02, Peter Eisentraut <peter@eisentraut.org> wrote:
> >>
> >> maybe someone wants to take this up again.
> >
> > I've compared my old patch with David's, and his version looks better.
> > If David is not going to work on this in nearest future - I'l like to pick up the work.
> > Either way I'll join as reviewer.
> >
> > Thanks!
>
> I’m interested in picking it up again but would appreciate the review.

Here is a rebased version with a few more tests.  I also changed the
main check here to using `!contain_agg_clause` instead of
`!IsA(Aggref))` directly.  (This was defined in `optimizer/clauses.h`,
but we already are pulling in `optimizer.h`, so it felt valid to me.)

David

Вложения

Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:

Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:
On Fri, Sep 26, 2025 at 9:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Peter Eisentraut <peter@eisentraut.org> writes:
> > The initially proposed patch appears to have the right idea overall.
> > But it does not handle more complex cases like
> >      SELECT a, SUM(b)+a FROM t1 GROUP BY ALL;
>
> > (For explanation:  GROUP BY ALL expands to all select list entries that
> > do not contain aggregates.  So the above would expand to
> >      SELECT a, SUM(b)+a FROM t1 GROUP BY a;
> > which should then be rejected based on the existing rules.)
>
> I thought I understood this definition, up till your last
> comment.  What's invalid about that expanded query?
>
> regression=# create table t1 (a int, b int);
> CREATE TABLE
> regression=# SELECT a, SUM(b)+a FROM t1 GROUP BY a;
>  a | ?column?
> ---+----------
> (0 rows)

Agreed that this shouldn't be an error; added a similar test case to
v2 of this patch.

David



Re: [PATCH] GROUP BY ALL

От
Peter Eisentraut
Дата:
On 26.09.25 16:11, Tom Lane wrote:
> Peter Eisentraut <peter@eisentraut.org> writes:
>> The initially proposed patch appears to have the right idea overall.
>> But it does not handle more complex cases like
>>       SELECT a, SUM(b)+a FROM t1 GROUP BY ALL;
> 
>> (For explanation:  GROUP BY ALL expands to all select list entries that
>> do not contain aggregates.  So the above would expand to
>>       SELECT a, SUM(b)+a FROM t1 GROUP BY a;
>> which should then be rejected based on the existing rules.)
> 
> I thought I understood this definition, up till your last
> comment.  What's invalid about that expanded query?
> 
> regression=# create table t1 (a int, b int);
> CREATE TABLE
> regression=# SELECT a, SUM(b)+a FROM t1 GROUP BY a;
>   a | ?column?
> ---+----------
> (0 rows)

This was a sloppy example.  Here is a better one:

     create table t1 (a int, b int, c int);

     select a, sum(b)+c from t1 group by all;

This is equivalent to

     select a, sum(b)+c from t1 group by a;

which would be rejected as

     ERROR:  column "t1.c" must appear in the GROUP BY clause or be used
     in an aggregate function




Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:
On Fri, Sep 26, 2025 at 10:54 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>
> On 26.09.25 16:11, Tom Lane wrote:
> > Peter Eisentraut <peter@eisentraut.org> writes:
> >> The initially proposed patch appears to have the right idea overall.
> >> But it does not handle more complex cases like
> >>       SELECT a, SUM(b)+a FROM t1 GROUP BY ALL;
> >
> >> (For explanation:  GROUP BY ALL expands to all select list entries that
> >> do not contain aggregates.  So the above would expand to
> >>       SELECT a, SUM(b)+a FROM t1 GROUP BY a;
> >> which should then be rejected based on the existing rules.)
> >
> > I thought I understood this definition, up till your last
> > comment.  What's invalid about that expanded query?
> >
> > regression=# create table t1 (a int, b int);
> > CREATE TABLE
> > regression=# SELECT a, SUM(b)+a FROM t1 GROUP BY a;
> >   a | ?column?
> > ---+----------
> > (0 rows)
>
> This was a sloppy example.  Here is a better one:
>
>      create table t1 (a int, b int, c int);
>
>      select a, sum(b)+c from t1 group by all;
>
> This is equivalent to
>
>      select a, sum(b)+c from t1 group by a;
>
> which would be rejected as
>
>      ERROR:  column "t1.c" must appear in the GROUP BY clause or be used
>      in an aggregate function

Verified that with v2 that this is what happens in this case; will
include this and whatever other feedback there is in a v3.



Re: [PATCH] GROUP BY ALL

От
Tom Lane
Дата:
David Christensen <david@pgguru.net> writes:
> Here is a rebased version with a few more tests.  I also changed the
> main check here to using `!contain_agg_clause` instead of
> `!IsA(Aggref))` directly.  (This was defined in `optimizer/clauses.h`,
> but we already are pulling in `optimizer.h`, so it felt valid to me.)

contain_agg_clause will blow up on a SubLink, so I doubt this is
gonna be robust.

            regards, tom lane



Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:
On Fri, Sep 26, 2025 at 11:05 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Christensen <david@pgguru.net> writes:
> > Here is a rebased version with a few more tests.  I also changed the
> > main check here to using `!contain_agg_clause` instead of
> > `!IsA(Aggref))` directly.  (This was defined in `optimizer/clauses.h`,
> > but we already are pulling in `optimizer.h`, so it felt valid to me.)
>
> contain_agg_clause will blow up on a SubLink, so I doubt this is
> gonna be robust.

Fair enough, see that Assert now; easy enough to make a new
expression_tree_walker that only looks for Aggref and short-circuits
SubLink (which I assume is the right behavior here, but might have to
add some more tests/play around with subqueries in the GROUP BY ALL
part).

David



Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:
On Fri, Sep 26, 2025 at 11:13 AM David Christensen <david@pgguru.net> wrote:
>
> On Fri, Sep 26, 2025 at 11:05 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > David Christensen <david@pgguru.net> writes:
> > > Here is a rebased version with a few more tests.  I also changed the
> > > main check here to using `!contain_agg_clause` instead of
> > > `!IsA(Aggref))` directly.  (This was defined in `optimizer/clauses.h`,
> > > but we already are pulling in `optimizer.h`, so it felt valid to me.)
> >
> > contain_agg_clause will blow up on a SubLink, so I doubt this is
> > gonna be robust.
>
> Fair enough, see that Assert now; easy enough to make a new
> expression_tree_walker that only looks for Aggref and short-circuits
> SubLink (which I assume is the right behavior here, but might have to
> add some more tests/play around with subqueries in the GROUP BY ALL
> part).

Or contain_aggs_of_level(), assuming I can suss out how to get the
current level... :D  Anyway, will mess with this for a bit.

David



Re: [PATCH] GROUP BY ALL

От
Tom Lane
Дата:
David Christensen <david@pgguru.net> writes:
> On Fri, Sep 26, 2025 at 11:05 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> contain_agg_clause will blow up on a SubLink, so I doubt this is
>> gonna be robust.

> Fair enough, see that Assert now; easy enough to make a new
> expression_tree_walker that only looks for Aggref and short-circuits
> SubLink (which I assume is the right behavior here, but might have to
> add some more tests/play around with subqueries in the GROUP BY ALL
> part).

No, I think the correct behavior would have to be to descend into
SubLinks to see if they contain any aggregates belonging to the
outer query level.

However (looks around) we do already have that code.
See contain_aggs_of_level.  (contain_agg_clause is essentially
a simplified version that is okay to use in the planner because
it's already gotten rid of sublinks.)

What mainly concerns me at this point is whether we've identified
aggregate levels at the point in parsing where you want to run this.
I have a bit of a worry that that might interact with grouping.
Presumably the SQL committee thought about that, so it's probably
soluble, but ...

            regards, tom lane



Re: [PATCH] GROUP BY ALL

От
Tom Lane
Дата:
David Christensen <david@pgguru.net> writes:
> Or contain_aggs_of_level(), assuming I can suss out how to get the
> current level... :D  Anyway, will mess with this for a bit.

Current level is zero by definition ...

            regards, tom lane



Re: [PATCH] GROUP BY ALL

От
jian he
Дата:
On Fri, Sep 26, 2025 at 11:46 PM David Christensen <david@pgguru.net> wrote:
>
> >
> > I’m interested in picking it up again but would appreciate the review.
>
> Here is a rebased version with a few more tests.  I also changed the
> main check here to using `!contain_agg_clause` instead of
> `!IsA(Aggref))` directly.  (This was defined in `optimizer/clauses.h`,
> but we already are pulling in `optimizer.h`, so it felt valid to me.)
>

hi.
I only briefly browse the patch text file, so forgive me.
seems missing deparse regress tests

i think you may need one test like:

create view v1 as SELECT b, COUNT(*) FROM t1 GROUP BY ALL;
\sv v1



Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:
On Fri, Sep 26, 2025 at 11:38 AM jian he <jian.universality@gmail.com> wrote:
>
> On Fri, Sep 26, 2025 at 11:46 PM David Christensen <david@pgguru.net> wrote:
> >
> > >
> > > I’m interested in picking it up again but would appreciate the review.
> >
> > Here is a rebased version with a few more tests.  I also changed the
> > main check here to using `!contain_agg_clause` instead of
> > `!IsA(Aggref))` directly.  (This was defined in `optimizer/clauses.h`,
> > but we already are pulling in `optimizer.h`, so it felt valid to me.)
> >
>
> hi.
> I only briefly browse the patch text file, so forgive me.
> seems missing deparse regress tests
>
> i think you may need one test like:
>
> create view v1 as SELECT b, COUNT(*) FROM t1 GROUP BY ALL;
> \sv v1

Thanks, good suggestion; not sure the appropriate final location for
this, but it did show a bug in the current patch.

David



Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:
Version 3 of this patch, incorporating some of the feedback thus far:


On Fri, Sep 26, 2025 at 11:56 AM David Christensen <david@pgguru.net> wrote:
>
> On Fri, Sep 26, 2025 at 11:38 AM jian he <jian.universality@gmail.com> wrote:
> >
> > On Fri, Sep 26, 2025 at 11:46 PM David Christensen <david@pgguru.net> wrote:
> > >
> > > >
> > > > I’m interested in picking it up again but would appreciate the review.
> > >
> > > Here is a rebased version with a few more tests.  I also changed the
> > > main check here to using `!contain_agg_clause` instead of
> > > `!IsA(Aggref))` directly.  (This was defined in `optimizer/clauses.h`,
> > > but we already are pulling in `optimizer.h`, so it felt valid to me.)
> > >
> >
> > hi.
> > I only briefly browse the patch text file, so forgive me.
> > seems missing deparse regress tests
> >
> > i think you may need one test like:
> >
> > create view v1 as SELECT b, COUNT(*) FROM t1 GROUP BY ALL;
> > \sv v1
>
> Thanks, good suggestion; not sure the appropriate final location for
> this, but it did show a bug in the current patch.
>
> David

Вложения

Re: [PATCH] GROUP BY ALL

От
Tom Lane
Дата:
David Christensen <david@pgguru.net> writes:
> Version 3 of this patch, incorporating some of the feedback thus far:

Some random comments:

I don't love where you put the parsing code.  Instead of
exposing addTargetToGroupList, I think you should have given
transformGroupClause the responsibility of expanding GROUP BY ALL.
One reason for that is that GROUP BY processing depends on the
results of transformSortClause.  This means that in v3, the
behavior of

     SELECT x FROM ... GROUP BY x ORDER BY x;

will be subtly different from

     SELECT x FROM ... GROUP BY ALL ORDER BY x;

which seems like a bug, or at least not desirable.  (You might need a
DESC or USING decoration in the ORDER BY to expose this clearly.)

The parsing code itself is not great:

+            TargetEntry *n = (TargetEntry*)lfirst(l1);
+            if (!contain_aggs_of_level((Node *)n->expr, 0))
+                qry->groupClause = addTargetToGroupList(pstate, n, qry->groupClause, qry->targetList, 0);

You should be skipping resjunk entries, and please use "tle" or some
such name less generic than "n", and "0" is not the correct location
to pass to addTargetToGroupList.  Probably the location of the ALL
keyword would be the ideal thing, but if we don't have that, the
notation for "no location known" is -1 not 0.  We could also
consider using exprLocation(tle->expr), despite the comment on
addTargetToGroupList that that's not the right thing.  This might
actually be better than pointing at the ALL anyway, since that would
give no hint which targetentry caused the error.

The test cases seem poorly designed, because it's very hard to be
sure whether the code expanded the ALL as-expected.  I think you
could improve them by not executing the queries but just EXPLAINing
them, so that the expanded group-by list is directly visible.

            regards, tom lane



Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:
On Fri, Sep 26, 2025 at 12:46 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Some random comments:

[snip]

Great feedback, thanks; attached is v4 which should address these comments.

David

Вложения

Re: [PATCH] GROUP BY ALL

От
Tom Lane
Дата:
Peter Eisentraut <peter@eisentraut.org> writes:
> On 26.09.25 16:11, Tom Lane wrote:
>> I thought I understood this definition, up till your last
>> comment.  What's invalid about that expanded query?

> This was a sloppy example.  Here is a better one:
>      create table t1 (a int, b int, c int);
>      select a, sum(b)+c from t1 group by all;
> This is equivalent to
>      select a, sum(b)+c from t1 group by a;
> which would be rejected as
>      ERROR:  column "t1.c" must appear in the GROUP BY clause or be used
>      in an aggregate function

Got it, mostly.  There is an edge case, though: what if there are no
candidate grouping items?  I see these test cases in David's patch:

+-- oops all aggregates
+EXPLAIN (COSTS OFF) SELECT COUNT(a), SUM(b) FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ Aggregate
+   ->  Seq Scan on t1
+(2 rows)
+
+-- empty column list
+EXPLAIN (COSTS OFF) SELECT FROM t1 GROUP BY ALL;
+   QUERY PLAN
+----------------
+ Seq Scan on t1
+(1 row)

That is, in such cases the patch behaves as if there were no GROUP BY
clause at all, which seems kinda dubious.  Should this be an error,
and if not what's it supposed to do?  The second case is outside the
SQL spec so I'm not expecting guidance on that, but surely the
committee thought about the first case.

Also, what about window functions in the tlist?  If you do

regression=# explain select sum(q1) over(partition by q2) from int8_tbl group by 1;

you get

ERROR:  window functions are not allowed in GROUP BY
LINE 1: explain select sum(q1) over(partition by q2) from int8_tbl g...
                       ^

but that's not what is happening with

regression=# explain select sum(q1) over(partition by q2) from int8_tbl group by all;
ERROR:  column "int8_tbl.q2" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: explain select sum(q1) over(partition by q2) from int8_tbl g...
                                                 ^

(I didn't stop to figure out why this isn't giving the same error, but
maybe it's an order-of-checks thing.)  In any case: should this give
"window functions are not allowed in GROUP BY", or should the
window-function-containing tlist item be silently skipped by GROUP BY
ALL?  Trying to make it work is surely not the right answer.

            regards, tom lane



Re: [PATCH] GROUP BY ALL

От
Tom Lane
Дата:
[ trimming the cc: list because gmail decided my last was spam ]

David Christensen <david@pgguru.net> writes:
> Great feedback, thanks; attached is v4 which should address these comments.

I did a pass of cleanup over this --- mostly cosmetic, but not
entirely.  Along the way I discovered a pre-existing bug:
transformSelectStmt does
     qry->groupDistinct = stmt->groupDistinct;
but transformPLAssignStmt fails to, with the result that GROUP BY
DISTINCT would misbehave if used in a plpgsql "expression" context.
I'm not hugely surprised that no one has reported that from the
field, but nonetheless it's broken.  In the attached v5 I just
quickly added the missing line and moved on, but we'll need to
back-patch that bit.  (Maybe we'd be well advised to refactor to
reduce the amount of duplicated code that needs to be kept in sync?)

I have not attempted to address the definitional issues I just
queried Peter about.  Other open items:

* Documentation

* The test cases deserve reconsideration now that we think their
charter only goes as far as EXPLAIN'ing the results; some of them
seem pretty redundant in this context.

            regards, tom lane

diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index b9763ea1714..fe632d3dabf 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -1440,12 +1440,14 @@ transformSelectStmt(ParseState *pstate, SelectStmt *stmt)

     qry->groupClause = transformGroupClause(pstate,
                                             stmt->groupClause,
+                                            stmt->groupAll,
                                             &qry->groupingSets,
                                             &qry->targetList,
                                             qry->sortClause,
                                             EXPR_KIND_GROUP_BY,
                                             false /* allow SQL92 rules */ );
     qry->groupDistinct = stmt->groupDistinct;
+    qry->groupAll = stmt->groupAll;

     if (stmt->distinctClause == NIL)
     {
@@ -2930,11 +2932,14 @@ transformPLAssignStmt(ParseState *pstate, PLAssignStmt *stmt)

     qry->groupClause = transformGroupClause(pstate,
                                             sstmt->groupClause,
+                                            sstmt->groupAll,
                                             &qry->groupingSets,
                                             &qry->targetList,
                                             qry->sortClause,
                                             EXPR_KIND_GROUP_BY,
                                             false /* allow SQL92 rules */ );
+    qry->groupDistinct = sstmt->groupDistinct;
+    qry->groupAll = sstmt->groupAll;

     if (sstmt->distinctClause == NIL)
     {
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 9fd48acb1f8..8a0aa1899e2 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -120,6 +120,7 @@ typedef struct SelectLimit
 typedef struct GroupClause
 {
     bool        distinct;
+    bool        all;
     List       *list;
 } GroupClause;

@@ -12993,6 +12994,7 @@ simple_select:
                     n->whereClause = $6;
                     n->groupClause = ($7)->list;
                     n->groupDistinct = ($7)->distinct;
+                    n->groupAll = ($7)->all;
                     n->havingClause = $8;
                     n->windowClause = $9;
                     $$ = (Node *) n;
@@ -13010,6 +13012,7 @@ simple_select:
                     n->whereClause = $6;
                     n->groupClause = ($7)->list;
                     n->groupDistinct = ($7)->distinct;
+                    n->groupAll = ($7)->all;
                     n->havingClause = $8;
                     n->windowClause = $9;
                     $$ = (Node *) n;
@@ -13507,14 +13510,24 @@ group_clause:
                     GroupClause *n = (GroupClause *) palloc(sizeof(GroupClause));

                     n->distinct = $3 == SET_QUANTIFIER_DISTINCT;
+                    n->all = false;
                     n->list = $4;
                     $$ = n;
                 }
+            | GROUP_P BY ALL
+                {
+                    GroupClause *n = (GroupClause *) palloc(sizeof(GroupClause));
+                    n->distinct = false;
+                    n->all = true;
+                    n->list = NIL;
+                    $$ = n;
+                }
             | /*EMPTY*/
                 {
                     GroupClause *n = (GroupClause *) palloc(sizeof(GroupClause));

                     n->distinct = false;
+                    n->all = false;
                     n->list = NIL;
                     $$ = n;
                 }
@@ -17618,6 +17631,7 @@ PLpgSQL_Expr: opt_distinct_clause opt_target_list
                     n->whereClause = $4;
                     n->groupClause = ($5)->list;
                     n->groupDistinct = ($5)->distinct;
+                    n->groupAll = ($5)->all;
                     n->havingClause = $6;
                     n->windowClause = $7;
                     n->sortClause = $8;
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 9f20a70ce13..562e3e90e34 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -2598,6 +2598,9 @@ transformGroupingSet(List **flatresult,
  * GROUP BY items will be added to the targetlist (as resjunk columns)
  * if not already present, so the targetlist must be passed by reference.
  *
+ * If GROUP BY ALL is specified, the groupClause will be inferred to be all
+ * non-aggregate expressions in the targetlist.
+ *
  * This is also used for window PARTITION BY clauses (which act almost the
  * same, but are always interpreted per SQL99 rules).
  *
@@ -2622,6 +2625,7 @@ transformGroupingSet(List **flatresult,
  *
  * pstate        ParseState
  * grouplist    clause to transform
+ * groupByAll    is this a GROUP BY ALL statement?
  * groupingSets reference to list to contain the grouping set tree
  * targetlist    reference to TargetEntry list
  * sortClause    ORDER BY clause (SortGroupClause nodes)
@@ -2629,7 +2633,8 @@ transformGroupingSet(List **flatresult,
  * useSQL99        SQL99 rather than SQL92 syntax
  */
 List *
-transformGroupClause(ParseState *pstate, List *grouplist, List **groupingSets,
+transformGroupClause(ParseState *pstate, List *grouplist, bool groupByAll,
+                     List **groupingSets,
                      List **targetlist, List *sortClause,
                      ParseExprKind exprKind, bool useSQL99)
 {
@@ -2640,6 +2645,41 @@ transformGroupClause(ParseState *pstate, List *grouplist, List **groupingSets,
     bool        hasGroupingSets = false;
     Bitmapset  *seen_local = NULL;

+    /* Handle GROUP BY ALL */
+    if (groupByAll)
+    {
+        /* There cannot have been any explicit list items */
+        Assert(grouplist == NIL);
+
+        /*
+         * Iterate over targets, any non-aggregate gets added as a Target.
+         * Note that it's not enough to check for a top-level Aggref; we need
+         * to ensure that any sub-expression here does not include an Aggref
+         * (for instance an expression such as `sum(col) + 4` should not be
+         * added as a grouping target).
+         *
+         * We specify the parse location as the TLE's location, despite the
+         * comment for addTargetToGroupList discouraging that.  The only other
+         * thing we could point to is the ALL keyword, which seems unhelpful
+         * when there are multiple TLEs.
+         */
+        foreach_ptr(TargetEntry, tle, *targetlist)
+        {
+            /* Ignore junk TLEs */
+            if (tle->resjunk)
+                continue;
+
+            /* Use contain_aggs_of_level to detect Aggrefs in SubLinks */
+            if (!(pstate->p_hasAggs &&
+                  contain_aggs_of_level((Node *) tle->expr, 0)))
+                result = addTargetToGroupList(pstate, tle,
+                                              result, *targetlist,
+                                              exprLocation((Node *) tle->expr));
+        }
+
+        return result;
+    }
+
     /*
      * Recursively flatten implicit RowExprs. (Technically this is only needed
      * for GROUP BY, per the syntax rules for grouping sets, but we do it
@@ -2818,6 +2858,7 @@ transformWindowDefinitions(ParseState *pstate,
                                           true /* force SQL99 rules */ );
         partitionClause = transformGroupClause(pstate,
                                                windef->partitionClause,
+                                               false /* not GROUP BY ALL */ ,
                                                NULL,
                                                targetlist,
                                                orderClause,
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index defcdaa8b34..d0c6d3b55bd 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -6186,7 +6186,9 @@ get_basic_select_query(Query *query, deparse_context *context)
         save_ingroupby = context->inGroupBy;
         context->inGroupBy = true;

-        if (query->groupingSets == NIL)
+        if (query->groupAll)
+            appendStringInfoString(buf, "ALL");
+        else if (query->groupingSets == NIL)
         {
             sep = "";
             foreach(l, query->groupClause)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 4ed14fc5b78..d379879ae80 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -214,7 +214,8 @@ typedef struct Query
     List       *returningList;    /* return-values list (of TargetEntry) */

     List       *groupClause;    /* a list of SortGroupClause's */
-    bool        groupDistinct;    /* is the group by clause distinct? */
+    bool        groupDistinct;    /* was GROUP BY DISTINCT used? */
+    bool        groupAll;        /* was GROUP BY ALL used? */

     List       *groupingSets;    /* a list of GroupingSet's if present */

@@ -2192,6 +2193,7 @@ typedef struct SelectStmt
     Node       *whereClause;    /* WHERE qualification */
     List       *groupClause;    /* GROUP BY clauses */
     bool        groupDistinct;    /* Is this GROUP BY DISTINCT? */
+    bool        groupAll;        /* Is this GROUP BY ALL? */
     Node       *havingClause;    /* HAVING conditional-expression */
     List       *windowClause;    /* WINDOW window_name AS (...), ... */

diff --git a/src/include/parser/parse_clause.h b/src/include/parser/parse_clause.h
index 3e9894926de..ede3903d1dd 100644
--- a/src/include/parser/parse_clause.h
+++ b/src/include/parser/parse_clause.h
@@ -26,6 +26,7 @@ extern Node *transformLimitClause(ParseState *pstate, Node *clause,
                                   ParseExprKind exprKind, const char *constructName,
                                   LimitOption limitOption);
 extern List *transformGroupClause(ParseState *pstate, List *grouplist,
+                                  bool groupByAll,
                                   List **groupingSets,
                                   List **targetlist, List *sortClause,
                                   ParseExprKind exprKind, bool useSQL99);
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index 1f24f6ffd1f..aeeb452ec26 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -1557,6 +1557,153 @@ drop table t2;
 drop table t3;
 drop table p_t1;
 --
+-- Test GROUP BY ALL
+--
+-- We don't care about the data here, just the proper transformation of the
+-- GROUP BY clause, so test some queries and verify the EXPLAIN plans.
+CREATE TEMP TABLE t1 (
+  a int,
+  b int,
+  c int
+);
+-- basic field check
+EXPLAIN (COSTS OFF) SELECT b, COUNT(*) FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ HashAggregate
+   Group Key: b
+   ->  Seq Scan on t1
+(3 rows)
+
+-- throw a null in the values too
+EXPLAIN (COSTS OFF) SELECT a, COUNT(a) FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ HashAggregate
+   Group Key: a
+   ->  Seq Scan on t1
+(3 rows)
+
+-- multiple columns, non-consecutive order
+EXPLAIN (COSTS OFF) SELECT a, SUM(b), b FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ HashAggregate
+   Group Key: a, b
+   ->  Seq Scan on t1
+(3 rows)
+
+-- multi columns, no aggregate
+EXPLAIN (COSTS OFF) SELECT a + b FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ HashAggregate
+   Group Key: (a + b)
+   ->  Seq Scan on t1
+(3 rows)
+
+-- non-top-level aggregate
+EXPLAIN (COSTS OFF) SELECT a, SUM(b) + 4 FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ HashAggregate
+   Group Key: a
+   ->  Seq Scan on t1
+(3 rows)
+
+-- including grouped column
+EXPLAIN (COSTS OFF) SELECT a, SUM(b) + a FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ HashAggregate
+   Group Key: a
+   ->  Seq Scan on t1
+(3 rows)
+
+-- oops all aggregates
+EXPLAIN (COSTS OFF) SELECT COUNT(a), SUM(b) FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ Aggregate
+   ->  Seq Scan on t1
+(2 rows)
+
+-- empty column list
+EXPLAIN (COSTS OFF) SELECT FROM t1 GROUP BY ALL;
+   QUERY PLAN
+----------------
+ Seq Scan on t1
+(1 row)
+
+-- filter
+EXPLAIN (COSTS OFF) SELECT a, COUNT(a) FILTER(WHERE b = 2) FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ HashAggregate
+   Group Key: a
+   ->  Seq Scan on t1
+(3 rows)
+
+-- all cols
+EXPLAIN (COSTS OFF) SELECT * FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ HashAggregate
+   Group Key: a, b, c
+   ->  Seq Scan on t1
+(3 rows)
+
+EXPLAIN (COSTS OFF) SELECT *, count(*) FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ HashAggregate
+   Group Key: a, b, c
+   ->  Seq Scan on t1
+(3 rows)
+
+-- expression without including aggregate columns
+EXPLAIN (COSTS OFF) SELECT a, SUM(b) + c FROM t1 GROUP BY ALL;
+ERROR:  column "t1.c" must appear in the GROUP BY clause or be used in an aggregate function
+LINE 1: EXPLAIN (COSTS OFF) SELECT a, SUM(b) + c FROM t1 GROUP BY AL...
+                                               ^
+-- subquery
+EXPLAIN (COSTS OFF) SELECT (SELECT a FROM t1 LIMIT 1), SUM(b) FROM t1 GROUP BY ALL;
+            QUERY PLAN
+-----------------------------------
+ GroupAggregate
+   InitPlan 1
+     ->  Limit
+           ->  Seq Scan on t1 t1_1
+   ->  Seq Scan on t1
+(5 rows)
+
+-- cte
+EXPLAIN (COSTS OFF) WITH a_query AS (SELECT a, SUM(b) AS sum_b FROM t1 GROUP BY ALL)
+SELECT AVG(a), sum_b  FROM a_query GROUP BY ALL;
+         QUERY PLAN
+----------------------------
+ HashAggregate
+   Group Key: sum(t1.b)
+   ->  HashAggregate
+         Group Key: t1.a
+         ->  Seq Scan on t1
+(5 rows)
+
+-- verify deparse
+CREATE VIEW v1 AS SELECT b, COUNT(*) FROM t1 GROUP BY ALL;
+NOTICE:  view "v1" will be a temporary view
+SELECT pg_get_viewdef('v1'::regclass);
+    pg_get_viewdef
+-----------------------
+  SELECT b,           +
+     count(*) AS count+
+    FROM t1           +
+   GROUP BY ALL;
+(1 row)
+
+DROP VIEW v1;
+DROP TABLE t1;
+--
 -- Test GROUP BY matching of join columns that are type-coerced due to USING
 --
 create temp table t1(f1 int, f2 int);
diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
index 62540b1ffa4..6b2516cd9b1 100644
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -549,6 +549,68 @@ drop table t2;
 drop table t3;
 drop table p_t1;

+
+--
+-- Test GROUP BY ALL
+--
+
+-- We don't care about the data here, just the proper transformation of the
+-- GROUP BY clause, so test some queries and verify the EXPLAIN plans.
+
+CREATE TEMP TABLE t1 (
+  a int,
+  b int,
+  c int
+);
+
+-- basic field check
+EXPLAIN (COSTS OFF) SELECT b, COUNT(*) FROM t1 GROUP BY ALL;
+
+-- throw a null in the values too
+EXPLAIN (COSTS OFF) SELECT a, COUNT(a) FROM t1 GROUP BY ALL;
+
+-- multiple columns, non-consecutive order
+EXPLAIN (COSTS OFF) SELECT a, SUM(b), b FROM t1 GROUP BY ALL;
+
+-- multi columns, no aggregate
+EXPLAIN (COSTS OFF) SELECT a + b FROM t1 GROUP BY ALL;
+
+-- non-top-level aggregate
+EXPLAIN (COSTS OFF) SELECT a, SUM(b) + 4 FROM t1 GROUP BY ALL;
+
+-- including grouped column
+EXPLAIN (COSTS OFF) SELECT a, SUM(b) + a FROM t1 GROUP BY ALL;
+
+-- oops all aggregates
+EXPLAIN (COSTS OFF) SELECT COUNT(a), SUM(b) FROM t1 GROUP BY ALL;
+
+-- empty column list
+EXPLAIN (COSTS OFF) SELECT FROM t1 GROUP BY ALL;
+
+-- filter
+EXPLAIN (COSTS OFF) SELECT a, COUNT(a) FILTER(WHERE b = 2) FROM t1 GROUP BY ALL;
+
+-- all cols
+EXPLAIN (COSTS OFF) SELECT * FROM t1 GROUP BY ALL;
+EXPLAIN (COSTS OFF) SELECT *, count(*) FROM t1 GROUP BY ALL;
+
+-- expression without including aggregate columns
+EXPLAIN (COSTS OFF) SELECT a, SUM(b) + c FROM t1 GROUP BY ALL;
+
+-- subquery
+EXPLAIN (COSTS OFF) SELECT (SELECT a FROM t1 LIMIT 1), SUM(b) FROM t1 GROUP BY ALL;
+
+-- cte
+EXPLAIN (COSTS OFF) WITH a_query AS (SELECT a, SUM(b) AS sum_b FROM t1 GROUP BY ALL)
+SELECT AVG(a), sum_b  FROM a_query GROUP BY ALL;
+
+-- verify deparse
+CREATE VIEW v1 AS SELECT b, COUNT(*) FROM t1 GROUP BY ALL;
+SELECT pg_get_viewdef('v1'::regclass);
+
+DROP VIEW v1;
+DROP TABLE t1;
+
 --
 -- Test GROUP BY matching of join columns that are type-coerced due to USING
 --

Re: [PATCH] GROUP BY ALL

От
Peter Eisentraut
Дата:
On 26.09.25 22:18, Tom Lane wrote:
> Got it, mostly.  There is an edge case, though: what if there are no
> candidate grouping items?  I see these test cases in David's patch:
> 
> +-- oops all aggregates
> +EXPLAIN (COSTS OFF) SELECT COUNT(a), SUM(b) FROM t1 GROUP BY ALL;
> +      QUERY PLAN
> +----------------------
> + Aggregate
> +   ->  Seq Scan on t1
> +(2 rows)
> +
> +-- empty column list
> +EXPLAIN (COSTS OFF) SELECT FROM t1 GROUP BY ALL;
> +   QUERY PLAN
> +----------------
> + Seq Scan on t1
> +(1 row)
> 
> That is, in such cases the patch behaves as if there were no GROUP BY
> clause at all, which seems kinda dubious.  Should this be an error,
> and if not what's it supposed to do?

These should resolve to GROUP BY ().

> Also, what about window functions in the tlist?

> (I didn't stop to figure out why this isn't giving the same error, but
> maybe it's an order-of-checks thing.)  In any case: should this give
> "window functions are not allowed in GROUP BY", or should the
> window-function-containing tlist item be silently skipped by GROUP BY
> ALL?  Trying to make it work is surely not the right answer.

Hmm, I don't know.  The syntactic transformation talks about select list 
elements that "do not directly contain an <aggregate function>", but 
that can also appear as part of <window function>, so the syntactic 
transformation might appear to apply only to some types of window 
functions, which doesn't make sense to me.

I don't know what a sensible behavior should be here.  Maybe in this 
first patch version just reject use of GROUP BY ALL if you find any 
window functions in the select list.




Re: [PATCH] GROUP BY ALL

От
Peter Eisentraut
Дата:
On 26.09.25 18:23, Tom Lane wrote:
> No, I think the correct behavior would have to be to descend into
> SubLinks to see if they contain any aggregates belonging to the
> outer query level.
> 
> However (looks around) we do already have that code.
> See contain_aggs_of_level.  (contain_agg_clause is essentially
> a simplified version that is okay to use in the planner because
> it's already gotten rid of sublinks.)
> 
> What mainly concerns me at this point is whether we've identified
> aggregate levels at the point in parsing where you want to run this.
> I have a bit of a worry that that might interact with grouping.
> Presumably the SQL committee thought about that, so it's probably
> soluble, but ...

The language used in the standard at the moment is the select list 
elements that "do not directly contain an <aggregate function>", where 
"directly contain" is a term of art that means "contains without an 
intervening instance of <subquery>, <within group specification>, or 
<set function specification> that is not an <ordered set function>".  So 
it means not to look into subqueries.

Note that in standard SQL, the GROUP BY clause can only contain plain 
column references, not expressions, so this question is kind of moot in 
that context, because the query would be invalid no matter whether you 
transform the GROUP BY ALL to group by the subquery or not.

For this first patch version, I suggest you reject the use of GROUP BY 
ALL if you find a subquery in the select list, unless you have an 
unambiguous better solution.

(It was discussed to relax this restriction, so this discussion is 
useful to collect some questions related to that.)




Re: [PATCH] GROUP BY ALL

От
Tom Lane
Дата:
Peter Eisentraut <peter@eisentraut.org> writes:
> The language used in the standard at the moment is the select list 
> elements that "do not directly contain an <aggregate function>", where 
> "directly contain" is a term of art that means "contains without an 
> intervening instance of <subquery>, <within group specification>, or 
> <set function specification> that is not an <ordered set function>".  So 
> it means not to look into subqueries.

TBH, that is obvious nonsense.  A subquery could contain an aggregate
function that we've already identified as being of the current query
level.  Putting such a construct into the GROUP BY list would create
an invalid query (cf. checkTargetlistEntrySQL92).  Similarly, putting
a window function into the GROUP BY list would create an invalid
query.

> Note that in standard SQL, the GROUP BY clause can only contain plain 
> column references, not expressions, so this question is kind of moot in 
> that context, because the query would be invalid no matter whether you 
> transform the GROUP BY ALL to group by the subquery or not.

So according to the standard, this:

    select a+b, count(*) from ... group by all;

would be invalid because a+b couldn't be written directly in
GROUP BY?  I can't see us rejecting that though, since we do
allow a+b in GROUP BY.

Seems like we're getting very little help from the standard as to
what this construct actually means.  I suggest that we ignore the
current draft as not having been thought through quite enough yet,
and make ALL skip any tlist entries that contain_aggs_of_level
zero or contain_windowfuncs.  If that means we're extending the
standard, so be it --- we've already extended GROUP BY quite a lot,
it seems.

            regards, tom lane



Re: [PATCH] GROUP BY ALL

От
Vik Fearing
Дата:
On 27/09/2025 18:03, Tom Lane wrote:
> So according to the standard, this:
>
>     select a+b, count(*) from ... group by all;
>
> would be invalid because a+b couldn't be written directly in
> GROUP BY?


Correct.


> I can't see us rejecting that though, since we do
> allow a+b in GROUP BY.


No, nor do I.  Also, there were rumors about adding expressions to group 
by in the standard but no formal proposal yet.


FWIW, I opposed adding this GROUP BY ALL feature, but I was 
outnumbered.  I hope to never see it in production.

-- 

Vik Fearing




Re: [PATCH] GROUP BY ALL

От
Tom Lane
Дата:
Here's a v6 that's rebased up to HEAD and contains fixes for the
semantic issues we discussed.  It still lacks documentation, but
otherwise I think it's about ready to go.

            regards, tom lane

From 5503815351b2e45661262fddaab1271a435dd730 Mon Sep 17 00:00:00 2001
From: Tom Lane <tgl@sss.pgh.pa.us>
Date: Sat, 27 Sep 2025 18:19:16 -0400
Subject: [PATCH v6] Add GROUP BY ALL.

GROUP BY ALL is a form of GROUP BY that adds any TargetExpr that does
not contain an aggregate or window function into the groupClause of
the query, making it exactly equivalent to specifying those same
expressions in an explicit GROUP BY list.

This feature is useful for certain kinds of data exploration.  It's
already present in some other DBMSes, and the SQL committee recently
accepted it into the standard, so we can be reasonably confident in
the syntax being stable.  We do have to invent part of the semantics,
as the standard doesn't allow for expressions in GROUP BY, so they
haven't specified what to do with window functions.  We assume that
those should be treated like aggregates, i.e., left out of the
constructed GROUP BY list.

Author: David Christensen <david@pgguru.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAHM0NXjz0kDwtzoe-fnHAqPB1qA8_VJN0XAmCgUZ+iPnvP5LbA@mail.gmail.com
---
 src/backend/parser/analyze.c             |   2 +
 src/backend/parser/gram.y                |  14 +++
 src/backend/parser/parse_clause.c        |  65 ++++++++++++-
 src/backend/utils/adt/ruleutils.c        |   4 +-
 src/include/nodes/parsenodes.h           |   4 +-
 src/include/parser/parse_clause.h        |   1 +
 src/test/regress/expected/aggregates.out | 113 +++++++++++++++++++++++
 src/test/regress/sql/aggregates.sql      |  50 ++++++++++
 8 files changed, 250 insertions(+), 3 deletions(-)

diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 5aeb54eb5f6..99ba043961b 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -1467,12 +1467,14 @@ transformSelectStmt(ParseState *pstate, SelectStmt *stmt,

     qry->groupClause = transformGroupClause(pstate,
                                             stmt->groupClause,
+                                            stmt->groupAll,
                                             &qry->groupingSets,
                                             &qry->targetList,
                                             qry->sortClause,
                                             EXPR_KIND_GROUP_BY,
                                             false /* allow SQL92 rules */ );
     qry->groupDistinct = stmt->groupDistinct;
+    qry->groupAll = stmt->groupAll;

     if (stmt->distinctClause == NIL)
     {
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 9fd48acb1f8..8a0aa1899e2 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -120,6 +120,7 @@ typedef struct SelectLimit
 typedef struct GroupClause
 {
     bool        distinct;
+    bool        all;
     List       *list;
 } GroupClause;

@@ -12993,6 +12994,7 @@ simple_select:
                     n->whereClause = $6;
                     n->groupClause = ($7)->list;
                     n->groupDistinct = ($7)->distinct;
+                    n->groupAll = ($7)->all;
                     n->havingClause = $8;
                     n->windowClause = $9;
                     $$ = (Node *) n;
@@ -13010,6 +13012,7 @@ simple_select:
                     n->whereClause = $6;
                     n->groupClause = ($7)->list;
                     n->groupDistinct = ($7)->distinct;
+                    n->groupAll = ($7)->all;
                     n->havingClause = $8;
                     n->windowClause = $9;
                     $$ = (Node *) n;
@@ -13507,14 +13510,24 @@ group_clause:
                     GroupClause *n = (GroupClause *) palloc(sizeof(GroupClause));

                     n->distinct = $3 == SET_QUANTIFIER_DISTINCT;
+                    n->all = false;
                     n->list = $4;
                     $$ = n;
                 }
+            | GROUP_P BY ALL
+                {
+                    GroupClause *n = (GroupClause *) palloc(sizeof(GroupClause));
+                    n->distinct = false;
+                    n->all = true;
+                    n->list = NIL;
+                    $$ = n;
+                }
             | /*EMPTY*/
                 {
                     GroupClause *n = (GroupClause *) palloc(sizeof(GroupClause));

                     n->distinct = false;
+                    n->all = false;
                     n->list = NIL;
                     $$ = n;
                 }
@@ -17618,6 +17631,7 @@ PLpgSQL_Expr: opt_distinct_clause opt_target_list
                     n->whereClause = $4;
                     n->groupClause = ($5)->list;
                     n->groupDistinct = ($5)->distinct;
+                    n->groupAll = ($5)->all;
                     n->havingClause = $6;
                     n->windowClause = $7;
                     n->sortClause = $8;
diff --git a/src/backend/parser/parse_clause.c b/src/backend/parser/parse_clause.c
index 9f20a70ce13..ca26f6f61f2 100644
--- a/src/backend/parser/parse_clause.c
+++ b/src/backend/parser/parse_clause.c
@@ -2598,6 +2598,9 @@ transformGroupingSet(List **flatresult,
  * GROUP BY items will be added to the targetlist (as resjunk columns)
  * if not already present, so the targetlist must be passed by reference.
  *
+ * If GROUP BY ALL is specified, the groupClause will be inferred to be all
+ * non-aggregate, non-window expressions in the targetlist.
+ *
  * This is also used for window PARTITION BY clauses (which act almost the
  * same, but are always interpreted per SQL99 rules).
  *
@@ -2622,6 +2625,7 @@ transformGroupingSet(List **flatresult,
  *
  * pstate        ParseState
  * grouplist    clause to transform
+ * groupByAll    is this a GROUP BY ALL statement?
  * groupingSets reference to list to contain the grouping set tree
  * targetlist    reference to TargetEntry list
  * sortClause    ORDER BY clause (SortGroupClause nodes)
@@ -2629,7 +2633,8 @@ transformGroupingSet(List **flatresult,
  * useSQL99        SQL99 rather than SQL92 syntax
  */
 List *
-transformGroupClause(ParseState *pstate, List *grouplist, List **groupingSets,
+transformGroupClause(ParseState *pstate, List *grouplist, bool groupByAll,
+                     List **groupingSets,
                      List **targetlist, List *sortClause,
                      ParseExprKind exprKind, bool useSQL99)
 {
@@ -2640,6 +2645,63 @@ transformGroupClause(ParseState *pstate, List *grouplist, List **groupingSets,
     bool        hasGroupingSets = false;
     Bitmapset  *seen_local = NULL;

+    /* Handle GROUP BY ALL */
+    if (groupByAll)
+    {
+        /* There cannot have been any explicit grouplist items */
+        Assert(grouplist == NIL);
+
+        /* Iterate over targets, adding acceptable ones to the result list */
+        foreach_ptr(TargetEntry, tle, *targetlist)
+        {
+            /* Ignore junk TLEs */
+            if (tle->resjunk)
+                continue;
+
+            /*
+             * TLEs containing aggregates are not okay to add to GROUP BY
+             * (compare checkTargetlistEntrySQL92).  But the SQL standard
+             * directs us to skip them, so it's fine.
+             */
+            if (pstate->p_hasAggs &&
+                contain_aggs_of_level((Node *) tle->expr, 0))
+                continue;
+
+            /*
+             * Likewise, TLEs containing window functions are not okay to add
+             * to GROUP BY.  At this writing, the SQL standard is silent on
+             * what to do with them, but by analogy to aggregates we'll just
+             * skip them.
+             */
+            if (pstate->p_hasWindowFuncs &&
+                contain_windowfuncs((Node *) tle->expr))
+                continue;
+
+            /*
+             * Otherwise, add the TLE to the result using default sort/group
+             * semantics.  We specify the parse location as the TLE's
+             * location, despite the comment for addTargetToGroupList
+             * discouraging that.  The only other thing we could point to is
+             * the ALL keyword, which seems unhelpful when there are multiple
+             * TLEs.
+             */
+            result = addTargetToGroupList(pstate, tle,
+                                          result, *targetlist,
+                                          exprLocation((Node *) tle->expr));
+        }
+
+        /* If we found any acceptable targets, we're done */
+        if (result != NIL)
+            return result;
+
+        /*
+         * Otherwise, the SQL standard says to treat it like "GROUP BY ()".
+         * Build a representation of that, and let the rest of this function
+         * handle it.
+         */
+        grouplist = list_make1(makeGroupingSet(GROUPING_SET_EMPTY, NIL, -1));
+    }
+
     /*
      * Recursively flatten implicit RowExprs. (Technically this is only needed
      * for GROUP BY, per the syntax rules for grouping sets, but we do it
@@ -2818,6 +2880,7 @@ transformWindowDefinitions(ParseState *pstate,
                                           true /* force SQL99 rules */ );
         partitionClause = transformGroupClause(pstate,
                                                windef->partitionClause,
+                                               false /* not GROUP BY ALL */ ,
                                                NULL,
                                                targetlist,
                                                orderClause,
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index defcdaa8b34..d0c6d3b55bd 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -6186,7 +6186,9 @@ get_basic_select_query(Query *query, deparse_context *context)
         save_ingroupby = context->inGroupBy;
         context->inGroupBy = true;

-        if (query->groupingSets == NIL)
+        if (query->groupAll)
+            appendStringInfoString(buf, "ALL");
+        else if (query->groupingSets == NIL)
         {
             sep = "";
             foreach(l, query->groupClause)
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index f1706df58fd..aa73f7d76dd 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -214,7 +214,8 @@ typedef struct Query
     List       *returningList;    /* return-values list (of TargetEntry) */

     List       *groupClause;    /* a list of SortGroupClause's */
-    bool        groupDistinct;    /* is the group by clause distinct? */
+    bool        groupDistinct;    /* was GROUP BY DISTINCT used? */
+    bool        groupAll;        /* was GROUP BY ALL used? */

     List       *groupingSets;    /* a list of GroupingSet's if present */

@@ -2192,6 +2193,7 @@ typedef struct SelectStmt
     Node       *whereClause;    /* WHERE qualification */
     List       *groupClause;    /* GROUP BY clauses */
     bool        groupDistinct;    /* Is this GROUP BY DISTINCT? */
+    bool        groupAll;        /* Is this GROUP BY ALL? */
     Node       *havingClause;    /* HAVING conditional-expression */
     List       *windowClause;    /* WINDOW window_name AS (...), ... */

diff --git a/src/include/parser/parse_clause.h b/src/include/parser/parse_clause.h
index 3e9894926de..ede3903d1dd 100644
--- a/src/include/parser/parse_clause.h
+++ b/src/include/parser/parse_clause.h
@@ -26,6 +26,7 @@ extern Node *transformLimitClause(ParseState *pstate, Node *clause,
                                   ParseExprKind exprKind, const char *constructName,
                                   LimitOption limitOption);
 extern List *transformGroupClause(ParseState *pstate, List *grouplist,
+                                  bool groupByAll,
                                   List **groupingSets,
                                   List **targetlist, List *sortClause,
                                   ParseExprKind exprKind, bool useSQL99);
diff --git a/src/test/regress/expected/aggregates.out b/src/test/regress/expected/aggregates.out
index 1f24f6ffd1f..2cdb7562846 100644
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -1557,6 +1557,119 @@ drop table t2;
 drop table t3;
 drop table p_t1;
 --
+-- Test GROUP BY ALL
+--
+-- We don't care about the data here, just the proper transformation of the
+-- GROUP BY clause, so test some queries and verify the EXPLAIN plans.
+--
+CREATE TEMP TABLE t1 (
+  a int,
+  b int,
+  c int
+);
+-- basic example
+EXPLAIN (COSTS OFF) SELECT b, COUNT(*) FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ HashAggregate
+   Group Key: b
+   ->  Seq Scan on t1
+(3 rows)
+
+-- multiple columns, non-consecutive order
+EXPLAIN (COSTS OFF) SELECT a, SUM(b), b FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ HashAggregate
+   Group Key: a, b
+   ->  Seq Scan on t1
+(3 rows)
+
+-- multi columns, no aggregate
+EXPLAIN (COSTS OFF) SELECT a + b FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ HashAggregate
+   Group Key: (a + b)
+   ->  Seq Scan on t1
+(3 rows)
+
+-- check we detect a non-top-level aggregate
+EXPLAIN (COSTS OFF) SELECT a, SUM(b) + 4 FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ HashAggregate
+   Group Key: a
+   ->  Seq Scan on t1
+(3 rows)
+
+-- including grouped column is okay
+EXPLAIN (COSTS OFF) SELECT a, SUM(b) + a FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ HashAggregate
+   Group Key: a
+   ->  Seq Scan on t1
+(3 rows)
+
+-- including non-grouped column, not so much
+EXPLAIN (COSTS OFF) SELECT a, SUM(b) + c FROM t1 GROUP BY ALL;
+ERROR:  column "t1.c" must appear in the GROUP BY clause or be used in an aggregate function
+LINE 1: EXPLAIN (COSTS OFF) SELECT a, SUM(b) + c FROM t1 GROUP BY AL...
+                                               ^
+-- all aggregates, should reduce to GROUP BY ()
+EXPLAIN (COSTS OFF) SELECT COUNT(a), SUM(b) FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ Aggregate
+   Group Key: ()
+   ->  Seq Scan on t1
+(3 rows)
+
+-- likewise with empty target list
+EXPLAIN (COSTS OFF) SELECT FROM t1 GROUP BY ALL;
+      QUERY PLAN
+-----------------------
+ Result
+   Replaces: Aggregate
+(2 rows)
+
+-- window functions are not to be included in GROUP BY, either
+EXPLAIN (COSTS OFF) SELECT a, COUNT(a) OVER (PARTITION BY a) FROM t1 GROUP BY ALL;
+            QUERY PLAN
+----------------------------------
+ WindowAgg
+   Window: w1 AS (PARTITION BY a)
+   ->  Sort
+         Sort Key: a
+         ->  HashAggregate
+               Group Key: a
+               ->  Seq Scan on t1
+(7 rows)
+
+-- all cols
+EXPLAIN (COSTS OFF) SELECT *, count(*) FROM t1 GROUP BY ALL;
+      QUERY PLAN
+----------------------
+ HashAggregate
+   Group Key: a, b, c
+   ->  Seq Scan on t1
+(3 rows)
+
+-- verify deparsing of GROUP BY ALL
+CREATE TEMP VIEW v1 AS SELECT b, COUNT(*) FROM t1 GROUP BY ALL;
+SELECT pg_get_viewdef('v1'::regclass);
+    pg_get_viewdef
+-----------------------
+  SELECT b,           +
+     count(*) AS count+
+    FROM t1           +
+   GROUP BY ALL;
+(1 row)
+
+DROP VIEW v1;
+DROP TABLE t1;
+--
 -- Test GROUP BY matching of join columns that are type-coerced due to USING
 --
 create temp table t1(f1 int, f2 int);
diff --git a/src/test/regress/sql/aggregates.sql b/src/test/regress/sql/aggregates.sql
index 62540b1ffa4..ec41da493ad 100644
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -549,6 +549,56 @@ drop table t2;
 drop table t3;
 drop table p_t1;

+--
+-- Test GROUP BY ALL
+--
+-- We don't care about the data here, just the proper transformation of the
+-- GROUP BY clause, so test some queries and verify the EXPLAIN plans.
+--
+
+CREATE TEMP TABLE t1 (
+  a int,
+  b int,
+  c int
+);
+
+-- basic example
+EXPLAIN (COSTS OFF) SELECT b, COUNT(*) FROM t1 GROUP BY ALL;
+
+-- multiple columns, non-consecutive order
+EXPLAIN (COSTS OFF) SELECT a, SUM(b), b FROM t1 GROUP BY ALL;
+
+-- multi columns, no aggregate
+EXPLAIN (COSTS OFF) SELECT a + b FROM t1 GROUP BY ALL;
+
+-- check we detect a non-top-level aggregate
+EXPLAIN (COSTS OFF) SELECT a, SUM(b) + 4 FROM t1 GROUP BY ALL;
+
+-- including grouped column is okay
+EXPLAIN (COSTS OFF) SELECT a, SUM(b) + a FROM t1 GROUP BY ALL;
+
+-- including non-grouped column, not so much
+EXPLAIN (COSTS OFF) SELECT a, SUM(b) + c FROM t1 GROUP BY ALL;
+
+-- all aggregates, should reduce to GROUP BY ()
+EXPLAIN (COSTS OFF) SELECT COUNT(a), SUM(b) FROM t1 GROUP BY ALL;
+
+-- likewise with empty target list
+EXPLAIN (COSTS OFF) SELECT FROM t1 GROUP BY ALL;
+
+-- window functions are not to be included in GROUP BY, either
+EXPLAIN (COSTS OFF) SELECT a, COUNT(a) OVER (PARTITION BY a) FROM t1 GROUP BY ALL;
+
+-- all cols
+EXPLAIN (COSTS OFF) SELECT *, count(*) FROM t1 GROUP BY ALL;
+
+-- verify deparsing of GROUP BY ALL
+CREATE TEMP VIEW v1 AS SELECT b, COUNT(*) FROM t1 GROUP BY ALL;
+SELECT pg_get_viewdef('v1'::regclass);
+
+DROP VIEW v1;
+DROP TABLE t1;
+
 --
 -- Test GROUP BY matching of join columns that are type-coerced due to USING
 --
--
2.43.7


Re: [PATCH] GROUP BY ALL

От
Chao Li
Дата:


On Sep 28, 2025, at 06:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:

+ /* Iterate over targets, adding acceptable ones to the result list */
+ foreach_ptr(TargetEntry, tle, *targetlist)
+ {
+ /* Ignore junk TLEs */
+ if (tle->resjunk)
+ continue;

Do we want to specifically check “ctid”?

If a user does:

```
select ctid, col1, col2, ... from t group by all;
```

It would be equivalent to no group by. Combing “select ctid” with “group by all” seems totally useless, but when users do such things, we can optimize that.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/




Re: [PATCH] GROUP BY ALL

От
Tom Lane
Дата:
Chao Li <li.evan.chao@gmail.com> writes:
> Do we want to specifically check “ctid”?

No.  Complete waste of effort.

            regards, tom lane



Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:
On Sat, Sep 27, 2025 at 5:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Here's a v6 that's rebased up to HEAD and contains fixes for the
> semantic issues we discussed.  It still lacks documentation, but
> otherwise I think it's about ready to go.

Here is v7 with a stab at docs; fairly minimal at this point, but
touching the two areas that are likely to need adjusting.  When
adjusting the docs for sql-select, I noticed that the grammar also
supports `GROUP BY ALL <grouping_elements>`, so I also added a test to
ensure that this syntax is explicitly supported. (It seems like it
works as-is without further grammar adjustments, but I was a little
worried when I first saw that fact... :D)  Not sure that
aggregates.sql is still the right place for all of these bits, but it
does seem like having all things `GROUP BY ALL`-related tested in the
same place is a nice property, so leaving there for now.

David



Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:
On Sun, Sep 28, 2025 at 2:18 PM David Christensen <david@pgguru.net> wrote:
>
> On Sat, Sep 27, 2025 at 5:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > Here's a v6 that's rebased up to HEAD and contains fixes for the
> > semantic issues we discussed.  It still lacks documentation, but
> > otherwise I think it's about ready to go.
>
> Here is v7 with a stab at docs; fairly minimal at this point, but
> touching the two areas that are likely to need adjusting.  When
> adjusting the docs for sql-select, I noticed that the grammar also
> supports `GROUP BY ALL <grouping_elements>`, so I also added a test to
> ensure that this syntax is explicitly supported. (It seems like it
> works as-is without further grammar adjustments, but I was a little
> worried when I first saw that fact... :D)  Not sure that
> aggregates.sql is still the right place for all of these bits, but it
> does seem like having all things `GROUP BY ALL`-related tested in the
> same place is a nice property, so leaving there for now.

This time with attachment!

Вложения

Re: [PATCH] GROUP BY ALL

От
Tom Lane
Дата:
David Christensen <david@pgguru.net> writes:
>> Here is v7 with a stab at docs; fairly minimal at this point, but
>> touching the two areas that are likely to need adjusting.

I did some more word-smithing on the docs and pushed it.

>> When
>> adjusting the docs for sql-select, I noticed that the grammar also
>> supports `GROUP BY ALL <grouping_elements>`, so I also added a test to
>> ensure that this syntax is explicitly supported.

+1, can't hurt.

>> (It seems like it
>> works as-is without further grammar adjustments, but I was a little
>> worried when I first saw that fact... :D)

Bison would have been vocal about it if you'd introduced any
ambiguity.  Still, I didn't feel like looking around to see if we
already covered this syntax, and I agree it's close enough to being
an issue to be worth covering.

Thanks for the patch!

            regards, tom lane



Re: [PATCH] GROUP BY ALL

От
David Christensen
Дата:
> On Sep 29, 2025, at 3:58 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> David Christensen <david@pgguru.net> writes:
>>> Here is v7 with a stab at docs; fairly minimal at this point, but
>>> touching the two areas that are likely to need adjusting.
>
> I did some more word-smithing on the docs and pushed it.
>
>>> When
>>> adjusting the docs for sql-select, I noticed that the grammar also
>>> supports `GROUP BY ALL <grouping_elements>`, so I also added a test to
>>> ensure that this syntax is explicitly supported.
>
> +1, can't hurt.
>
>>> (It seems like it
>>> works as-is without further grammar adjustments, but I was a little
>>> worried when I first saw that fact... :D)
>
> Bison would have been vocal about it if you'd introduced any
> ambiguity.  Still, I didn't feel like looking around to see if we
> already covered this syntax, and I agree it's close enough to being
> an issue to be worth covering.
>
> Thanks for the patch!

Great, thank you!

David




Re: [PATCH] GROUP BY ALL

От
Chao Li
Дата:


On Sep 29, 2025, at 04:34, David Christensen <david@pgguru.net> wrote:

On Sun, Sep 28, 2025 at 2:18 PM David Christensen <david@pgguru.net> wrote:

On Sat, Sep 27, 2025 at 5:23 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Here's a v6 that's rebased up to HEAD and contains fixes for the
semantic issues we discussed.  It still lacks documentation, but
otherwise I think it's about ready to go.

Here is v7 with a stab at docs; fairly minimal at this point, but
touching the two areas that are likely to need adjusting.  When
adjusting the docs for sql-select, I noticed that the grammar also
supports `GROUP BY ALL <grouping_elements>`, so I also added a test to
ensure that this syntax is explicitly supported. (It seems like it
works as-is without further grammar adjustments, but I was a little
worried when I first saw that fact... :D)  Not sure that
aggregates.sql is still the right place for all of these bits, but it
does seem like having all things `GROUP BY ALL`-related tested in the
same place is a nice property, so leaving there for now.

This time with attachment!
<v7-0001-Add-GROUP-BY-ALL.patch>

A nit comment for the doc:

```
+    not contain either an aggregate function or a window function in their
+    expression list.  This can greatly simplify ad-hoc exploration of data.
+   </para>
```

“In their expression list” => “in the <literal>SELECT</literal> list"

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/