Обсуждение: group by can use alias from select list
The following documentation comment has been logged on the website: Page: https://www.postgresql.org/docs/15/sql-select.html Description: hi i have a discussion in pgsql-bugs@lists.postgresql.org about this and it is mentioned in the answers that it is documented and a "feature" of Postgres that this can be done. If this is wanted the documentation (https://www.postgresql.org/docs/current/sql-select.html) that mentioned: " Description SELECT retrieves rows from zero or more tables. The general processing of SELECT is as follows: 1.) All queries in the WITH list are computed. These effectively serve as temporary tables that can be referenced in the FROM list. A WITH query that is referenced more than once in FROM is computed only once, unless specified otherwise with NOT MATERIALIZED. (See WITH Clause below.) 2.) All elements in the FROM list are computed. (Each element in the FROM list is a real or virtual table.) If more than one element is specified in the FROM list, they are cross-joined together. (See FROM Clause below.) 3.) If the WHERE clause is specified, all rows that do not satisfy the condition are eliminated from the output. (See WHERE Clause below.) 4.) If the GROUP BY clause is specified, or if there are aggregate function calls, the output is combined into groups of rows that match on one or more values, and the results of aggregate functions are computed. If the HAVING clause is present, it eliminates groups that do not satisfy the given condition. (See GROUP BY Clause and HAVING Clause below.) 5.) The actual output rows are computed using the SELECT output expressions for each selected row or row group. (See SELECT List below.) " isn't correct because how can 4.) be done and the alias from 5.) is used? Here is a hint important that there is an exception for alias used in group by the SQL for that: with tbase(id) as (select 1 union all select 2) select id otto from tbase group by otto order by otto what do you think about that? hape
On Fri, 2023-07-07 at 07:36 +0000, PG Doc comments form wrote: > i have a discussion in pgsql-bugs@lists.postgresql.org about this and it is > mentioned in the answers that it is documented and a "feature" of Postgres > that this can be done. > If this is wanted the documentation > (https://www.postgresql.org/docs/current/sql-select.html) that mentioned: > [execution order of SELECT that says that GROUP BY is before SELECT] > isn't correct because how can 4.) be done and the alias from 5.) is used? > Here is a hint important that there is an exception for alias used in group > by I think that is already documented: > An expression used inside a grouping_element can be an input column name, > or the name or ordinal number of an output column (SELECT list item), > or an arbitrary expression formed from input-column values. An alias in this case would be an output column. Perhaps we can mention the alias explicitly. Yours, Laurenz Albe
On Tue, Jul 11, 2023 at 7:19 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Fri, 2023-07-07 at 07:36 +0000, PG Doc comments form wrote:
> i have a discussion in pgsql-bugs@lists.postgresql.org about this and it is
> mentioned in the answers that it is documented and a "feature" of Postgres
> that this can be done.
> If this is wanted the documentation
> (https://www.postgresql.org/docs/current/sql-select.html) that mentioned:
> [execution order of SELECT that says that GROUP BY is before SELECT]
> isn't correct because how can 4.) be done and the alias from 5.) is used?
> Here is a hint important that there is an exception for alias used in group
> by
I think that is already documented:
I think the complaint is that someone seeing the behavior in the wild comes to this order-of-operations and doesn't see that the observed behavior is documented. Sure, they can go into the GROUP BY section and figure out that there is a "oh, by the way" comment within there that output columns/aliases are indeed allowed. But I tend to agree that a mention in the "order of operations" section that output columns from step 5 can be seen in step 4 seems like an improvement if we really want the order of operations to be the main reference entry point for people trying to work out query behavior. Something like: "(while unadvised it is possible for the aliases defined in the next step to be used here as well)".
David J.
On Tue, Jul 11, 2023 at 07:31:26AM -0700, David G. Johnston wrote: > On Tue, Jul 11, 2023 at 7:19 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > On Fri, 2023-07-07 at 07:36 +0000, PG Doc comments form wrote: > > i have a discussion in pgsql-bugs@lists.postgresql.org about this and it > is > > mentioned in the answers that it is documented and a "feature" of > Postgres > > that this can be done. > > If this is wanted the documentation > > (https://www.postgresql.org/docs/current/sql-select.html) that mentioned: > > [execution order of SELECT that says that GROUP BY is before SELECT] > > isn't correct because how can 4.) be done and the alias from 5.) is > used? > > Here is a hint important that there is an exception for alias used in > group > > by > > I think that is already documented: > > > > I think the complaint is that someone seeing the behavior in the wild comes to > this order-of-operations and doesn't see that the observed behavior is > documented. Sure, they can go into the GROUP BY section and figure out that > there is a "oh, by the way" comment within there that output columns/aliases > are indeed allowed. But I tend to agree that a mention in the "order of > operations" section that output columns from step 5 can be seen in step 4 seems > like an improvement if we really want the order of operations to be the main > reference entry point for people trying to work out query behavior. Something > like: "(while unadvised it is possible for the aliases defined in the next > step to be used here as well)". How is the attached patch? -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
Вложения
Bruce Momjian <bruce@momjian.us> writes: > On Tue, Jul 11, 2023 at 07:31:26AM -0700, David G. Johnston wrote: >> I think the complaint is that someone seeing the behavior in the wild comes to >> this order-of-operations and doesn't see that the observed behavior is >> documented. > How is the attached patch? Maybe better "Although query output columns are nominally computed in the next step, they can also be referenced (by name or by ordinal number) as <literal>GROUP BY</literal> elements". You could go further and add "Such references cannot be parts of <literal>GROUP BY</literal> expressions, however." Not sure if we cover that explicitly anywhere else. regards, tom lane
On Wed, Sep 6, 2023 at 08:01:30PM -0400, Tom Lane wrote: > Bruce Momjian <bruce@momjian.us> writes: > > On Tue, Jul 11, 2023 at 07:31:26AM -0700, David G. Johnston wrote: > >> I think the complaint is that someone seeing the behavior in the wild comes to > >> this order-of-operations and doesn't see that the observed behavior is > >> documented. > > > How is the attached patch? > > Maybe better "Although query output columns are nominally computed in the > next step, they can also be referenced (by name or by ordinal number) > as <literal>GROUP BY</literal> elements". I modified the text slightly in the attached patch. > You could go further and add "Such references cannot be parts of > <literal>GROUP BY</literal> expressions, however." Not sure if we > cover that explicitly anywhere else. We do farther down the docs in the GROUP BY section: https://www.postgresql.org/docs/devel/sql-select.html GROUP BY will condense into a single row all selected rows that share the same values for the grouped expressions. An expression used inside a grouping_element can be an input column name, or the name or ordinal number of an output column (SELECT list item), or an arbitrary expression formed from input-column values. In case of ambiguity, a ------------------------ GROUP BY name will be interpreted as an input-column name rather than an output column name. It says * input column name * output column name or ordinal number * expression formed from input-column values Is this too subtle? -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.
Вложения
On Thu, Sep 7, 2023 at 10:11:00AM -0400, Bruce Momjian wrote: > We do farther down the docs in the GROUP BY section: > > https://www.postgresql.org/docs/devel/sql-select.html > > GROUP BY will condense into a single row all selected rows that share > the same values for the grouped expressions. An expression used inside a > grouping_element can be an input column name, or the name or ordinal > number of an output column (SELECT list item), or an arbitrary > expression formed from input-column values. In case of ambiguity, a > ------------------------ > GROUP BY name will be interpreted as an input-column name rather than an > output column name. > > It says > > * input column name > * output column name or ordinal number > * expression formed from input-column values > > Is this too subtle? Patch applied back to PG 11. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com Only you can decide what is important to you.