Обсуждение: Document aggregate functions better w.r.t. ORDER BY

Поиск
Список
Период
Сортировка

Document aggregate functions better w.r.t. ORDER BY

От
"David G. Johnston"
Дата:
All,

The recent discussion surrounding aggregates and ORDER BY moved me to look over our existing documentation, especially now that we've reworked the function tables, to see what improvements can be had by simply documenting those functions where ORDER BY may change the user-visible output.  I skipped range aggregates for the moment but handled the others on the aggregates page (not window functions).  This includes the float types for sum and avg.

I added a note just before the table linking back to the syntax chapter and describing the newly added rules and syntax choice in the table.

The nuances of floating point math suggest to me that specifying order by for those is in some kind of gray area and so I've marked it optional...any suggestions for wording (or an xref) to explain those nuances or should it just be shown non-optional like the others?  Or not shown at all?

The novelty of my examples is up for bikeshedding.  I didn't want anything too long so a subquery didn't make sense, and I was trying to avoid duplication as well as multiple lines - hence creating a CTE that can be copied onto all of the example queries to produce the noted result.

I added a DISTINCT example to array_agg because it is the first aggregate on the page and so hopefully will be seen during a cursory reading.  Plus, array_agg is the go-to function for doing this kind of experimentation.

David J.

The patch is attached.  A screenshot exemplifying the changes is copied inline and attached.

image.png
Вложения

Re: Document aggregate functions better w.r.t. ORDER BY

От
Bruce Momjian
Дата:
On Tue, Dec 13, 2022 at 07:38:15PM -0700, David G. Johnston wrote:
> All,
> 
> The recent discussion surrounding aggregates and ORDER BY moved me to look over
> our existing documentation, especially now that we've reworked the function
> tables, to see what improvements can be had by simply documenting those
> functions where ORDER BY may change the user-visible output.  I skipped range
> aggregates for the moment but handled the others on the aggregates page (not
> window functions).  This includes the float types for sum and avg.
> 
> I added a note just before the table linking back to the syntax chapter and
> describing the newly added rules and syntax choice in the table.
> 
> The nuances of floating point math suggest to me that specifying order by for
> those is in some kind of gray area and so I've marked it optional...any
> suggestions for wording (or an xref) to explain those nuances or should it just
> be shown non-optional like the others?  Or not shown at all?
> 
> The novelty of my examples is up for bikeshedding.  I didn't want anything too
> long so a subquery didn't make sense, and I was trying to avoid duplication as
> well as multiple lines - hence creating a CTE that can be copied onto all of
> the example queries to produce the noted result.
> 
> I added a DISTINCT example to array_agg because it is the first aggregate on
> the page and so hopefully will be seen during a cursory reading.  Plus,
> array_agg is the go-to function for doing this kind of experimentation.

I like this idea, though the examples seemed too detailed so I skipped
them.  Here is the trimmed-down patch I would like to apply.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Вложения

Re: Document aggregate functions better w.r.t. ORDER BY

От
"David G. Johnston"
Дата:
On Tue, Oct 24, 2023 at 1:39 PM Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Dec 13, 2022 at 07:38:15PM -0700, David G. Johnston wrote:
> All,
>
> The recent discussion surrounding aggregates and ORDER BY moved me to look over
> our existing documentation, especially now that we've reworked the function
> tables, to see what improvements can be had by simply documenting those
> functions where ORDER BY may change the user-visible output.  I skipped range
> aggregates for the moment but handled the others on the aggregates page (not
> window functions).  This includes the float types for sum and avg.
>
> I added a note just before the table linking back to the syntax chapter and
> describing the newly added rules and syntax choice in the table.
>
> The nuances of floating point math suggest to me that specifying order by for
> those is in some kind of gray area and so I've marked it optional...any
> suggestions for wording (or an xref) to explain those nuances or should it just
> be shown non-optional like the others?  Or not shown at all?
>
> The novelty of my examples is up for bikeshedding.  I didn't want anything too
> long so a subquery didn't make sense, and I was trying to avoid duplication as
> well as multiple lines - hence creating a CTE that can be copied onto all of
> the example queries to produce the noted result.
>
> I added a DISTINCT example to array_agg because it is the first aggregate on
> the page and so hopefully will be seen during a cursory reading.  Plus,
> array_agg is the go-to function for doing this kind of experimentation.

I like this idea, though the examples seemed too detailed so I skipped
them.  Here is the trimmed-down patch I would like to apply.


I'd prefer to keep pointing out that the ones documented are those whose outputs will vary due to ordering.

I've been sympathetic to the user comments that we don't have enough examples.  Just using array_agg for that purpose, showing both DISTINCT and ORDER BY seems like a fair compromise (removes two from my original proposal).  The examples in the section we tell them to go see aren't of that great quality.  If you strongly dislike having the function table contain the examples we should at least improve the page we are sending them to.  (As an aside to this, I've personally always found the syntax block with the 5 syntaxes shown there to be intimidating/hard-to-read).

I'd at least suggest you reconsider the commentary and examples surrounding jsonb_object_agg.

The same goes for the special knowledge of floating point behavior for why we've chosen to document avg/sum, something that typically doesn't care about order, as having an optional order by.

David J.

Re: Document aggregate functions better w.r.t. ORDER BY

От
Bruce Momjian
Дата:
On Tue, Oct 24, 2023 at 06:45:48PM -0700, David G. Johnston wrote:
> I'd prefer to keep pointing out that the ones documented are those whose
> outputs will vary due to ordering.

Okay, I re-added it in the attached patch, and tightened up the text.

> I've been sympathetic to the user comments that we don't have enough examples. 

Good point.

> Just using array_agg for that purpose, showing both DISTINCT and ORDER BY seems
> like a fair compromise (removes two from my original proposal).  The examples
> in the section we tell them to go see aren't of that great quality.  If you
> strongly dislike having the function table contain the examples we should at
> least improve the page we are sending them to.  (As an aside to this, I've
> personally always found the syntax block with the 5 syntaxes shown there to be
> intimidating/hard-to-read).

I think you are right that it belongs in the syntax section;  we cover
ordering extensively there.  We already have queries there, but not
output, so I moved the relevant examples to there and replaced the
example that had no output.

> I'd at least suggest you reconsider the commentary and examples surrounding
> jsonb_object_agg.

I moved that as well, and tightened the example.

> The same goes for the special knowledge of floating point behavior for why
> we've chosen to document avg/sum, something that typically doesn't care about
> order, as having an optional order by.

The floating example seems too obscure to mention in our function docs. 
I can put a sentence in the syntax docs, but is there value in
explaining that to users?  How it that helpful?  Example?

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Вложения

Re: Document aggregate functions better w.r.t. ORDER BY

От
"David G. Johnston"
Дата:
On Wed, Oct 25, 2023 at 8:36 AM Bruce Momjian <bruce@momjian.us> wrote:
On Tue, Oct 24, 2023 at 06:45:48PM -0700, David G. Johnston wrote:
> I'd prefer to keep pointing out that the ones documented are those whose
> outputs will vary due to ordering.

Okay, I re-added it in the attached patch, and tightened up the text.

Thanks


I think you are right that it belongs in the syntax section;  we cover
ordering extensively there.  We already have queries there, but not
output, so I moved the relevant examples to there and replaced the
example that had no output.

Thanks
 
> The same goes for the special knowledge of floating point behavior for why
> we've chosen to document avg/sum, something that typically doesn't care about
> order, as having an optional order by.

The floating example seems too obscure to mention in our function docs.
I can put a sentence in the syntax docs, but is there value in
explaining that to users?  How it that helpful?  Example?


Yeah, we punt on the entire concept in the data type section:

"Managing these errors and how they propagate through calculations is the subject of an entire branch of mathematics and computer science and will not be discussed here," ...

Also, I'm now led to believe that the relevant IEEE 754 floating point addition is indeed commutative.  Given that, I am inclined to simply not add the order by clause at all to those four functions. (actually, you already got rid of the avg()s but the sum()s are still present, so just those two).

David J.

Re: Document aggregate functions better w.r.t. ORDER BY

От
Bruce Momjian
Дата:
On Wed, Oct 25, 2023 at 04:14:11PM -0700, David G. Johnston wrote:
> Yeah, we punt on the entire concept in the data type section:
> 
> "Managing these errors and how they propagate through calculations is the
> subject of an entire branch of mathematics and computer science and will not be
> discussed here," ...
> 
> Also, I'm now led to believe that the relevant IEEE 754 floating point addition
> is indeed commutative.  Given that, I am inclined to simply not add the order
> by clause at all to those four functions. (actually, you already got rid of the
> avg()s but the sum()s are still present, so just those two).

Ah, yes, sum() removed.  Updated patch attached.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Вложения

Re: Document aggregate functions better w.r.t. ORDER BY

От
"David G. Johnston"
Дата:
On Wed, Oct 25, 2023 at 4:22 PM Bruce Momjian <bruce@momjian.us> wrote:
On Wed, Oct 25, 2023 at 04:14:11PM -0700, David G. Johnston wrote:
> Yeah, we punt on the entire concept in the data type section:
>
> "Managing these errors and how they propagate through calculations is the
> subject of an entire branch of mathematics and computer science and will not be
> discussed here," ...
>
> Also, I'm now led to believe that the relevant IEEE 754 floating point addition
> is indeed commutative.  Given that, I am inclined to simply not add the order
> by clause at all to those four functions. (actually, you already got rid of the
> avg()s but the sum()s are still present, so just those two).

Ah, yes, sum() removed.  Updated patch attached.


The paragraph leading into the last added example needs to be tweaked:

If DISTINCT is specified within an aggregate, the data is sorted in ascending order while extracting unique values.  You can add an ORDER BY clause, limited to expressions matching the regular arguments of the aggregate, to sort the output in descending order.

(show existing - DISTINCT only - example here)

<programlisting>
WITH vals (v) AS ( VALUES (1),(3),(4),(3),(2) )
SELECT string_agg(DISTINCT v::text, ';' ORDER BY v::text DESC) FROM vals;
 string_agg
-----------
  4;3;2;1
</programlisting>

(existing note)

Question: Do you know whether we for certain always sort ascending here to compute the unique values or whether if, say, there is an index on the column in descending order (or ascending and traversed backwards) that the data within the aggregate could, with an order by, be returned in descending order?  If it is ascending, is that part of the SQL Standard (since it doesn't even allow an order by to give the user the ability the control the output ordering) or does the SQL Standard expect that even a random order would be fine since there are algorithms that can be used that do not involve sorting the input?

It seems redundant to first say "regular arguments" then negate it in order to say "DISTINCT list".  Using the positive form with "DISTINCT list" should get the point across sufficiently and succinctly.  It also avoids me feeling like there should be an example of what happens when you do "sort on an expression that is not included in the DISTINCT list".

Interestingly:

WITH vals (v,l) AS ( VALUES (1,'Z'),(3,'D'),(4,'R'),(3,'A'),(2,'T') )
SELECT string_agg(DISTINCT l, ';' ORDER BY l, ';' DESC) FROM vals;

ERROR:  in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
LINE 2: SELECT string_agg(DISTINCT l, ';' ORDER BY l, ';' DESC) FROM...

But both expressions in the argument list (el and semicolon) do appear in the ORDER BY...

David J.

Re: Document aggregate functions better w.r.t. ORDER BY

От
Bruce Momjian
Дата:
On Wed, Oct 25, 2023 at 05:10:17PM -0700, David G. Johnston wrote:
> The paragraph leading into the last added example needs to be tweaked:
> 
> If DISTINCT is specified within an aggregate, the data is sorted in ascending
> order while extracting unique values.  You can add an ORDER BY clause, limited
> to expressions matching the regular arguments of the aggregate, to sort the
> output in descending order.
> 
> (show existing - DISTINCT only - example here)
> 
> <programlisting>
> WITH vals (v) AS ( VALUES (1),(3),(4),(3),(2) )
> SELECT string_agg(DISTINCT v::text, ';' ORDER BY v::text DESC) FROM vals;
>  string_agg
> -----------
>   4;3;2;1
> </programlisting>
> 
> (existing note)

I see what you mean.  I added an example that doesn't match the existing
paragraph.  I have rewritten the paragraph and used a relevant example;
patch attached.

> Question: Do you know whether we for certain always sort ascending here to
> compute the unique values or whether if, say, there is an index on the column
> in descending order (or ascending and traversed backwards) that the data within
> the aggregate could, with an order by, be returned in descending order?  If it
> is ascending, is that part of the SQL Standard (since it doesn't even allow an
> order by to give the user the ability the control the output ordering) or does
> the SQL Standard expect that even a random order would be fine since there are
> algorithms that can be used that do not involve sorting the input?

I don't think order is ever guaranteed in the standard without an ORDER
BY.

> It seems redundant to first say "regular arguments" then negate it in order to
> say "DISTINCT list".  Using the positive form with "DISTINCT list" should get
> the point across sufficiently and succinctly.  It also avoids me feeling like
> there should be an example of what happens when you do "sort on an expression
> that is not included in the DISTINCT list".

Agreed, I rewrote that.

> Interestingly:
> 
> WITH vals (v,l) AS ( VALUES (1,'Z'),(3,'D'),(4,'R'),(3,'A'),(2,'T') )
> SELECT string_agg(DISTINCT l, ';' ORDER BY l, ';' DESC) FROM vals;
> 
> ERROR:  in an aggregate with DISTINCT, ORDER BY expressions must appear in
> argument list
> LINE 2: SELECT string_agg(DISTINCT l, ';' ORDER BY l, ';' DESC) FROM...
> 
> But both expressions in the argument list (el and semicolon) do appear in the
> ORDER BY...

I think ORDER BY has to match DISTINCT columns, while you are using ';'.
I used a simpler example with array_agg() in my patch to avoid the issue.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Вложения

Re: Document aggregate functions better w.r.t. ORDER BY

От
David Rowley
Дата:
On Thu, 26 Oct 2023 at 13:10, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> Question: Do you know whether we for certain always sort ascending here to compute the unique values or whether if,
say,there is an index on the column in descending order (or ascending and traversed backwards) that the data within the
aggregatecould, with an order by, be returned in descending order? 

The way it's currently coded, we seem to always require ascending
order.  See addTargetToGroupList().  The call to
get_sort_group_operators() only requests the ltOpr.

A quick test creating an index on a column with DESC shows that we end
up doing a backwards index scan so that we get the requested ascending
order:

create table b (b text);
create index on b (b desc);
explain select string_agg(distinct b,',') from b;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Aggregate  (cost=67.95..67.97 rows=1 width=32)
   ->  Index Only Scan Backward using b_b_idx on b  (cost=0.15..64.55
rows=1360 width=32)
(2 rows)

However, I think we'd best stay clear of offering any guarantees in
the documents about this.  If we did that it would be much harder in
the future if we wanted to implement the DISTINCT aggregates by
hashing.

David



Re: Document aggregate functions better w.r.t. ORDER BY

От
"David G. Johnston"
Дата:
On Wed, Oct 25, 2023 at 7:13 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Thu, 26 Oct 2023 at 13:10, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> Question: Do you know whether we for certain always sort ascending here to compute the unique values or whether if, say, there is an index on the column in descending order (or ascending and traversed backwards) that the data within the aggregate could, with an order by, be returned in descending order?

The way it's currently coded, we seem to always require ascending
order.  See addTargetToGroupList().  The call to
get_sort_group_operators() only requests the ltOpr.

A quick test creating an index on a column with DESC shows that we end
up doing a backwards index scan so that we get the requested ascending
order:

create table b (b text);
create index on b (b desc);
explain select string_agg(distinct b,',') from b;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Aggregate  (cost=67.95..67.97 rows=1 width=32)
   ->  Index Only Scan Backward using b_b_idx on b  (cost=0.15..64.55
rows=1360 width=32)
(2 rows)

However, I think we'd best stay clear of offering any guarantees in
the documents about this.  If we did that it would be much harder in
the future if we wanted to implement the DISTINCT aggregates by
hashing.

So, I think we are mischaracterizing the Standard here, if only in the specific case of array_agg.

SQL Standard: 4.16.4

Every unary aggregate function takes an arbitrary <value expression> as the argument; most unary aggregate
functions can optionally be qualified with either DISTINCT or ALL.

If ARRAY_AGG is specified, then an array value with one element formed from the <value expression>
evaluated for each row that qualifies.

Neither DISTINCT nor ALL are allowed to be specified for VAR_POP, VAR_SAMP, STDDEV_POP, or
STDDEV_SAMP; redundant duplicates are not removed when computing these functions.

10.9

<array aggregate function> ::=
ARRAY_AGG
<left paren> <value expression> [ ORDER BY <sort specification list> ] <right paren>

I would reword the existing note to be something like:

The SQL Standard defines specific aggregates and their properties, including which of DISTINCT and/or ORDER BY is allowed.  Due to the extensible nature of PostgreSQL it accepts either or both clauses for any aggregate.

From the most recent patch:

    <para>
-    If <literal>DISTINCT</literal> is specified in addition to an
-    <replaceable>order_by_clause</replaceable>, then all the <literal>ORDER BY</literal>
-    expressions must match regular arguments of the aggregate; that is,
-    you cannot sort on an expression that is not included in the
-    <literal>DISTINCT</literal> list.
+    If <literal>DISTINCT</literal> is specified with an
+    <replaceable>order_by_clause</replaceable>, <literal>ORDER
+    BY</literal> expressions can only reference columns in the
+    <literal>DISTINCT</literal> list.  For example:
+<programlisting>
+WITH vals (v1, v2) AS ( VALUES (1,'Z'),(3,'D'),(4,'R'),(3,'A'),(2,'T') )
+SELECT array_agg(DISTINCT v2 ORDER BY v2 DESC) FROM vals;
+  array_agg
+-------------
+ {Z,T,R,D,A}
+</programlisting>

The change to a two-column vals was mostly to try and find corner-cases that might need to be addressed.  If we don't intend to show the error case of DISTINCT v1 ORDER BY v2 then we should go back to the original example and just add ORDER BY v DESC.  I'm fine with not using string_agg here.

+    For example:
+<programlisting>
+WITH vals (v) AS ( VALUES (1),(3),(4),(3),(2) )
+SELECT array_agg(DISTINCT v ORDER BY v DESC) FROM vals;
+ array_agg
+-----------
+ {4,3,2,1}
+</programlisting>

We get enough complaints regarding "apparent ordering" that I would like to add:

As a reminder, while some DISTINCT processing algorithms produce sorted output as a side-effect, only by specifying ORDER BY is the output order guaranteed.

David J.


Re: Document aggregate functions better w.r.t. ORDER BY

От
Bruce Momjian
Дата:
On Wed, Oct 25, 2023 at 10:34:10PM -0700, David G. Johnston wrote:
> I would reword the existing note to be something like:
> 
> The SQL Standard defines specific aggregates and their properties, including
> which of DISTINCT and/or ORDER BY is allowed.  Due to the extensible nature of
> PostgreSQL it accepts either or both clauses for any aggregate.

Uh, is this something in my patch or somewhere else?  I don't think
PostgreSQL extensible is an example of syntax flexibility.

> From the most recent patch:
> 
>     <para>
> -    If <literal>DISTINCT</literal> is specified in addition to an
> -    <replaceable>order_by_clause</replaceable>, then all the <literal>ORDER BY
> </literal>
> -    expressions must match regular arguments of the aggregate; that is,
> -    you cannot sort on an expression that is not included in the
> -    <literal>DISTINCT</literal> list.
> +    If <literal>DISTINCT</literal> is specified with an
> +    <replaceable>order_by_clause</replaceable>, <literal>ORDER
> +    BY</literal> expressions can only reference columns in the
> +    <literal>DISTINCT</literal> list.  For example:
> +<programlisting>
> +WITH vals (v1, v2) AS ( VALUES (1,'Z'),(3,'D'),(4,'R'),(3,'A'),(2,'T') )
> +SELECT array_agg(DISTINCT v2 ORDER BY v2 DESC) FROM vals;
> +  array_agg
> +-------------
> + {Z,T,R,D,A}
> +</programlisting>
> 
> The change to a two-column vals was mostly to try and find corner-cases that
> might need to be addressed.  If we don't intend to show the error case of
> DISTINCT v1 ORDER BY v2 then we should go back to the original example and just
> add ORDER BY v DESC.  I'm fine with not using string_agg here.
> 
> +    For example:
> +<programlisting>
> +WITH vals (v) AS ( VALUES (1),(3),(4),(3),(2) )
> +SELECT array_agg(DISTINCT v ORDER BY v DESC) FROM vals;
> + array_agg
> +-----------
> + {4,3,2,1}
> +</programlisting>

Okay, good, switched in the attached patch.

> We get enough complaints regarding "apparent ordering" that I would like to
> add:
> 
> As a reminder, while some DISTINCT processing algorithms produce sorted output
> as a side-effect, only by specifying ORDER BY is the output order guaranteed.

Well, we need to create a new email thread for this and look at all the
areas is applies to since this is a much larger issue.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Вложения

Re: Document aggregate functions better w.r.t. ORDER BY

От
"David G. Johnston"
Дата:
On Thu, Oct 26, 2023 at 2:56 PM Bruce Momjian <bruce@momjian.us> wrote:
On Wed, Oct 25, 2023 at 10:34:10PM -0700, David G. Johnston wrote:
> I would reword the existing note to be something like:
>
> The SQL Standard defines specific aggregates and their properties, including
> which of DISTINCT and/or ORDER BY is allowed.  Due to the extensible nature of
> PostgreSQL it accepts either or both clauses for any aggregate.

Uh, is this something in my patch or somewhere else?  I don't think
PostgreSQL extensible is an example of syntax flexibility.


Note
The ability to specify both DISTINCT and ORDER BY in an aggregate function is a PostgreSQL extension.

I am pointing out that the first sentence of the existing note above seems to be factually incorrect.  I tried to make it correct - while explaining why we differ.  Though in truth I'd probably rather just remove the note.

> We get enough complaints regarding "apparent ordering" that I would like to
> add:
>
> As a reminder, while some DISTINCT processing algorithms produce sorted output
> as a side-effect, only by specifying ORDER BY is the output order guaranteed.

Well, we need to create a new email thread for this and look at all the
areas is applies to since this is a much larger issue.


I was hoping to sneak this one in regardless of the bigger picture issues, since this specific combination is guaranteed to output ordered presently.

David J.

Re: Document aggregate functions better w.r.t. ORDER BY

От
Bruce Momjian
Дата:
On Thu, Oct 26, 2023 at 03:09:26PM -0700, David G. Johnston wrote:
> On Thu, Oct 26, 2023 at 2:56 PM Bruce Momjian <bruce@momjian.us> wrote:
> 
>     On Wed, Oct 25, 2023 at 10:34:10PM -0700, David G. Johnston wrote:
>     > I would reword the existing note to be something like:
>     >
>     > The SQL Standard defines specific aggregates and their properties,
>     including
>     > which of DISTINCT and/or ORDER BY is allowed.  Due to the extensible
>     nature of
>     > PostgreSQL it accepts either or both clauses for any aggregate.
> 
>     Uh, is this something in my patch or somewhere else?  I don't think
>     PostgreSQL extensible is an example of syntax flexibility.
> 
> 
> https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-AGGREGATES
> 
> Note
> The ability to specify both DISTINCT and ORDER BY in an aggregate function is a
> PostgreSQL extension.
> 
> I am pointing out that the first sentence of the existing note above seems to
> be factually incorrect.  I tried to make it correct - while explaining why we
> differ.  Though in truth I'd probably rather just remove the note.

Agreed, removed, patch attached.  This is just too complex to specify.

>     > We get enough complaints regarding "apparent ordering" that I would like
>     to
>     > add:
>     >
>     > As a reminder, while some DISTINCT processing algorithms produce sorted
>     output
>     > as a side-effect, only by specifying ORDER BY is the output order
>     guaranteed.
> 
>     Well, we need to create a new email thread for this and look at all the
>     areas is applies to since this is a much larger issue.
> 
> I was hoping to sneak this one in regardless of the bigger picture issues,
> since this specific combination is guaranteed to output ordered presently.

No sneaking.  ;-)  It would be bad to document this unevenly because it
sets expectations in other parts of the system if we don't mention it.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Вложения

Re: Document aggregate functions better w.r.t. ORDER BY

От
"David G. Johnston"
Дата:
On Thu, Oct 26, 2023 at 3:36 PM Bruce Momjian <bruce@momjian.us> wrote:
No sneaking.  ;-)  It would be bad to document this unevenly because it
sets expectations in other parts of the system if we don't mention it.

Agreed.

Last suggestion, remove the first jsonb_agg example that lacks an order by.

+WITH vals (k, v) AS ( VALUES ('key0','1'), ('key1','3'), ('key1','2') )
+SELECT jsonb_object_agg(k, v) FROM vals;
+      jsonb_object_agg
+----------------------------
+ {"key0": "1", "key1": "2"}
+

We shouldn't write an example that relies on the rows being evaluated 1-2-3 without specifying an order by clause.

David J.

Re: Document aggregate functions better w.r.t. ORDER BY

От
Bruce Momjian
Дата:
On Thu, Oct 26, 2023 at 03:44:14PM -0700, David G. Johnston wrote:
> On Thu, Oct 26, 2023 at 3:36 PM Bruce Momjian <bruce@momjian.us> wrote:
> 
>     No sneaking.  ;-)  It would be bad to document this unevenly because it
>     sets expectations in other parts of the system if we don't mention it.
> 
> 
> Agreed.
> 
> Last suggestion, remove the first jsonb_agg example that lacks an order by.
> 
> +WITH vals (k, v) AS ( VALUES ('key0','1'), ('key1','3'), ('key1','2') )
> +SELECT jsonb_object_agg(k, v) FROM vals;
> +      jsonb_object_agg
> +----------------------------
> + {"key0": "1", "key1": "2"}
> +
> 
> We shouldn't write an example that relies on the rows being evaluated 1-2-3
> without specifying an order by clause.

Sure, done in the attached patch.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.

Вложения

Re: Document aggregate functions better w.r.t. ORDER BY

От
"David G. Johnston"
Дата:
On Thu, Oct 26, 2023 at 4:03 PM Bruce Momjian <bruce@momjian.us> wrote:

Sure, done in the attached patch.


WFM.  Thank You!

David J.

Re: Document aggregate functions better w.r.t. ORDER BY

От
Bruce Momjian
Дата:
On Thu, Oct 26, 2023 at 04:05:12PM -0700, David G. Johnston wrote:
> On Thu, Oct 26, 2023 at 4:03 PM Bruce Momjian <bruce@momjian.us> wrote:
> 
> 
>     Sure, done in the attached patch.
> 
> 
> 
> WFM.  Thank You!

Patch applied to master.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.