Обсуждение: Case in Order By Ignored without warning or error

Поиск
Список
Период
Сортировка

Case in Order By Ignored without warning or error

От
Emiel Hermsen
Дата:
Hello,

I'm currently working with PostGreSQL 9.3 on a RedHat 6.6 device.
One of my predecessors decided he wanted dynamic sorting which seems to be
ignored.

My made-up testing table definition is as follows:
CREATE TABLE films (
    id        SERIAL PRIMARY KEY,
    title       varchar(40) NOT NULL,
     imdbnumber     integer
);

INSERT INTO films (title, imdbnumber) VALUES ('Film a', 2000), ('Film b',
1999);

When using psql on the command line, I enter the following query:

select * from films order by (case 1 when 1 then 3 else 1 end);

I would expect this query to either sort on column 3, or refuse with an
error.
Instead it executes the query with incorrect sorting and no warning or
error.

Also when using indexes that are out of bounds, like -2 or 8, there are
neither warnings/errors nor sorting.

According to documentation,
https://www.postgresql.org/docs/9.3/static/queries-order.html, my select
query above is incorrect, however psql does not tell me this.

I don't see it as a problem, now that I know what PostGreSQL does, but I
would assume others without this knowledge would prefer at least a warning
or even an error.

Kind Regards,

Emiel Hermsen

Re: Case in Order By Ignored without warning or error

От
Francisco Olarte
Дата:
On Tue, Jun 7, 2016 at 2:58 PM, Emiel Hermsen <s32191234@gmail.com> wrote:
> Hello,
>
> I'm currently working with PostGreSQL 9.3 on a RedHat 6.6 device.
> One of my predecessors decided he wanted dynamic sorting which seems to be
> ignored.
>
> My made-up testing table definition is as follows:
> CREATE TABLE films (
>     id        SERIAL PRIMARY KEY,
>     title       varchar(40) NOT NULL,
>      imdbnumber     integer
> );
>
> INSERT INTO films (title, imdbnumber) VALUES ('Film a', 2000), ('Film b',
> 1999);
>
> When using psql on the command line, I enter the following query:
>
> select * from films order by (case 1 when 1 then 3 else 1 end);
>
> I would expect this query to either sort on column 3, or refuse with an
> error.
> Instead it executes the query with incorrect sorting and no warning or
> error.

Or does it execute it sorting by a constant value 3?

> Also when using indexes that are out of bounds, like -2 or 8, there are
> neither warnings/errors nor sorting.

Or does it sort by the constant value -2 / 8 ?

I say this because 3/-2/8 are valid ( although meaningless ) sort
keys, more on this....


> According to documentation,
> https://www.postgresql.org/docs/9.3/static/queries-order.html, my select
> query above is incorrect, however psql does not tell me this.

Psql just sends the queries to the server. OTOH, the page you quote
says near the top "The sort expression(s) can be any expression that
would be valid in the query's select list.", and given

xxx=# select version();
                                                          version

----------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 9.3.10 on x86_64-pc-linux-gnu, compiled by
x86_64-pc-linux-gnu-gcc (Gentoo 4.9.3 p1.4, pie-0.6.4) 4.9.3, 64-bit
(1 row)

xxx=# select (case 1 when 1 then 3 else 1 end);
 case
------
    3
(1 row)

You are just sorting by a constant expression, like if you had a
column with the value 3 in every row.

The section for the order by clause in the page for the select command
states "Each expression can be the name or ordinal number of an output
column (SELECT list item), or it can be an arbitrary expression formed
from input-column values.", and I supose it goes the ordinal number
way only when it is a simple constant integer, I even doubt order by
1+0 would work ( because otherwise every integer-valued expresion
could be interpreted as an ordinal, so it seems to be interpreting it
( as I would expect ) as an arbitrary expression fomed from ( 0 )
input column values ).

Francisco Olarte.

Re: Case in Order By Ignored without warning or error

От
"David G. Johnston"
Дата:
On Tue, Jun 7, 2016 at 9:19 AM, Francisco Olarte <folarte@peoplecall.com>
wrote:

> On Tue, Jun 7, 2016 at 2:58 PM, Emiel Hermsen <s32191234@gmail.com> wrote=
:
> > Hello,
> >
> > I'm currently working with PostGreSQL 9.3 on a RedHat 6.6 device.
> > One of my predecessors decided he wanted dynamic sorting which seems to
> be
> > ignored.
>

=E2=80=8BPostgreSQL, please...=E2=80=8B


> >
> > My made-up testing table definition is as follows:
> > CREATE TABLE films (
> >     id        SERIAL PRIMARY KEY,
> >     title       varchar(40) NOT NULL,
> >      imdbnumber     integer
> > );
> >
> > INSERT INTO films (title, imdbnumber) VALUES ('Film a', 2000), ('Film b=
',
> > 1999);
> >
> > When using psql on the command line, I enter the following query:
> >
> > select * from films order by (case 1 when 1 then 3 else 1 end);
> >
> > I would expect this query to either sort on column 3, or refuse with an
> > error.
> > Instead it executes the query with incorrect sorting and no warning or
> > error.
> =E2=80=8B[...]=E2=80=8B
>
> > According to documentation,
> > https://www.postgresql.org/docs/9.3/static/queries-order.html, my selec=
t
> > query above is incorrect, however psql does not tell me this.
> =E2=80=8B[...]=E2=80=8B
>
> The section for the order by clause in the page for the select command
> states "Each expression can be the name or ordinal number of an output
> column (SELECT list item), or it can be an arbitrary expression formed
> from input-column values.", and I supose it goes the ordinal number
> way only when it is a simple constant integer, I even doubt order by
> 1+0 would work ( because otherwise every integer-valued expresion
> could be interpreted as an ordinal, so it seems to be interpreting it
> ( as I would expect ) as an arbitrary expression fomed from ( 0 )
> input column values ).
>

=E2=80=8BThis sentence, a couple below the one you quote, is either redunda=
nt or
imprecise.

"=E2=80=8BIt is also possible to use arbitrary expressions in the ORDER BY =
clause,
including columns that do not appear in the SELECT output list. Thus the
following statement is valid:"

If kept if should be written:

"It is also possible to use arbitrary expressions in the ORDER BY clause,
but those expressions cannot refer to column in the SELECT output list.
Thus the following statement is valid."

=E2=80=8BIn short, expressions are resolved and sorted on their result whil=
e
unadorned column names and literal integers are used as lookup values into
a column map and the values in the referenced columns are then sorted.

David J.

Re: Case in Order By Ignored without warning or error

От
Francisco Olarte
Дата:
Emiel:

1.- Please, do not top post. It does not matter too much in this case,
but makes the discussion extremely difficult to follow.

2.- This is not a bug, maybe you should move it to general.

And now, regarding your message....

On Wed, Jun 8, 2016 at 10:39 AM, Emiel Hermsen <s32191234@gmail.com> wrote:
> Based on the explanation give by David, one option would be to alter the
> documentation section linked and copied below. This because PostgreSQL
> accepts and executes the query, but will almost guaranteed not do what th=
e
> writer of the statement intends.
> ----------------------------------------------------------------
> Note that an output column name has to stand alone, that is, it cannot be
> used in an expression =E2=80=94 for example, this is not correct:
>
> SELECT a + b AS sum, c FROM table1 ORDER BY sum + c;          -- wrong
> ----------------------------------------------------------------

The alias problem is, IIRC, mandated by std compatibility. Anyway, you
can easily order by (a+b)+c ( not too sure about it, now that I think
). Anyway, the problem of languages not doing what the programmer
expects is common, and in my experience commonly caused by failure to
properly read the docs by the programmer.

> The second option, again just my opinion, would be to change the behavior
> where the ORDER BY clause refuses any contained content other than number=
s
> and column names combined with the ASC and DESC keywords.

I *strongly* disagree with that. Even if it was just because it will
make a lot of perfectly good code written by people who properly read
the docs before forming some expectations against which they code.
This is SQL, is a powerful, complicated language, and it has to be
learnt.

> My most important argument for this is that the code that led me to askin=
g
> this question has been implemented in 2003 and run in a production
> environment ever since.

This I accept, but has it been running well?

> Of course the edge case the ORDER BY was to cover, should have been prope=
rly
> tested and the programmer at the time should have known the restrictions =
on
> the order by statement. But I would argue that PostgreSQL will "never" do
> what the programmer has intended, in which case, preferably an error but =
at
> least, a warning would be warranted.

It normally does what the programmer intends. And, in your case, it
seems to me the programmer decided on a convoluted construct and put
it without doing an elementary test. I mean, something like this:

# values (2,20),(1,30),(3,10) order by 1;
 column1 | column2
---------+---------
       1 |      30
       2 |      20
       3 |      10
(3 rows)

# values (2,20),(1,30),(3,10) order by 2;
 column1 | column2
---------+---------
       3 |      10
       2 |      20
       1 |      30
(3 rows)

# values (2,20),(1,30),(3,10) order by case 1 when 1 then 1 else 2 end;
 column1 | column2
---------+---------
       2 |      20
       1 |      30
       3 |      10
(3 rows)

# values (2,20),(1,30),(3,10) order by case 2 when 1 then 1 else 2 end;
 column1 | column2
---------+---------
       2 |      20
       1 |      30
       3 |      10
(3 rows)

Easily shows you how it works ( it's known some places, liki
start/offset and group / rder by do not support the whole expression
syntax, so it's better to test ).


Regards.
   Francisco Olarte.

Re: Case in Order By Ignored without warning or error

От
Emiel Hermsen
Дата:
I'd like to thank you both for your responses.
These have helped me understand the behavior of PostgreSQL.

However, I am a little lost on what happens now or what I am to do now, so
I'm going out on a limb:

Forgive me for being bolt in stating my humble opinion below:
Based on the explanation give by David, one option would be to alter the
documentation section linked and copied below. This because PostgreSQL
accepts and executes the query, but will almost guaranteed not do what the
writer of the statement intends.

----------------------------------------------------------------
Note that an output column name has to stand alone, that is, it cannot be
used in an expression =E2=80=94 for example, this is not correct:

SELECT a + b AS sum, c FROM table1 ORDER BY sum + c;          --
wrong----------------------------------------------------------------

The second option, again just my opinion, would be to change the behavior
where the ORDER BY clause refuses any contained content other than numbers
and column names combined with the ASC and DESC keywords.

My most important argument for this is that the code that led me to asking
this question has been implemented in 2003 and run in a production
environment ever since.
Of course the edge case the ORDER BY was to cover, should have been
properly tested and the programmer at the time should have known the
restrictions on the order by statement. But I would argue that PostgreSQL
will "never" do what the programmer has intended, in which case, preferably
an error but at least, a warning would be warranted.

I would like to conclude with my apologies in advance in case any of the
above is out of line for this mailing list. this is my first bug report.

Thank you for your time,

Emiel Hermsen


2016-06-07 15:38 GMT+02:00 David G. Johnston <david.g.johnston@gmail.com>:

> On Tue, Jun 7, 2016 at 9:19 AM, Francisco Olarte <folarte@peoplecall.com>
> wrote:
>
>> On Tue, Jun 7, 2016 at 2:58 PM, Emiel Hermsen <s32191234@gmail.com>
>> wrote:
>> > Hello,
>> >
>> > I'm currently working with PostGreSQL 9.3 on a RedHat 6.6 device.
>> > One of my predecessors decided he wanted dynamic sorting which seems t=
o
>> be
>> > ignored.
>>
>
> =E2=80=8BPostgreSQL, please...=E2=80=8B
>
>
>> >
>> > My made-up testing table definition is as follows:
>> > CREATE TABLE films (
>> >     id        SERIAL PRIMARY KEY,
>> >     title       varchar(40) NOT NULL,
>> >      imdbnumber     integer
>> > );
>> >
>> > INSERT INTO films (title, imdbnumber) VALUES ('Film a', 2000), ('Film
>> b',
>> > 1999);
>> >
>> > When using psql on the command line, I enter the following query:
>> >
>> > select * from films order by (case 1 when 1 then 3 else 1 end);
>> >
>> > I would expect this query to either sort on column 3, or refuse with a=
n
>> > error.
>> > Instead it executes the query with incorrect sorting and no warning or
>> > error.
>> =E2=80=8B[...]=E2=80=8B
>>
>> > According to documentation,
>> > https://www.postgresql.org/docs/9.3/static/queries-order.html, my
>> select
>> > query above is incorrect, however psql does not tell me this.
>> =E2=80=8B[...]=E2=80=8B
>>
>> The section for the order by clause in the page for the select command
>> states "Each expression can be the name or ordinal number of an output
>> column (SELECT list item), or it can be an arbitrary expression formed
>> from input-column values.", and I supose it goes the ordinal number
>> way only when it is a simple constant integer, I even doubt order by
>> 1+0 would work ( because otherwise every integer-valued expresion
>> could be interpreted as an ordinal, so it seems to be interpreting it
>> ( as I would expect ) as an arbitrary expression fomed from ( 0 )
>> input column values ).
>>
>
> =E2=80=8BThis sentence, a couple below the one you quote, is either redun=
dant or
> imprecise.
>
> "=E2=80=8BIt is also possible to use arbitrary expressions in the ORDER B=
Y clause,
> including columns that do not appear in the SELECT output list. Thus the
> following statement is valid:"
>
> If kept if should be written:
>
> "It is also possible to use arbitrary expressions in the ORDER BY clause,
> but those expressions cannot refer to column in the SELECT output list.
> Thus the following statement is valid."
>
> =E2=80=8BIn short, expressions are resolved and sorted on their result wh=
ile
> unadorned column names and literal integers are used as lookup values int=
o
> a column map and the values in the referenced columns are then sorted.
>
> David J.
>
>

Re: Case in Order By Ignored without warning or error

От
Emiel Hermsen
Дата:
Fracisco,

2016-06-08 12:37 GMT+02:00 Francisco Olarte <folarte@peoplecall.com>:

> Emiel:
>
> 1.- Please, do not top post. It does not matter too much in this case,
> but makes the discussion extremely difficult to follow.
>
>
I did not think of this yet, thank you for your suggestion and the required
patience.


> 2.- This is not a bug, maybe you should move it to general.
>
> Agreed and I will, so for as far as I am concerned this "topic" can be
"closed" (I do not know a better way to describe it).


> And now, regarding your message....
>
> On Wed, Jun 8, 2016 at 10:39 AM, Emiel Hermsen <s32191234@gmail.com>
> wrote:
> > Based on the explanation give by David, one option would be to alter th=
e
> > documentation section linked and copied below. This because PostgreSQL
> > accepts and executes the query, but will almost guaranteed not do what
> the
> > writer of the statement intends.
> > ----------------------------------------------------------------
> > Note that an output column name has to stand alone, that is, it cannot =
be
> > used in an expression =E2=80=94 for example, this is not correct:
> >
> > SELECT a + b AS sum, c FROM table1 ORDER BY sum + c;          -- wrong
> > ----------------------------------------------------------------
>
> The alias problem is, IIRC, mandated by std compatibility. Anyway, you
> can easily order by (a+b)+c ( not too sure about it, now that I think
> ). Anyway, the problem of languages not doing what the programmer
> expects is common, and in my experience commonly caused by failure to
> properly read the docs by the programmer.
>

Understood. I did test the order by (a+b)+c with the statement: SELECT *
FROM films ORDER BY 1+1; which does not sort on the second column.
Therefore I assume that any construction like (a+b)+c will not work either.


> > The second option, again just my opinion, would be to change the behavi=
or
> > where the ORDER BY clause refuses any contained content other than
> numbers
> > and column names combined with the ASC and DESC keywords.
>
> I *strongly* disagree with that. Even if it was just because it will
> make a lot of perfectly good code written by people who properly read
> the docs before forming some expectations against which they code.
> This is SQL, is a powerful, complicated language, and it has to be
> learnt.
>

I do agree on your last statement about the difficulty.
My opinion in this matter is mostly based of my findings regarding the
"ORDER BY 1+1"  not doing anything.
However, this discussion should be followed up in the general section.


> > My most important argument for this is that the code that led me to
> asking
> > this question has been implemented in 2003 and run in a production
> > environment ever since.
>
> This I accept, but has it been running well?
>

The application has been doing well for multiple years now. However the
edge case that was to be covered by the ORDER BY in this query is likely to
have been overlooked due to the general error margin of the output data.


> > Of course the edge case the ORDER BY was to cover, should have been
> properly
> > tested and the programmer at the time should have known the restriction=
s
> on
> > the order by statement. But I would argue that PostgreSQL will "never" =
do
> > what the programmer has intended, in which case, preferably an error bu=
t
> at
> > least, a warning would be warranted.
>
> It normally does what the programmer intends. And, in your case, it
> seems to me the programmer decided on a convoluted construct and put
> it without doing an elementary test. I mean, something like this:
>
> # values (2,20),(1,30),(3,10) order by 1;
>  column1 | column2
> ---------+---------
>        1 |      30
>        2 |      20
>        3 |      10
> (3 rows)
>
> # values (2,20),(1,30),(3,10) order by 2;
>  column1 | column2
> ---------+---------
>        3 |      10
>        2 |      20
>        1 |      30
> (3 rows)
>
> # values (2,20),(1,30),(3,10) order by case 1 when 1 then 1 else 2 end;
>  column1 | column2
> ---------+---------
>        2 |      20
>        1 |      30
>        3 |      10
> (3 rows)
>
> # values (2,20),(1,30),(3,10) order by case 2 when 1 then 1 else 2 end;
>  column1 | column2
> ---------+---------
>        2 |      20
>        1 |      30
>        3 |      10
> (3 rows)
>
> Easily shows you how it works ( it's known some places, liki
> start/offset and group / rder by do not support the whole expression
> syntax, so it's better to test ).
>

Agreed, which is how I got to find the initial faulty query.


Thank you for your patients, and the responses and explanations.

Kind Regards,
Emiel Hermsen

Re: Case in Order By Ignored without warning or error

От
Tom Lane
Дата:
Francisco Olarte <folarte@peoplecall.com> writes:
> On Wed, Jun 8, 2016 at 10:39 AM, Emiel Hermsen <s32191234@gmail.com> wrote:
>> The second option, again just my opinion, would be to change the behavior
>> where the ORDER BY clause refuses any contained content other than numbers
>> and column names combined with the ASC and DESC keywords.

> I *strongly* disagree with that.

Yeah, there is very little chance we'd change the behavior here.  Some
context might help: the behavior whereby ORDER BY's arguments are names
or numbers of output columns was mandated in SQL92.  SQL99, and more
recent iterations of the standard, redefined ORDER BY's arguments as
being arbitrary expressions over the input columns.  Postgres attempts
to be compatible with both of those interpretations.  There's no way that
we would drop the SQL99 behavior, because that's required by current
spec.  Dropping the SQL92 behavior is also unpleasant to contemplate,
because it would break a lot of legacy code, and frankly "ORDER BY 1"
is just too convenient a shorthand to give up easily.  So we live with
the fact that the behavior is a bit ambiguous.

            regards, tom lane

Re: Case in Order By Ignored without warning or error

От
Francisco Olarte
Дата:
Hi Emiel:

On Wed, Jun 8, 2016 at 1:31 PM, Emiel Hermsen <s32191234@gmail.com> wrote:
> Understood. I did test the order by (a+b)+c with the statement: SELECT *
> FROM films ORDER BY 1+1; which does not sort on the second column. Therefore
> I assume that any construction like (a+b)+c will not work either.

mmm, aybe you misnterpreted your test result, order by 1+1 correctly
sorts by the expresion 1+1, = 2, so no sorting ( something that
happens on underspecified sort criteria ). So a+b+c or othres should
work too, as proven by 1+1. The problem is you thought 1+ select a
column where only naked names and single numbers do. I think even '+1'
does not do the same as '1'.

> I do agree on your last statement about the difficulty.
> My opinion in this matter is mostly based of my findings regarding the
> "ORDER BY 1+1"  not doing anything.

As before, it is doing a thing, sorting by a constant.


Francisco Olarte.

Re: Case in Order By Ignored without warning or error

От
"David G. Johnston"
Дата:
On Thu, Jun 9, 2016 at 10:58 AM, Francisco Olarte <folarte@peoplecall.com>
wrote:

> Hi Emiel:
>
> On Wed, Jun 8, 2016 at 1:31 PM, Emiel Hermsen <s32191234@gmail.com> wrote=
:
> > Understood. I did test the order by (a+b)+c with the statement: SELECT =
*
> > FROM films ORDER BY 1+1; which does not sort on the second column.
> Therefore
> > I assume that any construction like (a+b)+c will not work either.
>
> mmm, aybe you misnterpreted your test result, order by 1+1 correctly
> sorts by the expresion 1+1, =3D 2, so no sorting ( something that
> happens on underspecified sort criteria ). So a+b+c or othres should
> work too, as proven by 1+1. The problem is you thought 1+ select a
> column where only naked names and single numbers do. I think even '+1'
> does not do the same as '1'.
>
> > I do agree on your last statement about the difficulty.
> > My opinion in this matter is mostly based of my findings regarding the
> > "ORDER BY 1+1"  not doing anything.
>
> As before, it is doing a thing, sorting by a constant.
>
>
=E2=80=8BI think its a fair characterization to call "sorting on a constant=
" as
"doing nothing" or "not useful".  There is no noticeable difference between
that and omitting the constant.

David J.
=E2=80=8B

Re: Case in Order By Ignored without warning or error

От
Francisco Olarte
Дата:
David:

On Thu, Jun 9, 2016 at 5:15 PM, David G. Johnston
<david.g.johnston@gmail.com> wrote:
> I think its a fair characterization to call "sorting on a constant" as
> "doing nothing" or "not useful".  There is no noticeable difference between
> that and omitting the constant.

Yes, but in that context I wanted to point it was correctly parsing
and evaluating the expression and comparing the result ( the optimizer
may have zapped it, but this is transparent to the user ).


Francisco Olarte.