Обсуждение: ntile() throws ERROR when hashagg is false

Поиск
Список
Период
Сортировка

ntile() throws ERROR when hashagg is false

От
Rajkumar Raghuwanshi
Дата:
Hi

ntile() throws ERROR when hashagg is false, test case given below.

postgres=# create table foo (a int, b int, c text);
CREATE TABLE
postgres=# insert into foo select i%20, i%30, to_char(i%12, 'FM0000') from generate_series(0, 36) i;
INSERT 0 37
postgres=# explain select ntile(a) OVER () from foo GROUP BY a;
                            QUERY PLAN                            
-------------------------------------------------------------------
 WindowAgg  (cost=25.00..29.50 rows=200 width=8)
   ->  HashAggregate  (cost=25.00..27.00 rows=200 width=4)
         Group Key: a
         ->  Seq Scan on foo  (cost=0.00..22.00 rows=1200 width=4)
(4 rows)

postgres=# select ntile(a) OVER () from foo GROUP BY a;
 ntile
-------
     1
     1
     2
     2
     3
     3
     4
     4
     5
     5
     6
     6
     7
     7
     8
     8
     9
     9
    10
    11
(20 rows)

postgres=# set enable_hashagg to false;
SET
postgres=# explain select ntile(a) OVER () from foo GROUP BY a;
                               QUERY PLAN                               
-------------------------------------------------------------------------
 WindowAgg  (cost=83.37..91.87 rows=200 width=8)
   ->  Group  (cost=83.37..89.37 rows=200 width=4)
         Group Key: a
         ->  Sort  (cost=83.37..86.37 rows=1200 width=4)
               Sort Key: a
               ->  Seq Scan on foo  (cost=0.00..22.00 rows=1200 width=4)
(6 rows)
postgres=# select ntile(a) OVER () from foo GROUP BY a;
ERROR:  argument of ntile must be greater than zero

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation

Re: ntile() throws ERROR when hashagg is false

От
Andrew Gierth
Дата:
>>>>> "Rajkumar" == Rajkumar Raghuwanshi <rajkumar.raghuwanshi@enterprisedb.com> writes:

 Rajkumar> Hi
 
 Rajkumar> ntile() throws ERROR when hashagg is false, test case given
 Rajkumar> below.

 Rajkumar> postgres=# create table foo (a int, b int, c text);
 Rajkumar> CREATE TABLE
 Rajkumar> postgres=# insert into foo select i%20, i%30, to_char(i%12, 'FM0000') from
 Rajkumar> generate_series(0, 36) i;
 Rajkumar> INSERT 0 37
 Rajkumar> postgres=# explain select ntile(a) OVER () from foo GROUP BY a;

This query isn't actually legal per the spec; the argument of ntile is
restricted to being a constant or parameter, so it can't change from row
to row. PG is more flexible, but that doesn't make the query any more
meaningful.

What I think pg is actually doing is taking the value of the ntile()
argument from the first row and using that for the whole partition.
In your example, enabling or disabling hashagg changes the order of the
input rows for the window function (since you've specified no ordering
in the window definition), and with hashagg off, you get the smallest
value of a first (which is 0 and thus an error).

-- 
Andrew (irc:RhodiumToad)


Re: ntile() throws ERROR when hashagg is false

От
David Rowley
Дата:
On 14 June 2018 at 18:57, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
> What I think pg is actually doing is taking the value of the ntile()
> argument from the first row and using that for the whole partition.
> In your example, enabling or disabling hashagg changes the order of the
> input rows for the window function (since you've specified no ordering
> in the window definition), and with hashagg off, you get the smallest
> value of a first (which is 0 and thus an error).

Seems that's the case. I'd guess it was written that way so we could
allow PARAM_EXTERN Params rather than requiring the arg to be a Const.

I wonder if it would be worth adding a run-time check in
window_ntile() that causes an ERROR on first call if there are any
Vars or PARAM_EXEC Params in the function argument. An ERROR might be
better than doing something that the user does not expect.

Ideally, something would alert the user much sooner than the executor,
but I think doing it that way would be quite a bit more work.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: ntile() throws ERROR when hashagg is false

От
Tom Lane
Дата:
David Rowley <david.rowley@2ndquadrant.com> writes:
> On 14 June 2018 at 18:57, Andrew Gierth <andrew@tao11.riddles.org.uk> wrote:
>> What I think pg is actually doing is taking the value of the ntile()
>> argument from the first row and using that for the whole partition.

Yes, easily verified by looking at window_ntile(): the argument is only
examined on first call.

> I wonder if it would be worth adding a run-time check in
> window_ntile() that causes an ERROR on first call if there are any
> Vars or PARAM_EXEC Params in the function argument. An ERROR might be
> better than doing something that the user does not expect.

-1, that would break cases that are legal and useful, such as where a
PARAM_EXEC Param represents an outer-query-level variable, while still
failing to catch some problematic cases (eg. volatile functions).
I think also that there are cases that are not legal per spec but can
still be useful, as long as the user knows what they're doing.

It might be worth some documentation changes though.

            regards, tom lane


Re: ntile() throws ERROR when hashagg is false

От
Andrew Gierth
Дата:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 >> I wonder if it would be worth adding a run-time check in
 >> window_ntile() that causes an ERROR on first call if there are any
 >> Vars or PARAM_EXEC Params in the function argument. An ERROR might
 >> be better than doing something that the user does not expect.

 Tom> -1, that would break cases that are legal and useful, such as
 Tom> where a PARAM_EXEC Param represents an outer-query-level variable,
 Tom> while still failing to catch some problematic cases (eg. volatile
 Tom> functions).

The only sane run-time check (that I can think of) that could be applied
would be to check that the value is the same on each row of a partition.

 Tom> I think also that there are cases that are not legal per spec but
 Tom> can still be useful, as long as the user knows what they're doing.

Yes, it would make sense for example to allow the value to change
between partitions.

-- 
Andrew (irc:RhodiumToad)