Обсуждение: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

Поиск
Список
Период
Сортировка

Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Oliver Ford
Дата:
Adds the options RESPECT/IGNORE NULLS (null treatment clause) and FROM
FIRST/LAST to the non-aggregate window functions.

A previous patch
(https://www.postgresql.org/message-id/CA+=vxNa5_N1q5q5OkxC0aQnNdbo2Ru6GVw+86wk+oNsUNJDLig@mail.gmail.com)
partially implemented this feature. However, that patch worked by
adding the null treatment clause to the window frame's frameOptions
variable, and consequently had the limitation that it wasn't possible
to reuse a window frame definition in a single query where two
functions were called that had different null treatment options. This
meant that the patch was never committed. The attached path takes a
different approach which gets around this limitation.

For example, the following query would not work correctly with the
implementation in the old patch but does with the attached patch:

WITH cte (x) AS (
select null union select 1 union select 2)
SELECT x,
first_value(x) over w as with_default,
first_value(x) respect nulls over w as with_respect,
first_value(x) ignore nulls over w as with_ignore
from cte WINDOW w as (order by x nulls first rows between unbounded
preceding and unbounded following);
 x | with_default | with_respect | with_ignore
---+--------------+--------------+-------------
   |              |              |           1
 1 |              |              |           1
 2 |              |              |           1
(3 rows)


== Implementation ==

The patch adds two types to the pg_type catalog: "ignorenulls" and
"fromlast". These types are of the Boolean category, and work as
wrappers around the bool type. They are used as function arguments to
extra versions of the window functions that take additional boolean
arguments. RESPECT NULLS and FROM FIRST are ignored by the parser, but
IGNORE NULLS and FROM LAST lead to the extra versions being called
with arguments to ignore nulls and order from last.

== Testing ==

Updated documentation and added regression tests. All existing tests
pass. This change will need a catversion bump.
Thanks to Krasiyan Andreev for initially testing this patch.

Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
David Fetter
Дата:
On Fri, Jul 13, 2018 at 01:52:00PM +0100, Oliver Ford wrote:
> Adds the options RESPECT/IGNORE NULLS (null treatment clause) and FROM
> FIRST/LAST to the non-aggregate window functions.

Please find attached an updated version for OID drift.

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Krasiyan Andreev
Дата:
Hi,
Patch applies and compiles, all included tests and building of the docs pass.
I am using last version from more than two months ago in production environment with real data and I didn't find any bugs, 
so I'm marking this patch as ready for committer in the commitfest app.

На сб, 28.07.2018 г. в 22:00 ч. David Fetter <david@fetter.org> написа:
On Fri, Jul 13, 2018 at 01:52:00PM +0100, Oliver Ford wrote:
> Adds the options RESPECT/IGNORE NULLS (null treatment clause) and FROM
> FIRST/LAST to the non-aggregate window functions.

Please find attached an updated version for OID drift.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Andrew Gierth
Дата:
>>>>> "Krasiyan" == Krasiyan Andreev <krasiyan@gmail.com> writes:

 Krasiyan> Hi,

 Krasiyan> Patch applies and compiles, all included tests and building
 Krasiyan> of the docs pass. I am using last version from more than two
 Krasiyan> months ago in production environment with real data and I
 Krasiyan> didn't find any bugs, so I'm marking this patch as ready for
 Krasiyan> committer in the commitfest app.

Unfortunately, reviewing it from a committer perspective - I can't
possibly commit this as it stands, and anything I did to it would be
basically a rewrite of much of it.

Some of the problems could be fixed. For example the type names could be
given pg_* prefixes (it's clearly not acceptable to create random
special-purpose boolean subtypes in pg_catalog and _not_ give them such
a prefix), and the precedence hackery in gram.y could have comments
added (gram.y is already bad enough; _anything_ fancy with precedence
has to be described in the comments). But I don't like that hack with
the special types at all, and I think that needs a better solution.

Normally I'd push hard to try and get some solution that's sufficiently
generic to allow user-defined functions to make use of the feature. But
I think the SQL spec people have managed to make that literally
impossible in this case, what with the FROM keyword appearing in the
middle of a production and not followed by anything sufficiently
distinctive to even use for extra token lookahead.

Also, as has been pointed out in a number of previous features, we're
starting to accumulate identifiers that are reserved in subtly different
ways from our basic four-category system (which is itself a significant
elaboration compared to the spec's simple reserved/unreserved
distinction). As I recall this objection was specifically raised for
CUBE, but justified there by the existence of the contrib/cube extension
(and the fact that the standard CUBE() construct is used only in very
specific places in the syntax). This patch would make lead / lag /
first_value / last_value / nth_value syntactically "special" while not
actually reserving them (beyond having them in unreserved_keywords); I
think serious consideration should be given to whether they should
instead become col_name_keywords (which would, I believe, make it
unnecessary to mess with precedence).

Anyone have any thoughts or comments on the above?

-- 
Andrew (irc:RhodiumToad)


Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tom Lane
Дата:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> Normally I'd push hard to try and get some solution that's sufficiently
> generic to allow user-defined functions to make use of the feature. But
> I think the SQL spec people have managed to make that literally
> impossible in this case, what with the FROM keyword appearing in the
> middle of a production and not followed by anything sufficiently
> distinctive to even use for extra token lookahead.

Yeah.  Is there any appetite for a "Just Say No" approach?  That is,
refuse to implement the spec's syntax on the grounds that it's too
brain-dead to even consider, and instead provide some less random,
more extensibility-friendly way to accomplish the same thing?

The FROM FIRST/LAST bit seems particularly badly thought through,
because AFAICS it is flat out ambiguous with a normal FROM clause
immediately following the window function call.  The only way to
make it not so would be to make FIRST and LAST be fully reserved,
which is neither a good idea nor spec-compliant.

In short, there's a really good case to be made here that the SQL
committee is completely clueless about syntax design, and so we
shouldn't follow this particular pied piper.

> ... This patch would make lead / lag /
> first_value / last_value / nth_value syntactically "special" while not
> actually reserving them (beyond having them in unreserved_keywords); I
> think serious consideration should be given to whether they should
> instead become col_name_keywords (which would, I believe, make it
> unnecessary to mess with precedence).

I agree that messing with the precedence rules is likely to have
unforeseen and undesirable side-effects.  Generally, if you need
to create a precedence rule, that's because your grammar is
ambiguous.  Precedence fixes that in a well-behaved way only for
cases that actually are very much like operator precedence rules.
Otherwise, you may just be papering over something that isn't
working very well.  See e.g. commits 670a6c7a2 and 12b716457
for past cases where we learned that the hard way.  (The latter
also points out that if you must have a precedence hack, it's
safer to hack individual rules than to stick precedences onto
terminal symbols.)  In the case at hand, since the proposed patch
doesn't make FIRST and LAST be fully reserved, it seems just about
certain that it can be made to misbehave, including failing on
queries that were and should remain legal.

            regards, tom lane


Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Andrew Gierth
Дата:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 Tom> The FROM FIRST/LAST bit seems particularly badly thought through,
 Tom> because AFAICS it is flat out ambiguous with a normal FROM clause
 Tom> immediately following the window function call. The only way to
 Tom> make it not so would be to make FIRST and LAST be fully reserved,
 Tom> which is neither a good idea nor spec-compliant.

In the actual spec syntax it's not ambiguous at all because NTH_VALUE is
a reserved word (as are LEAD, LAG, FIRST_VALUE and LAST_VALUE), and OVER
is a mandatory clause in its syntax, so a FROM appearing before the OVER
must be part of a FROM FIRST/LAST and not introducing a FROM-clause.

In our syntax, if we made NTH_VALUE etc. a col_name_keyword (and thus
not legal as a function name outside its own special syntax) it would
also become unambiguous.

i.e. given this token sequence (with . marking the current posision):

  select nth_value(x) . from first ignore 

if we know up front that "nth_value" is a window function and not any
other kind of function, we know that we have to shift the "from" rather
than reducing the select-list because we haven't seen an "over" yet.
(Neither "first" nor "ignore" are reserved, so "select foo(x) from first
ignore;" is a valid and complete query, and without reserving the
function name we'd need at least four tokens of lookahead to decide
otherwise.)

This is why I think the col_name_keyword option needs to be given
serious consideration - it still doesn't reserve the names as strongly
as the spec does, but enough to make the standard syntax work without
needing any dubious hacks.

-- 
Andrew (irc:RhodiumToad)


Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tom Lane
Дата:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom> The FROM FIRST/LAST bit seems particularly badly thought through,
>  Tom> because AFAICS it is flat out ambiguous with a normal FROM clause
>  Tom> immediately following the window function call. The only way to
>  Tom> make it not so would be to make FIRST and LAST be fully reserved,
>  Tom> which is neither a good idea nor spec-compliant.

> In the actual spec syntax it's not ambiguous at all because NTH_VALUE is
> a reserved word (as are LEAD, LAG, FIRST_VALUE and LAST_VALUE), and OVER
> is a mandatory clause in its syntax, so a FROM appearing before the OVER
> must be part of a FROM FIRST/LAST and not introducing a FROM-clause.

Hmm ...

> In our syntax, if we made NTH_VALUE etc. a col_name_keyword (and thus
> not legal as a function name outside its own special syntax) it would
> also become unambiguous.
> i.e. given this token sequence (with . marking the current posision):
>   select nth_value(x) . from first ignore 
> if we know up front that "nth_value" is a window function and not any
> other kind of function, we know that we have to shift the "from" rather
> than reducing the select-list because we haven't seen an "over" yet.

I don't really find that to be a desirable solution, because quite aside
from the extensibility problem, it would mean that a lot of errors
become "syntax error" where we formerly gave a more useful message.

This does open up a thought about how to proceed, though.  I'd been
trying to think of a way to solve this using base_yylex's ability to
do some internal lookahead and change token types based on that.
If you just think of recognizing FROM FIRST/LAST, you get nowhere
because that's still legal in other contexts.  But if you were to
look for FROM followed by FIRST/LAST followed by IGNORE/RESPECT/OVER,
I think that could only validly happen in this syntax.  It'd take
some work to extend base_yylex to look ahead 2 tokens not one, but
I'm sure that could be done.  (You'd also need a lookahead rule to
match "IGNORE/RESPECT NULLS OVER", but that seems just as doable.)
Then the relevant productions use FROM_LA, IGNORE_LA, RESPECT_LA
instead of the corresponding bare tokens, and the grammar no longer
has an ambiguity problem.

            regards, tom lane


Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Andrew Gierth
Дата:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 Tom> If you just think of recognizing FROM FIRST/LAST, you get nowhere
 Tom> because that's still legal in other contexts. But if you were to
 Tom> look for FROM followed by FIRST/LAST followed by
 Tom> IGNORE/RESPECT/OVER, I think that could only validly happen in
 Tom> this syntax.

No; you need to go four tokens ahead in total, not three. Assuming
nth_value is unreserved, then

  select nth_value(x) from first ignore;

is a valid query that has nth_value(x) as an expression, "first" as a
table name and "ignore" as its alias. Only when you see NULLS after
IGNORE, or OVER after FIRST/LAST, do you know that you're looking at
a window function and not a from clause.

So FROM_LA would have to mean "FROM" followed by any of:

  FIRST IGNORE NULLS
  LAST IGNORE NULLS
  FIRST RESPECT NULLS
  LAST RESPECT NULLS
  FIRST OVER
  LAST OVER

Remember that while OVER is reserved, all of FIRST, LAST, RESPECT and
IGNORE are unreserved.

 Tom> It'd take some work to extend base_yylex to look ahead 2 tokens
 Tom> not one, but I'm sure that could be done. (You'd also need a
 Tom> lookahead rule to match "IGNORE/RESPECT NULLS OVER", but that
 Tom> seems just as doable.) Then the relevant productions use FROM_LA,
 Tom> IGNORE_LA, RESPECT_LA instead of the corresponding bare tokens,
 Tom> and the grammar no longer has an ambiguity problem.

Yeah, but at the cost of having to extend base_yylex to go 3 tokens
ahead (not 2) rather than the current single lookahead slot.

Doable, certainly (probably not much harder to do 3 than 2 actually)

-- 
Andrew (irc:RhodiumToad)


Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tom Lane
Дата:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
>  Tom> If you just think of recognizing FROM FIRST/LAST, you get nowhere
>  Tom> because that's still legal in other contexts. But if you were to
>  Tom> look for FROM followed by FIRST/LAST followed by
>  Tom> IGNORE/RESPECT/OVER, I think that could only validly happen in
>  Tom> this syntax.

> No; you need to go four tokens ahead in total, not three. Assuming
> nth_value is unreserved, then
>   select nth_value(x) from first ignore;
> is a valid query that has nth_value(x) as an expression, "first" as a
> table name and "ignore" as its alias.

No, because once IGNORE is a keyword, even unreserved, it's not legal
as an AS-less alias.  We'd be breaking queries like that no matter what.

(I know there are people around here who'd like to remove that
restriction, but it's not happening anytime soon IMO.)

            regards, tom lane


Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Andrew Gierth
Дата:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 >> select nth_value(x) from first ignore;

 Tom> No, because once IGNORE is a keyword, even unreserved, it's not
 Tom> legal as an AS-less alias.

That rule only applies in the select-list, not in the FROM clause; table
aliases in FROM are just ColId, so they can be anything except a fully
reserved or type_func_name keyword.

-- 
Andrew (irc:RhodiumToad)


Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Andrew Gierth
Дата:
So I've tried to rough out a decision tree for the various options on
how this might be implemented (discarding the "use precedence hacks"
option). Opinions? Additions?

(formatted for emacs outline-mode)

* 1. use lexical lookahead

  +: relatively straightforward parser changes
  +: no new reserved words
  +: has the option of working extensibly with all functions

  -: base_yylex needs extending to 3 lookahead tokens

** 1.1. Allow from/ignore clause on all (or all non-agg) window function calls

  If the clauses are legal on all window functions, what to do about existing
  window functions for which the clauses do not make sense?

*** 1.1.1. Ignore the clause when the function isn't aware of it

  +: simple
  -: somewhat surprising for users perhaps?

*** 1.1.2. Change the behavior of the windowapi in some consistent way

  Not sure if this can work.
  +: fairly simple(maybe?) and predictable
  -: changes the behavior of existing window functions

** 1.2. Allow from/ignore clause on only certain functions

  +: avoids any unexpected behavior
  -: needs some way to control what functions allow it

*** 1.2.1. Check the function name in parse analysis against a fixed list.

  +: simple
  -: not extensible

*** 1.2.2. Provide some option in CREATE FUNCTION

  +: extensible
  -: fairly intrusive, adding stuff to create function and pg_proc

*** 1.2.3. Do something magical with function argument types

  +: doesn't need changes in create function / pg_proc
  -: it's an ugly hack

* 2. reserve nth_value etc. as functions

  +: follows the spec reasonably well
  +: less of a hack than extending base_yylex

  -: new reserved words
  -: more parser rules
  -: not extensible

  (now goto 1.2.1)

* 3. "just say no" to the spec

  e.g. add new functions like lead_ignore_nulls(), or add extra boolean
  args to lead() etc. telling them to skip nulls

  +: simple
  -: doesn't conform to spec
  -: using extra args isn't quite the right semantics

-- 
Andrew (irc:RhodiumToad)


Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tom Lane
Дата:
Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
> So I've tried to rough out a decision tree for the various options on
> how this might be implemented (discarding the "use precedence hacks"
> option). Opinions? Additions?

I think it'd be worth at least drafting an implementation for the
lexical-lookahead fix.  I think it's likely that we'll need to extend
base_yylex to do more lookahead in the future even if we don't do it
for this, given the SQL committee's evident love for COBOL-ish syntax
and lack of regard for what you can do in LALR(1).

The questions of how we interface to the individual window functions
are really independent of how we handle the parsing problem.  My
first inclination is to just pass the flags down to the window functions
(store them in WindowObject and provide some additional inquiry functions
in windowapi.h) and let them deal with it.

>   If the clauses are legal on all window functions, what to do about existing
>   window functions for which the clauses do not make sense?

Option 1: do nothing, document that nothing happens if w.f. doesn't
implement it.

Option 2: record whether the inquiry functions got called.  At end of
query, error out if they weren't and the options were used.

It's also worth wondering if we couldn't just implement the flags in
some generic fashion and not need to involve the window functions at
all.  FROM LAST, for example, could and perhaps should be implemented
by inverting the sort order.  Possibly IGNORE NULLS could be implemented
inside the WinGetFuncArgXXX functions?  These behaviors might or might
not make much sense with other window functions, but that doesn't seem
like it's our problem.

            regards, tom lane


Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Andrew Gierth
Дата:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

 >> So I've tried to rough out a decision tree for the various options
 >> on how this might be implemented (discarding the "use precedence
 >> hacks" option). Opinions? Additions?

 Tom> I think it'd be worth at least drafting an implementation for the
 Tom> lexical-lookahead fix. I think it's likely that we'll need to
 Tom> extend base_yylex to do more lookahead in the future even if we
 Tom> don't do it for this, given the SQL committee's evident love for
 Tom> COBOL-ish syntax and lack of regard for what you can do in
 Tom> LALR(1).

That's not _quite_ a fair criticism of the SQL committee; they're not
ignoring the capabilities of parsers, they're just not making their
syntax robust in the presence of the kinds of extensions and
generalizations that we want to do.

Without exception that I know of, every time we've run into a problem
that needed ugly precedence rules or extra lookahead it's been caused by
us not reserving something that is reserved in the spec, or by us
allowing constructs that are not in the spec at all (postfix operators,
especially), or by us deliberately generalizing what the spec allows
(e.g. allowing full expressions where the spec only allows column
references, or allowing extra parens around subselects) or where we've
repurposed syntax from the spec in an incompatible way (as in
ANY(array)).

Anyway, for the time being I will mark this patch as "returned with
feedback".

 >> If the clauses are legal on all window functions, what to do about
 >> existing window functions for which the clauses do not make sense?

 Tom> Option 1: do nothing, document that nothing happens if w.f.
 Tom> doesn't implement it.

That was 1.1.1 on my list.

 Tom> Option 2: record whether the inquiry functions got called. At end
 Tom> of query, error out if they weren't and the options were used.

Erroring at the _end_ of the query seems a bit of a potential surprise.

 Tom> It's also worth wondering if we couldn't just implement the flags
 Tom> in some generic fashion and not need to involve the window
 Tom> functions at all.

That was what I meant by option 1.1.2 on my list.

 Tom> FROM LAST, for example, could and perhaps should be implemented by
 Tom> inverting the sort order.

Actually that can't work for reasons brought up in the recent discussion
of optimization of window function sorts: if you change the sort order
you potentially disturb the ordering of peer rows, and the spec requires
that an (nth_value(x,n) from last over w) and (otherfunc(x) over w) for
order-equivalent windows "w" must see the peer rows in the same order.

So FROM LAST really does have to keep the original sort order, and count
backwards from the end of the window.

 Tom> Possibly IGNORE NULLS could be implemented inside the
 Tom> WinGetFuncArgXXX functions? These behaviors might or might not
 Tom> make much sense with other window functions, but that doesn't seem
 Tom> like it's our problem.

That's about what I was thinking for option 1.1.2, yes.

-- 
Andrew (irc:RhodiumToad)


Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Andrew Gierth
Дата:
>>>>> "Krasiyan" == Krasiyan Andreev <krasiyan@gmail.com> writes:

 Krasiyan> I am using last version from more than two months ago in
 Krasiyan> production environment with real data and I didn't find any
 Krasiyan> bugs, so I'm marking this patch as ready for committer in the
 Krasiyan> commitfest app.

Oliver (or anyone else), do you plan to continue working on this in the
immediate future, in line with the comments from myself and Tom in this
thread? If so I'll bump it to the next CF, otherwise I'll mark it
"returned with feedback".

I'm happy to help out with further work on this patch if needed, time
permitting.

-- 
Andrew (irc:RhodiumToad)


Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Stephen Frost
Дата:
Greetings,

This seems to have died out, and that's pretty unfortunate because this
is awfully useful SQL standard syntax that people look for and wish we
had.

* Andrew Gierth (andrew@tao11.riddles.org.uk) wrote:
> So I've tried to rough out a decision tree for the various options on
> how this might be implemented (discarding the "use precedence hacks"
> option). Opinions? Additions?
>
> (formatted for emacs outline-mode)
>
> * 1. use lexical lookahead
>
>   +: relatively straightforward parser changes
>   +: no new reserved words
>   +: has the option of working extensibly with all functions
>
>   -: base_yylex needs extending to 3 lookahead tokens

This sounds awful grotty and challenging to do and get right, and the
alternative (just reserving these, as the spec does) doesn't seem so
draconian as to be that much of an issue.

> * 2. reserve nth_value etc. as functions
>
>   +: follows the spec reasonably well
>   +: less of a hack than extending base_yylex
>
>   -: new reserved words
>   -: more parser rules
>   -: not extensible
>

For my 2c, at least, reserving these strikes me as entirely reasonable.
Yes, it sucks that we have to partially-reserve some additional
keywords, but such is life.  I get that we'll throw syntax errors
sometimes when we might have given a better error, but I think we can
accept that.

>   (now goto 1.2.1)

Hmm, not sure this was right?  but sure, I'll try...

> *** 1.2.1. Check the function name in parse analysis against a fixed list.
>
>   +: simple
>   -: not extensible

Seems like this is more-or-less required since we'd be reserving them..?

> *** 1.2.2. Provide some option in CREATE FUNCTION
>
>   +: extensible
>   -: fairly intrusive, adding stuff to create function and pg_proc

How would this work though, if we reserve the functions as keywords..?
Maybe I'm not entirely following, but wouldn't attempts to use other
functions end up with syntax errors in at least some of the cases,
meaning that having other functions support this wouldn't really work?
I don't particularly like the idea that some built-in functions would
always work but others would work but only some of the time.

> *** 1.2.3. Do something magical with function argument types
>
>   +: doesn't need changes in create function / pg_proc
>   -: it's an ugly hack

Not really a fan of 'ugly hack'.

> * 3. "just say no" to the spec
>
>   e.g. add new functions like lead_ignore_nulls(), or add extra boolean
>   args to lead() etc. telling them to skip nulls
>
>   +: simple
>   -: doesn't conform to spec
>   -: using extra args isn't quite the right semantics

Ugh, no thank you.

Thanks!

Stephen

Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Krasiyan Andreev
Дата:
Thank you very much for feedback and yes, that is very useful SQL syntax.
Maybe you miss my previous answer, but you are right, that patch is currently dead,
because some important design questions must be discussed here, before patch rewriting.

I have dropped support of from first/last for nth_value(),
but also I reimplemented it in a different way,
by using negative number for the position argument, to be able to get the same frame in exact reverse order.
After that patch becomes much more simple and some concerns about precedence hack has gone.

I have not renamed special bool type "ignorenulls"
(I know that it is not acceptable way for calling extra version
of window functions, but also it makes things very easy and it can reuse frames),
but I removed the other special bool type "fromlast".

Attached file was for PostgreSQL 13 (master git branch, last commit fest),
everything was working and patch was at the time in very good shape, all tests was passed.

I read previous review and suggestions from Tom about special bool type and unreserved keywords and also,
that IGNORE NULLS could be implemented inside the WinGetFuncArgXXX functions,
but I am not sure how exactly to proceed (some example will be very helpful).

На чт, 30.04.2020 г. в 21:58 Stephen Frost <sfrost@snowman.net> написа:
Greetings,

This seems to have died out, and that's pretty unfortunate because this
is awfully useful SQL standard syntax that people look for and wish we
had.

* Andrew Gierth (andrew@tao11.riddles.org.uk) wrote:
> So I've tried to rough out a decision tree for the various options on
> how this might be implemented (discarding the "use precedence hacks"
> option). Opinions? Additions?
>
> (formatted for emacs outline-mode)
>
> * 1. use lexical lookahead
>
>   +: relatively straightforward parser changes
>   +: no new reserved words
>   +: has the option of working extensibly with all functions
>
>   -: base_yylex needs extending to 3 lookahead tokens

This sounds awful grotty and challenging to do and get right, and the
alternative (just reserving these, as the spec does) doesn't seem so
draconian as to be that much of an issue.

> * 2. reserve nth_value etc. as functions
>
>   +: follows the spec reasonably well
>   +: less of a hack than extending base_yylex
>
>   -: new reserved words
>   -: more parser rules
>   -: not extensible
>

For my 2c, at least, reserving these strikes me as entirely reasonable.
Yes, it sucks that we have to partially-reserve some additional
keywords, but such is life.  I get that we'll throw syntax errors
sometimes when we might have given a better error, but I think we can
accept that.

>   (now goto 1.2.1)

Hmm, not sure this was right?  but sure, I'll try...

> *** 1.2.1. Check the function name in parse analysis against a fixed list.
>
>   +: simple
>   -: not extensible

Seems like this is more-or-less required since we'd be reserving them..?

> *** 1.2.2. Provide some option in CREATE FUNCTION
>
>   +: extensible
>   -: fairly intrusive, adding stuff to create function and pg_proc

How would this work though, if we reserve the functions as keywords..?
Maybe I'm not entirely following, but wouldn't attempts to use other
functions end up with syntax errors in at least some of the cases,
meaning that having other functions support this wouldn't really work?
I don't particularly like the idea that some built-in functions would
always work but others would work but only some of the time.

> *** 1.2.3. Do something magical with function argument types
>
>   +: doesn't need changes in create function / pg_proc
>   -: it's an ugly hack

Not really a fan of 'ugly hack'.

> * 3. "just say no" to the spec
>
>   e.g. add new functions like lead_ignore_nulls(), or add extra boolean
>   args to lead() etc. telling them to skip nulls
>
>   +: simple
>   -: doesn't conform to spec
>   -: using extra args isn't quite the right semantics

Ugh, no thank you.

Thanks!

Stephen
Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
I revisited the thread:
https://www.postgresql.org/message-id/flat/CAGMVOdsbtRwE_4%2Bv8zjH1d9xfovDeQAGLkP_B6k69_VoFEgX-A%40mail.gmail.com

and came up with attached POC patch (I used some varibale names
appearing in the Krasiyan Andreev's patch). I really love to have
RESPECT/IGNORE NULLS because I believe they are convenient for
users. For FIRST/LAST I am not so excited since there are alternatives
as our document stats, so FIRST/LAST are not included in the patch.

Currently in the patch only nth_value is allowed to use RESPECT/IGNORE
NULLS. I think it's not hard to implement it for others (lead, lag,
first_value and last_value).  No document nor test patches are
included for now.

Note that RESPECT/IGNORE are not registered as reserved keywords in
this patch (but registered as unreserved keywords). I am not sure if
this is acceptable or not.

> The questions of how we interface to the individual window functions
> are really independent of how we handle the parsing problem.  My
> first inclination is to just pass the flags down to the window functions
> (store them in WindowObject and provide some additional inquiry functions
> in windowapi.h) and let them deal with it.

I agree with this.  Also I do not change the prototype of
nth_value. So I pass RESPECT/IGNORE NULLS information from the raw
parser to parse/analysis and finally to WindowObject.

> It's also worth wondering if we couldn't just implement the flags in
> some generic fashion and not need to involve the window functions at
> all.  FROM LAST, for example, could and perhaps should be implemented
> by inverting the sort order.  Possibly IGNORE NULLS could be implemented
> inside the WinGetFuncArgXXX functions?  These behaviors might or might
> not make much sense with other window functions, but that doesn't seem
> like it's our problem.

Yes, probably we could make WinGetFuncArgXXX a little bit smarter in
this direction (not implemented in the patch at this point).

Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
From 07f01f8859e159c908ada72e8f53daf51e0b8bdf Mon Sep 17 00:00:00 2001
From: Tatsuo Ishii <ishii@postgresql.org>
Date: Sat, 22 Apr 2023 16:52:50 +0900
Subject: [PATCH v1 1/3] Implement IGNORE or RESPECT NULLS parse/analysis part.

Implement SQL standard's IGNORE/RESPECT NULLS clause for window functions.
For now, only nth_value() can use this option.
---
 src/backend/parser/gram.y       | 22 ++++++++++++++++++----
 src/backend/parser/parse_func.c | 13 +++++++++++++
 src/include/nodes/parsenodes.h  |  8 ++++++++
 src/include/nodes/primnodes.h   |  2 ++
 src/include/parser/kwlist.h     |  2 ++
 5 files changed, 43 insertions(+), 4 deletions(-)

diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index acf6cf4866..2980ecd666 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -276,6 +276,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
     MergeWhenClause *mergewhen;
     struct KeyActions *keyactions;
     struct KeyAction *keyaction;
+    NullTreatment    nulltreatment;
 }
 
 %type <node>    stmt toplevel_stmt schema_stmt routine_body_stmt
@@ -661,6 +662,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
                     json_object_constructor_null_clause_opt
                     json_array_constructor_null_clause_opt
 
+%type <nulltreatment>        opt_null_treatment
+
 /*
  * Non-keyword token types.  These are hard-wired into the "flex" lexer.
  * They must be listed first so that their numeric codes do not depend on
@@ -718,7 +721,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 
     HANDLER HAVING HEADER_P HOLD HOUR_P
 
-    IDENTITY_P IF_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P INCLUDE
+    IDENTITY_P IF_P IGNORE_P ILIKE IMMEDIATE IMMUTABLE IMPLICIT_P IMPORT_P IN_P INCLUDE
     INCLUDING INCREMENT INDENT INDEX INDEXES INHERIT INHERITS INITIALLY INLINE_P
     INNER_P INOUT INPUT_P INSENSITIVE INSERT INSTEAD INT_P INTEGER
     INTERSECT INTERVAL INTO INVOKER IS ISNULL ISOLATION
@@ -752,7 +755,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
 
     RANGE READ REAL REASSIGN RECHECK RECURSIVE REF_P REFERENCES REFERENCING
     REFRESH REINDEX RELATIVE_P RELEASE RENAME REPEATABLE REPLACE REPLICA
-    RESET RESTART RESTRICT RETURN RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROLLUP
+    RESET RESPECT RESTART RESTRICT RETURN RETURNING RETURNS REVOKE RIGHT ROLE ROLLBACK ROLLUP
     ROUTINE ROUTINES ROW ROWS RULE
 
     SAVEPOINT SCALAR SCHEMA SCHEMAS SCROLL SEARCH SECOND_P SECURITY SELECT
@@ -15213,7 +15216,7 @@ func_application: func_name '(' ')'
  * (Note that many of the special SQL functions wouldn't actually make any
  * sense as functional index entries, but we ignore that consideration here.)
  */
-func_expr: func_application within_group_clause filter_clause over_clause
+func_expr: func_application within_group_clause filter_clause opt_null_treatment over_clause
                 {
                     FuncCall   *n = (FuncCall *) $1;
 
@@ -15246,7 +15249,8 @@ func_expr: func_application within_group_clause filter_clause over_clause
                         n->agg_within_group = true;
                     }
                     n->agg_filter = $3;
-                    n->over = $4;
+                    n->null_treatment = $4;
+                    n->over = $5;
                     $$ = (Node *) n;
                 }
             | json_aggregate_func filter_clause over_clause
@@ -15790,6 +15794,14 @@ filter_clause:
             | /*EMPTY*/                                { $$ = NULL; }
         ;
 
+/*
+ * Window function option clauses
+ */
+opt_null_treatment:
+            RESPECT NULLS_P                            { $$ = RESPECT_NULLS; }
+            | IGNORE_P NULLS_P                        { $$ = IGNORE_NULLS; }
+            | /*EMPTY*/                                { $$ = NULL_TREATMENT_NOT_SET; }
+        ;
 
 /*
  * Window Definitions
@@ -17111,6 +17123,7 @@ unreserved_keyword:
             | HOUR_P
             | IDENTITY_P
             | IF_P
+            | IGNORE_P
             | IMMEDIATE
             | IMMUTABLE
             | IMPLICIT_P
@@ -17223,6 +17236,7 @@ unreserved_keyword:
             | REPLACE
             | REPLICA
             | RESET
+            | RESPECT
             | RESTART
             | RESTRICT
             | RETURN
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index b3f0b6a137..92af0d10f1 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -31,6 +31,7 @@
 #include "parser/parse_target.h"
 #include "parser/parse_type.h"
 #include "utils/builtins.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
 
@@ -99,6 +100,7 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
     bool        agg_distinct = (fn ? fn->agg_distinct : false);
     bool        func_variadic = (fn ? fn->func_variadic : false);
     CoercionForm funcformat = (fn ? fn->funcformat : COERCE_EXPLICIT_CALL);
+    NullTreatment null_treatment = (fn ? fn->null_treatment : NULL_TREATMENT_NOT_SET);
     bool        could_be_projection;
     Oid            rettype;
     Oid            funcid;
@@ -534,6 +536,13 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
                      errmsg("window function %s cannot have WITHIN GROUP",
                             NameListToString(funcname)),
                      parser_errposition(pstate, location)));
+        /* Check RESPECT NULLS or IGNORE NULLS is specified. They are only allowed with nth_value */
+        if (null_treatment != NULL_TREATMENT_NOT_SET && funcid != F_NTH_VALUE)
+            ereport(ERROR,
+                    (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+                     errmsg("window function %s cannot have RESPECT NULLS or IGNORE NULLS",
+                            NameListToString(funcname)),
+                     parser_errposition(pstate, location)));
     }
     else if (fdresult == FUNCDETAIL_COERCION)
     {
@@ -835,6 +844,10 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
         wfunc->winagg = (fdresult == FUNCDETAIL_AGGREGATE);
         wfunc->aggfilter = agg_filter;
         wfunc->location = location;
+        if (null_treatment == IGNORE_NULLS)
+            wfunc->ignorenulls = true;
+        else
+            wfunc->ignorenulls = false;
 
         /*
          * agg_star is allowed for aggregate functions but distinct isn't
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index cc7b32b279..f13ae26a24 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -404,6 +404,13 @@ typedef struct RoleSpec
     int            location;        /* token location, or -1 if unknown */
 } RoleSpec;
 
+typedef enum NullTreatment
+{
+    NULL_TREATMENT_NOT_SET = 0,
+    RESPECT_NULLS,
+    IGNORE_NULLS
+} NullTreatment;
+
 /*
  * FuncCall - a function or aggregate invocation
  *
@@ -431,6 +438,7 @@ typedef struct FuncCall
     bool        agg_distinct;    /* arguments were labeled DISTINCT */
     bool        func_variadic;    /* last argument was labeled VARIADIC */
     CoercionForm funcformat;    /* how to display this node */
+    NullTreatment null_treatment;    /* RESPECT_NULLS or IGNORE NULLS */
     int            location;        /* token location, or -1 if unknown */
 } FuncCall;
 
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index be9c29f0bf..213297dbd3 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -559,6 +559,8 @@ typedef struct WindowFunc
     bool        winstar pg_node_attr(query_jumble_ignore);
     /* is function a simple aggregate? */
     bool        winagg pg_node_attr(query_jumble_ignore);
+    /* true if IGNORE NULLS, false if RESPECT NULLS */
+    bool        ignorenulls pg_node_attr(query_jumble_ignore);
     /* token location, or -1 if unknown */
     int            location;
 } WindowFunc;
diff --git a/src/include/parser/kwlist.h b/src/include/parser/kwlist.h
index f5b2e61ca5..c7e61a8f0e 100644
--- a/src/include/parser/kwlist.h
+++ b/src/include/parser/kwlist.h
@@ -198,6 +198,7 @@ PG_KEYWORD("hold", HOLD, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("hour", HOUR_P, UNRESERVED_KEYWORD, AS_LABEL)
 PG_KEYWORD("identity", IDENTITY_P, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("if", IF_P, UNRESERVED_KEYWORD, BARE_LABEL)
+PG_KEYWORD("ignore", IGNORE_P, UNRESERVED_KEYWORD, AS_LABEL)
 PG_KEYWORD("ilike", ILIKE, TYPE_FUNC_NAME_KEYWORD, BARE_LABEL)
 PG_KEYWORD("immediate", IMMEDIATE, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("immutable", IMMUTABLE, UNRESERVED_KEYWORD, BARE_LABEL)
@@ -360,6 +361,7 @@ PG_KEYWORD("repeatable", REPEATABLE, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("replace", REPLACE, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("replica", REPLICA, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("reset", RESET, UNRESERVED_KEYWORD, BARE_LABEL)
+PG_KEYWORD("respect", RESPECT, UNRESERVED_KEYWORD, AS_LABEL)
 PG_KEYWORD("restart", RESTART, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("restrict", RESTRICT, UNRESERVED_KEYWORD, BARE_LABEL)
 PG_KEYWORD("return", RETURN, UNRESERVED_KEYWORD, BARE_LABEL)
-- 
2.25.1

From 3dc6f4bb897f76247589db018716bf5680d5331c Mon Sep 17 00:00:00 2001
From: Tatsuo Ishii <ishii@postgresql.org>
Date: Sat, 22 Apr 2023 16:58:48 +0900
Subject: [PATCH v1 2/3] Implement IGNORE or RESPECT NULLS planner part.

---
 src/backend/optimizer/util/clauses.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index a9c7bc342e..40fc62c447 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -2474,6 +2474,7 @@ eval_const_expressions_mutator(Node *node,
                 newexpr->winref = expr->winref;
                 newexpr->winstar = expr->winstar;
                 newexpr->winagg = expr->winagg;
+                newexpr->ignorenulls = expr->ignorenulls;
                 newexpr->location = expr->location;
 
                 return (Node *) newexpr;
-- 
2.25.1

From a78feec9bf7b08c644c2b3089b2de9237d4fcd9e Mon Sep 17 00:00:00 2001
From: Tatsuo Ishii <ishii@postgresql.org>
Date: Sat, 22 Apr 2023 17:00:06 +0900
Subject: [PATCH v1 3/3] Implement IGNORE or RESPECT NULLS executor and window
 functions part.

---
 src/backend/executor/nodeWindowAgg.c | 11 ++++++++++
 src/backend/utils/adt/windowfuncs.c  | 30 +++++++++++++++++++++++++---
 src/include/windowapi.h              |  2 ++
 3 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 3ac581a711..7e2affb12c 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -69,6 +69,7 @@ typedef struct WindowObjectData
     int            readptr;        /* tuplestore read pointer for this fn */
     int64        markpos;        /* row that markptr is positioned on */
     int64        seekpos;        /* row that readptr is positioned on */
+    WindowFunc    *wfunc;            /* WindowFunc of this function */
 } WindowObjectData;
 
 /*
@@ -2617,6 +2618,7 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
             winobj->winstate = winstate;
             winobj->argstates = wfuncstate->args;
             winobj->localmem = NULL;
+            winobj->wfunc = wfunc;
             perfuncstate->winobj = winobj;
 
             /* It's a real window function, so set up to call it. */
@@ -3620,3 +3622,12 @@ WinGetFuncArgCurrent(WindowObject winobj, int argno, bool *isnull)
     return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
                         econtext, isnull);
 }
+
+/*
+ * Return current WindowFunc
+ */
+WindowFunc    *
+WinGetWindowFunc(WindowObject winobj)
+{
+    return winobj->wfunc;
+}
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b87a624fb2..919295ba13 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -693,6 +693,7 @@ window_nth_value(PG_FUNCTION_ARGS)
     bool        const_offset;
     Datum        result;
     bool        isnull;
+    bool        isout;
     int32        nth;
 
     nth = DatumGetInt32(WinGetFuncArgCurrent(winobj, 1, &isnull));
@@ -705,9 +706,32 @@ window_nth_value(PG_FUNCTION_ARGS)
                 (errcode(ERRCODE_INVALID_ARGUMENT_FOR_NTH_VALUE),
                  errmsg("argument of nth_value must be greater than zero")));
 
-    result = WinGetFuncArgInFrame(winobj, 0,
-                                  nth - 1, WINDOW_SEEK_HEAD, const_offset,
-                                  &isnull, NULL);
+    if (WinGetWindowFunc(winobj)->ignorenulls)
+    {
+        int        i, n;
+
+        i = n = 0;
+
+        for (;;)
+        {
+            result = WinGetFuncArgInFrame(winobj, 0,
+                                          i++, WINDOW_SEEK_HEAD, false,
+                                          &isnull, &isout);
+            if (isout)
+                break;
+
+            if (!isnull)
+            {
+                if (n == nth - 1)
+                    break;
+                n++;
+            }
+        }
+    }
+    else
+        result = WinGetFuncArgInFrame(winobj, 0,
+                                      nth - 1, WINDOW_SEEK_HEAD, const_offset,
+                                      &isnull, NULL);
     if (isnull)
         PG_RETURN_NULL();
 
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index b8c2c565d1..64f7d4c84d 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -61,4 +61,6 @@ extern Datum WinGetFuncArgInFrame(WindowObject winobj, int argno,
 extern Datum WinGetFuncArgCurrent(WindowObject winobj, int argno,
                                   bool *isnull);
 
+extern WindowFunc    *WinGetWindowFunc(WindowObject winobj);
+
 #endif                            /* WINDOWAPI_H */
-- 
2.25.1


Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Oliver Ford
Дата:


On Sat, 22 Apr 2023, 13:14 Tatsuo Ishii, <ishii@sraoss.co.jp> wrote:
I revisited the thread:
https://www.postgresql.org/message-id/flat/CAGMVOdsbtRwE_4%2Bv8zjH1d9xfovDeQAGLkP_B6k69_VoFEgX-A%40mail.gmail.com

and came up with attached POC patch (I used some varibale names
appearing in the Krasiyan Andreev's patch). I really love to have
RESPECT/IGNORE NULLS because I believe they are convenient for
users. For FIRST/LAST I am not so excited since there are alternatives
as our document stats, so FIRST/LAST are not included in the patch.

Currently in the patch only nth_value is allowed to use RESPECT/IGNORE
NULLS. I think it's not hard to implement it for others (lead, lag,
first_value and last_value).  No document nor test patches are
included for now.

I've actually recently been looking at this feature again recently as well. One thing I wondered, but would need consensus, is to take the SEEK_HEAD/SEEK_TAIL case statements out of WinGetFuncArgInPartition. This function is only called by leadlag_common, which uses SEEK_CURRENT, so those case statements are never reached. Taking them out simplifies the code as it is but means future features might need it re-added (although I'm not sure the use case for it, as that function is for window funcs that ignore the frame options).


Note that RESPECT/IGNORE are not registered as reserved keywords in
this patch (but registered as unreserved keywords). I am not sure if
this is acceptable or not.

> The questions of how we interface to the individual window functions
> are really independent of how we handle the parsing problem.  My
> first inclination is to just pass the flags down to the window functions
> (store them in WindowObject and provide some additional inquiry functions
> in windowapi.h) and let them deal with it.
I agree with this.  Also I do not change the prototype of
nth_value. So I pass RESPECT/IGNORE NULLS information from the raw
parser to parse/analysis and finally to WindowObject.

This is a much better option than my older patch which needed to change the functions.


> It's also worth wondering if we couldn't just implement the flags in
> some generic fashion and not need to involve the window functions at
> all.  FROM LAST, for example, could and perhaps should be implemented
> by inverting the sort order.  Possibly IGNORE NULLS could be implemented
> inside the WinGetFuncArgXXX functions?  These behaviors might or might
> not make much sense with other window functions, but that doesn't seem
> like it's our problem.

Yes, probably we could make WinGetFuncArgXXX a little bit smarter in
this direction (not implemented in the patch at this point).

+1 for doing it here. Maybe also refactor WinGetFuncArgInFrame, putting the exclusion checks in a static function as that function is already pretty big?


Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Vik Fearing
Дата:
On 4/22/23 14:14, Tatsuo Ishii wrote:
> I revisited the thread:
> https://www.postgresql.org/message-id/flat/CAGMVOdsbtRwE_4%2Bv8zjH1d9xfovDeQAGLkP_B6k69_VoFEgX-A%40mail.gmail.com
> 
> and came up with attached POC patch (I used some varibale names
> appearing in the Krasiyan Andreev's patch). I really love to have
> RESPECT/IGNORE NULLS because I believe they are convenient for
> users.

Excellent.  I was thinking about picking my version of this patch up 
again, but I think this might be better than mine.

I am curious why set_mark is false in the IGNORE version instead of also 
being const_offset.  Surely the nth non-null in the frame will never go 
backwards.

Dealing with marks was the main reason (I think) that my patch was not 
accepted.

> For FIRST/LAST I am not so excited since there are alternatives
> as our document stats,

I disagree with this.  The point of having FROM LAST is to avoid 
calculating a new window and running a new pass over it.

 > so FIRST/LAST are not included in the patch.

I do agree that we can have <null treatment> without <from first or 
last> so let's move forward with this and handle the latter later.

> Currently in the patch only nth_value is allowed to use RESPECT/IGNORE
> NULLS.

This should not be hard coded.  It should be a new field in pg_proc 
(with a sanity check that it is only true for window functions).  That 
way custom window functions can implement it.

> I think it's not hard to implement it for others (lead, lag,
> first_value and last_value).

It doesn't seem like it should be, no.

> No document nor test patches are included for now.

I can volunteer to work on these if you want.

> Note that RESPECT/IGNORE are not registered as reserved keywords in
> this patch (but registered as unreserved keywords). I am not sure if
> this is acceptable or not.

For me, this is perfectly okay.  Keep them at the lowest level of 
reservation as possible.
-- 
Vik Fearing




Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tom Lane
Дата:
Vik Fearing <vik@postgresfriends.org> writes:
> On 4/22/23 14:14, Tatsuo Ishii wrote:
>> Note that RESPECT/IGNORE are not registered as reserved keywords in
>> this patch (but registered as unreserved keywords). I am not sure if
>> this is acceptable or not.

> For me, this is perfectly okay.  Keep them at the lowest level of 
> reservation as possible.

Yeah, keep them unreserved if at all possible.  Any higher reservation
level risks breaking existing applications that might be using these
words as column or function names.

            regards, tom lane



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> Excellent.  I was thinking about picking my version of this patch up
> again, but I think this might be better than mine.

Thanks.

> I am curious why set_mark is false in the IGNORE version instead of
> also being const_offset.  Surely the nth non-null in the frame will
> never go backwards.

Initially I thought that too. But when I used const_offset instead of
false. I got an error:

ERROR:  cannot fetch row before WindowObject's mark position

> I do agree that we can have <null treatment> without <from first or
> last> so let's move forward with this and handle the latter later.

Agreed.

>> Currently in the patch only nth_value is allowed to use RESPECT/IGNORE
>> NULLS.
>
> This should not be hard coded.  It should be a new field in pg_proc
> (with a sanity check that it is only true for window functions).  That
> way custom window functions can implement it.

There were some discussions on this in the past.
https://www.postgresql.org/message-id/flat/CAGMVOdsbtRwE_4%2Bv8zjH1d9xfovDeQAGLkP_B6k69_VoFEgX-A%40mail.gmail.com

It seems Tom and Andrew thought that "1.1.2. Change the behavior of
the windowapi in some consistent way" is ambitious. If we follow this
direction, I think each window function should check WindowFunc struct
passed by WinGetWindowFunc (added in my patch) to check whether IGNORE
NULLS can be applied or not in the function. If not, error out. This
way, we don't need to add a new field to pg_proc.

>> No document nor test patches are included for now.
> 
> I can volunteer to work on these if you want.

Thanks! I think you can work on top of the last patch posted by Krasiyan Andreev:
https://www.postgresql.org/message-id/CAN1PwonAnC-KkRyY%2BDtRmxQ8rjdJw%2BgcOsHruLr6EnF7zSMH%3DQ%40mail.gmail.com

Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> Vik Fearing <vik@postgresfriends.org> writes:
>> On 4/22/23 14:14, Tatsuo Ishii wrote:
>>> Note that RESPECT/IGNORE are not registered as reserved keywords in
>>> this patch (but registered as unreserved keywords). I am not sure if
>>> this is acceptable or not.
> 
>> For me, this is perfectly okay.  Keep them at the lowest level of 
>> reservation as possible.
> 
> Yeah, keep them unreserved if at all possible.  Any higher reservation
> level risks breaking existing applications that might be using these
> words as column or function names.

Agreed.

Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Oliver Ford
Дата:


On Sun, Apr 23, 2023 at 4:29 AM Tatsuo Ishii <ishii@sraoss.co.jp> wrote:
> Vik Fearing <vik@postgresfriends.org> writes:
>
>> For me, this is perfectly okay.  Keep them at the lowest level of
>> reservation as possible.
>
> Yeah, keep them unreserved if at all possible.  Any higher reservation
> level risks breaking existing applications that might be using these
> words as column or function names.

Agreed.
 
Attached is a new version of the code and tests to implement this. There's now no modification to windowfuncs.c or the catalog,
it's only a bool added to FuncCall which if set to true, ignores nulls. It adds IGNORE/RESPECT at the Unreserved, As Label level.

The implementation also aims at better performance over previous versions by not disabling set_mark, and using an array to
track previous non-null positions in SEEK_HEAD or SEEK_CURRENT with Forward (lead, but not lag). The mark is set if a row
is out of frame and further rows can't be in frame (to ensure it works with an exclusion clause).

The attached test patch is mostly the same as in the previous patch set, but it doesn't fail on row_number anymore as the main patch
only rejects aggregate functions. The test patch also adds a test for EXCLUDE CURRENT ROW and for two contiguous null rows.

I've not yet tested custom window functions with the patch, but I'm happy to add them to the test patch in v2 if we want to go this way
in implementing this feature.
Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> The attached test patch is mostly the same as in the previous patch
> set, but it doesn't fail on row_number anymore as the main patch
> only rejects aggregate functions. The test patch also adds a test for

> +SELECT sum(orbit) RESPECT NULLS OVER () FROM planets; -- succeeds

I think the standard does not allow to specify RESPECT NULLS other
than lead, lag, first_value, last_value and nth_value. Unless we agree
that PostgreSQL violates the standard in this regard, you should not
allow to use RESPECT NULLS for the window functions, expect lead etc.
and aggregates.

See my patch.

> +/*
> + * Window function option clauses
> + */
> +opt_null_treatment:
> +            RESPECT NULLS_P                            { $$ = RESPECT_NULLS; }
> +            | IGNORE_P NULLS_P                        { $$ = IGNORE_NULLS; }
> +            | /*EMPTY*/                                { $$ = NULL_TREATMENT_NOT_SET; }
> +        ;

With this, you can check if null treatment clause is used or not in
each window function.

In my previous patch I did the check in parse/analysis but I think
it's better to be checked in each window function. This way,

- need not to add a column to pg_proc.

- allow user defined window functions to decide by themselves whether
  they can accept null treatment option.

Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
>> The attached test patch is mostly the same as in the previous patch
>> set, but it doesn't fail on row_number anymore as the main patch
>> only rejects aggregate functions. The test patch also adds a test for
> 
>> +SELECT sum(orbit) RESPECT NULLS OVER () FROM planets; -- succeeds
> 
> I think the standard does not allow to specify RESPECT NULLS other
> than lead, lag, first_value, last_value and nth_value. Unless we agree
> that PostgreSQL violates the standard in this regard, you should not
> allow to use RESPECT NULLS for the window functions, expect lead etc.
> and aggregates.
> 
> See my patch.
> 
>> +/*
>> + * Window function option clauses
>> + */
>> +opt_null_treatment:
>> +            RESPECT NULLS_P                            { $$ = RESPECT_NULLS; }
>> +            | IGNORE_P NULLS_P                        { $$ = IGNORE_NULLS; }
>> +            | /*EMPTY*/                                { $$ = NULL_TREATMENT_NOT_SET; }
>> +        ;
> 
> With this, you can check if null treatment clause is used or not in
> each window function.
> 
> In my previous patch I did the check in parse/analysis but I think
> it's better to be checked in each window function. This way,
> 
> - need not to add a column to pg_proc.
> 
> - allow user defined window functions to decide by themselves whether
>   they can accept null treatment option.

Attached is the patch to implement this (on top of your patch).

test=# SELECT row_number() RESPECT NULLS OVER () FROM (SELECT 1) AS s;
ERROR:  window function row_number cannot have RESPECT NULLS or IGNORE NULLS

Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index fac0e05dee..b8519d9890 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -74,6 +74,7 @@ typedef struct WindowObjectData
     int64        *win_nonnulls;    /* tracks non-nulls in ignore nulls mode */
     int            nonnulls_size;    /* track size of the win_nonnulls array */
     int            nonnulls_len;    /* track length of the win_nonnulls array */
+    WindowFunc    *wfunc;            /* WindowFunc of this function */
 } WindowObjectData;
 
 /*
@@ -2634,6 +2635,8 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
                 winobj->nonnulls_len = 0;
             }
 
+            winobj->wfunc = wfunc;
+
             /* It's a real window function, so set up to call it. */
             fmgr_info_cxt(wfunc->winfnoid, &perfuncstate->flinfo,
                           econtext->ecxt_per_query_memory);
@@ -3881,3 +3884,24 @@ WinGetFuncArgCurrent(WindowObject winobj, int argno, bool *isnull)
     return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
                         econtext, isnull);
 }
+
+/*
+ * Error out that this window function cannot have null treatement.
+ */
+void
+ErrorOutNullTreatment(WindowObject winobj)
+{
+    char            *fname;
+
+    Assert(WindowObjectIsValid(winobj));
+
+    if (winobj->wfunc->null_treatment == NULL_TREATMENT_NOT_SET)
+        return;
+
+    fname = get_func_name(winobj->wfunc->winfnoid);
+
+    ereport(ERROR,
+            (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+             errmsg("window function %s cannot have RESPECT NULLS or IGNORE NULLS",
+                    fname)));
+}
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 01fd16acf9..05e64c4569 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -2475,6 +2475,7 @@ eval_const_expressions_mutator(Node *node,
                 newexpr->winstar = expr->winstar;
                 newexpr->winagg = expr->winagg;
                 newexpr->ignore_nulls = expr->ignore_nulls;
+                newexpr->null_treatment = expr->null_treatment;
                 newexpr->location = expr->location;
 
                 return (Node *) newexpr;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 58c00bfd4f..e131428e85 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -276,6 +276,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
     MergeWhenClause *mergewhen;
     struct KeyActions *keyactions;
     struct KeyAction *keyaction;
+    NullTreatment    nulltreatment;
 }
 
 %type <node>    stmt toplevel_stmt schema_stmt routine_body_stmt
@@ -633,7 +634,6 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
                 opt_frame_clause frame_extent frame_bound
 %type <ival>    opt_window_exclusion_clause
 %type <str>        opt_existing_window_name
-%type <boolean> null_treatment
 %type <boolean> opt_if_not_exists
 %type <boolean> opt_unique_null_treatment
 %type <ival>    generated_when override_kind
@@ -662,6 +662,8 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
                     json_object_constructor_null_clause_opt
                     json_array_constructor_null_clause_opt
 
+%type <nulltreatment>        null_treatment
+
 /*
  * Non-keyword token types.  These are hard-wired into the "flex" lexer.
  * They must be listed first so that their numeric codes do not depend on
@@ -15247,7 +15249,7 @@ func_expr: func_application within_group_clause filter_clause null_treatment ove
                         n->agg_within_group = true;
                     }
                     n->agg_filter = $3;
-                    n->ignore_nulls = $4;
+                    n->null_treatment = $4;
                     n->over = $5;
                     $$ = (Node *) n;
                 }
@@ -15797,9 +15799,9 @@ filter_clause:
  * Window Definitions
  */
 null_treatment:
-            IGNORE_P NULLS_P                        { $$ = true; }
-            | RESPECT_P NULLS_P                        { $$ = false; }
-            | /*EMPTY*/                                { $$ = false; }
+            RESPECT_P NULLS_P                        { $$ = RESPECT_NULLS; }
+            | IGNORE_P NULLS_P                        { $$ = IGNORE_NULLS; }
+            | /*EMPTY*/                                { $$ = NULL_TREATMENT_NOT_SET; }
         ;
 
 window_clause:
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index afa4bcc8d1..63af8ca6aa 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -98,7 +98,7 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
     bool        agg_star = (fn ? fn->agg_star : false);
     bool        agg_distinct = (fn ? fn->agg_distinct : false);
     bool        func_variadic = (fn ? fn->func_variadic : false);
-    bool        ignore_nulls = (fn ? fn->ignore_nulls : false);
+    NullTreatment null_treatment = (fn ? fn->null_treatment : NULL_TREATMENT_NOT_SET);
     CoercionForm funcformat = (fn ? fn->funcformat : COERCE_EXPLICIT_CALL);
     bool        could_be_projection;
     Oid            rettype;
@@ -516,11 +516,12 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
                                 NameListToString(funcname)),
                          parser_errposition(pstate, location)));
 
-            /* It also can't treat nulls as a window function */
-            if (ignore_nulls)
+            /* Aggregate functions cannot have null treatment clause */
+            if (null_treatment != NULL_TREATMENT_NOT_SET)
                 ereport(ERROR,
                         (errcode(ERRCODE_WRONG_OBJECT_TYPE),
-                         errmsg("aggregate functions do not accept RESPECT/IGNORE NULLS"),
+                         errmsg("aggregate function %s cannot have RESPECT NULLS or IGNORE NULLS",
+                                NameListToString(funcname)),
                          parser_errposition(pstate, location)));
         }
     }
@@ -842,7 +843,8 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
         wfunc->winstar = agg_star;
         wfunc->winagg = (fdresult == FUNCDETAIL_AGGREGATE);
         wfunc->aggfilter = agg_filter;
-        wfunc->ignore_nulls = ignore_nulls;
+        wfunc->null_treatment = null_treatment;
+        wfunc->ignore_nulls = (null_treatment == IGNORE_NULLS);
         wfunc->location = location;
 
         /*
diff --git a/src/backend/utils/adt/windowfuncs.c b/src/backend/utils/adt/windowfuncs.c
index b87a624fb2..297e787927 100644
--- a/src/backend/utils/adt/windowfuncs.c
+++ b/src/backend/utils/adt/windowfuncs.c
@@ -85,6 +85,9 @@ window_row_number(PG_FUNCTION_ARGS)
     WindowObject winobj = PG_WINDOW_OBJECT();
     int64        curpos = WinGetCurrentPosition(winobj);
 
+    /* row_number() does not support null treatment */
+    ErrorOutNullTreatment(winobj);
+
     WinSetMarkPosition(winobj, curpos);
     PG_RETURN_INT64(curpos + 1);
 }
@@ -140,6 +143,9 @@ window_rank(PG_FUNCTION_ARGS)
     rank_context *context;
     bool        up;
 
+    /* rank() does not support null treatment */
+    ErrorOutNullTreatment(winobj);
+
     up = rank_up(winobj);
     context = (rank_context *)
         WinGetPartitionLocalMemory(winobj, sizeof(rank_context));
@@ -202,6 +208,9 @@ window_dense_rank(PG_FUNCTION_ARGS)
     rank_context *context;
     bool        up;
 
+    /* dense_rank() does not support null treatment */
+    ErrorOutNullTreatment(winobj);
+
     up = rank_up(winobj);
     context = (rank_context *)
         WinGetPartitionLocalMemory(winobj, sizeof(rank_context));
@@ -266,6 +275,9 @@ window_percent_rank(PG_FUNCTION_ARGS)
 
     Assert(totalrows > 0);
 
+    /* percent_rank() does not support null treatment */
+    ErrorOutNullTreatment(winobj);
+
     up = rank_up(winobj);
     context = (rank_context *)
         WinGetPartitionLocalMemory(winobj, sizeof(rank_context));
@@ -335,6 +347,9 @@ window_cume_dist(PG_FUNCTION_ARGS)
 
     Assert(totalrows > 0);
 
+    /* cume_dist() does not support null treatment */
+    ErrorOutNullTreatment(winobj);
+
     up = rank_up(winobj);
     context = (rank_context *)
         WinGetPartitionLocalMemory(winobj, sizeof(rank_context));
@@ -412,6 +427,9 @@ window_ntile(PG_FUNCTION_ARGS)
     WindowObject winobj = PG_WINDOW_OBJECT();
     ntile_context *context;
 
+    /* ntile() does not support null treatment */
+    ErrorOutNullTreatment(winobj);
+
     context = (ntile_context *)
         WinGetPartitionLocalMemory(winobj, sizeof(ntile_context));
 
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index 073e2469ba..32fbab46a0 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -432,6 +432,7 @@ typedef struct FuncCall
     bool        agg_distinct;    /* arguments were labeled DISTINCT */
     bool        func_variadic;    /* last argument was labeled VARIADIC */
     CoercionForm funcformat;    /* how to display this node */
+    NullTreatment null_treatment;    /* RESPECT_NULLS or IGNORE NULLS */
     int            location;        /* token location, or -1 if unknown */
 } FuncCall;
 
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 221b5e6218..545b5e5ac8 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -532,6 +532,13 @@ typedef struct GroupingFunc
     int            location;
 } GroupingFunc;
 
+typedef enum NullTreatment
+{
+    NULL_TREATMENT_NOT_SET = 0,
+    RESPECT_NULLS,
+    IGNORE_NULLS
+} NullTreatment;
+
 /*
  * WindowFunc
  *
@@ -559,7 +566,8 @@ typedef struct WindowFunc
     bool        winstar pg_node_attr(query_jumble_ignore);
     /* is function a simple aggregate? */
     bool        winagg pg_node_attr(query_jumble_ignore);
-    /* ignore nulls */
+    /* null treatement */
+    NullTreatment null_treatment pg_node_attr(query_jumble_ignore);
     bool        ignore_nulls;
     /* token location, or -1 if unknown */
     int            location;
diff --git a/src/include/windowapi.h b/src/include/windowapi.h
index b8c2c565d1..8a50478ee9 100644
--- a/src/include/windowapi.h
+++ b/src/include/windowapi.h
@@ -61,4 +61,6 @@ extern Datum WinGetFuncArgInFrame(WindowObject winobj, int argno,
 extern Datum WinGetFuncArgCurrent(WindowObject winobj, int argno,
                                   bool *isnull);
 
+extern void    ErrorOutNullTreatment(WindowObject winobj);
+
 #endif                            /* WINDOWAPI_H */

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Oliver Ford
Дата:


On Sat, 6 May 2023, 04:57 Tatsuo Ishii, <ishii@sraoss.co.jp> wrote:
Attached is the patch to implement this (on top of your patch).

test=# SELECT row_number() RESPECT NULLS OVER () FROM (SELECT 1) AS s;
ERROR:  window function row_number cannot have RESPECT NULLS or IGNORE NULLS

The last time this was discussed (https://www.postgresql.org/message-id/1037735.1610402426%40sss.pgh.pa.us) it was suggested to make the feature generalizable, beyond what the standard says it should be limited to.

With it generalizable, there would need to be extra checks for custom functions, such as if they allow multiple column arguments (which I'll add in v2 of the patch if the design's accepted).

So I think we need a consensus on whether to stick to limiting it to several specific functions, or making it generalized yet agreeing the rules to limit it (such as no agg functions, and no functions with multiple column arguments).


Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> The last time this was discussed (
> https://www.postgresql.org/message-id/1037735.1610402426%40sss.pgh.pa.us)
> it was suggested to make the feature generalizable, beyond what the
> standard says it should be limited to.

I have read the mail. In my understanding nobody said that standard
window functions should all accept the null treatment clause.

Also Tom said:
https://www.postgresql.org/message-id/5567.1537884439%40sss.pgh.pa.us
> The questions of how we interface to the individual window functions
> are really independent of how we handle the parsing problem.  My
> first inclination is to just pass the flags down to the window functions
> (store them in WindowObject and provide some additional inquiry functions
> in windowapi.h) and let them deal with it.

As I said before I totally agree with this. With my patch if a
(custom) window function does not want to accept null treatment
clause, it just calls ErrorOutNullTreatment(). It will raise an error
if IGNORE NULLS or RESPECT NULLS is provided. If it does call the
function, it is up to the function how to deal with the null
treatment. In another word, the infrastructure does not have fixed
rules to allow/disallow null treatment clause for each window
function. It's "delegated" to each window function.

Anyway we can change the rule for other than nth_value etc. later
easily once my patch is brought in.

> With it generalizable, there would need to be extra checks for custom
> functions, such as if they allow multiple column arguments (which I'll add
> in v2 of the patch if the design's accepted).

I am not sure if allowing-multiple-column-arguments patch should be
provided with null-treatment patch.

> So I think we need a consensus on whether to stick to limiting it to
> several specific functions, or making it generalized yet agreeing the rules
> to limit it (such as no agg functions, and no functions with multiple
> column arguments).

Let's see the discussion...

Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Vik Fearing
Дата:
On 9/7/24 22:25, Oliver Ford wrote:
> On Sat, May 6, 2023 at 9:41 AM Oliver Ford <ojford@gmail.com> wrote:
>>
>>
>>
>> On Sat, 6 May 2023, 04:57 Tatsuo Ishii, <ishii@sraoss.co.jp> wrote:
>>>
>>> Attached is the patch to implement this (on top of your patch).
>>>
>>> test=# SELECT row_number() RESPECT NULLS OVER () FROM (SELECT 1) AS s;
>>> ERROR:  window function row_number cannot have RESPECT NULLS or IGNORE NULLS
>>
>>
>> The last time this was discussed (https://www.postgresql.org/message-id/1037735.1610402426%40sss.pgh.pa.us) it was
suggestedto make the feature generalizable, beyond what the standard says it should be limited to.
 
>>
>> With it generalizable, there would need to be extra checks for custom functions, such as if they allow multiple
columnarguments (which I'll add in v2 of the patch if the design's accepted).
 
>>
>> So I think we need a consensus on whether to stick to limiting it to several specific functions, or making it
generalizedyet agreeing the rules to limit it (such as no agg functions, and no functions with multiple column
arguments).
> 
> Reviving this thread, I've attached a rebased patch with code, docs,
> and tests and added it to November commitfest.

Excellent!  One of these days we'll get this in. :-)

I have a problem with this test, though:

     SELECT sum(orbit) RESPECT NULLS OVER () FROM planets; -- succeeds

Why should that succeed?  Especially since aggregates such as SUM() will 
ignore nulls!  The error message on its partner seems to confirm this:

     SELECT sum(orbit) IGNORE NULLS OVER () FROM planets; -- fails
     ERROR:  aggregate functions do not accept RESPECT/IGNORE NULLS

I believe they should both fail.
-- 
Vik Fearing




Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
>> >> On Sat, 6 May 2023, 04:57 Tatsuo Ishii, <ishii@sraoss.co.jp> wrote:
>> >>>
>> >>> Attached is the patch to implement this (on top of your patch).
>> >>>
>> >>> test=# SELECT row_number() RESPECT NULLS OVER () FROM (SELECT 1) AS s;
>> >>> ERROR:  window function row_number cannot have RESPECT NULLS or IGNORE NULLS
>> >>
>> >>
>> >> The last time this was discussed (https://www.postgresql.org/message-id/1037735.1610402426%40sss.pgh.pa.us) it
wassuggested to make the feature generalizable, beyond what the standard says it should be limited to.
 
>> >>
>> >> With it generalizable, there would need to be extra checks for custom functions, such as if they allow multiple
columnarguments (which I'll add in v2 of the patch if the design's accepted).
 
>> >>
>> >> So I think we need a consensus on whether to stick to limiting it to several specific functions, or making it
generalizedyet agreeing the rules to limit it (such as no agg functions, and no functions with multiple column
arguments).

It seems you allow to use IGNORE NULLS for all window functions. If
the case, you should explicitely stat that in the docs. Otherwise
users will be confused because;

1) The SQL standard says IGNORE NULLS only for lead, lag, first_value,
last_value and nth_value.

2) Some window function returns same rows with IGNORE NULLS/RESPECT
NULLS. Consider following case.

test=# create table t1(i int);
CREATE TABLE
test=# insert into t1 values(NULL),(NULL);
INSERT 0 2
test=# select * from t1;
 i 
---
  
  
(2 rows)

test=# SELECT row_number() IGNORE NULLS OVER w FROM t1 WINDOW w AS (ORDER BY i);
 row_number 
------------
          1
          2
(2 rows)

The t1 table only contains NULL rows. By using IGNORE NULLS, I think
it's no wonder that a user expects 0 rows returned, if there's no
mention in the docs that actually IGNORE NULLS/RESPECT NULLS are just
ignored in some window functions.

Instead I think it's better that other than lead, lag, first_value,
last_value and nth_value each window function errors out if IGNORE
NULLS/RESPECT NULL are passed to these window functions.

I take a look at the patch and noticed that following functions have
no comments on what they are doing and what are the arguments. Please
look into other functions in nodeWindowAgg.c and add appropriate
comments to those functions.

+static void increment_notnulls(WindowObject winobj, int64 pos)

+static Datum ignorenulls_getfuncarginpartition(WindowObject winobj, int argno,
+                        int relpos, int seektype, bool set_mark, bool *isnull, bool *isout) {

+static Datum ignorenulls_getfuncarginframe(WindowObject winobj, int argno,
+                        int relpos, int seektype, bool set_mark, bool *isnull, bool *isout) {

Also the coding style does not fit into our coding standard. They should be written something like:

static void
increment_notnulls(WindowObject winobj, int64 pos)

static Datum
ignorenulls_getfuncarginpartition(WindowObject winobj, int argno,
                        int relpos, int seektype, bool set_mark, bool *isnull, bool *isout)
{

static Datum
ignorenulls_getfuncarginframe(WindowObject winobj, int argno,
                        int relpos, int seektype, bool set_mark, bool *isnull, bool *isout)
{

See also:
https://www.postgresql.org/docs/current/source-format.html

+    int            ignore_nulls;    /* ignore nulls */

You should add more comment here. I.e. what values are possible for
ignore_nulls.

I also notice that you have an array in memory which records non-null
row positions in a partition. The position is represented in int64,
which means 1 entry consumes 8 bytes. If my understanding is correct,
the array continues to grow up to the partition size. Also the array
is created for each window function (is it really necessary?). I worry
about this because it might consume excessive memory for big
partitions.

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
"David G. Johnston"
Дата:
On Wednesday, September 11, 2024, Tatsuo Ishii <ishii@postgresql.org> wrote:

test=# SELECT row_number() IGNORE NULLS OVER w FROM t1 WINDOW w AS (ORDER BY i);
 row_number
------------
          1
          2
(2 rows)

The t1 table only contains NULL rows. By using IGNORE NULLS, I think
it's no wonder that a user expects 0 rows returned, if there's no
mention in the docs that actually IGNORE NULLS/RESPECT NULLS are just
ignored in some window functions.

My nieve understanding of the nulls treatment is computations are affected, therefore a zero-argument function is incapable of abiding by this clause (it should error…).  Your claim that this should somehow produce zero rows confuses me on two fronts.  One, window function should be incapable of affecting how many rows are returned.  The query must output two rows regardless of the result of the window expression (it should at worse produce the null value).  Two, to produce said null value you have to be ignoring the row due to the order by clause seeing a null.  But the order by isn’t part of the computation.

David J.

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> On Wednesday, September 11, 2024, Tatsuo Ishii <ishii@postgresql.org> wrote:
> 
>>
>> test=# SELECT row_number() IGNORE NULLS OVER w FROM t1 WINDOW w AS (ORDER
>> BY i);
>>  row_number
>> ------------
>>           1
>>           2
>> (2 rows)
>>
>> The t1 table only contains NULL rows. By using IGNORE NULLS, I think
>> it's no wonder that a user expects 0 rows returned, if there's no
>> mention in the docs that actually IGNORE NULLS/RESPECT NULLS are just
>> ignored in some window functions.
>>
> 
> My nieve understanding of the nulls treatment is computations are affected,
> therefore a zero-argument function is incapable of abiding by this clause
> (it should error…).

Yes. I actually claimed that row_number() should error out if the
clause is provided.

> Instead I think it's better that other than lead, lag, first_value,
> last_value and nth_value each window function errors out if IGNORE
> NULLS/RESPECT NULL are passed to these window functions.

> Your claim that this should somehow produce zero rows
> confuses me on two fronts.  One, window function should be incapable of
> affecting how many rows are returned.  The query must output two rows
> regardless of the result of the window expression (it should at worse
> produce the null value).  Two, to produce said null value you have to be
> ignoring the row due to the order by clause seeing a null.  But the order
> by isn’t part of the computation.

Well I did not claim that. I just gave a possible example what users
could misunderstand. Probably my example was not so good.

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
Thanks for updating the patch.

>> It seems you allow to use IGNORE NULLS for all window functions. If
>> the case, you should explicitely stat that in the docs. Otherwise
>> users will be confused because;
> 
> The latest version restricts it to lag, lead, first_value, last_value,
> and nth_value. We can extend it in a subsequent patch if there's
> demand?

The restriction is required by the SQL standard. So I don't think we
need to extend to other window functions.

>> I take a look at the patch and noticed that following functions have
>> no comments on what they are doing and what are the arguments. Please
>> look into other functions in nodeWindowAgg.c and add appropriate
>> comments to those functions.
> 
> Latest version has more comments and should be in the standard coding style.

Still I see non standard coding stiles and indentations. See attached
patch for nodeWindowAgg.c, which is fixed by pgindent, for
example. (Other files may need fixing too).

>> I also notice that you have an array in memory which records non-null
>> row positions in a partition. The position is represented in int64,
>> which means 1 entry consumes 8 bytes. If my understanding is correct,
>> the array continues to grow up to the partition size. Also the array
>> is created for each window function (is it really necessary?). I worry
>> about this because it might consume excessive memory for big
>> partitions.
> 
> It's an int64 because it stores the abs_pos/mark_pos which are int64.
> Keeping an array for each function is needed for the mark optimization
> to work correctly.

Ok.

Here are some comments regarding the patch:

(1) I noticed that ignorenulls_getfuncarginframe() does not take
account EXCLUSION frame options. The code path is in
WinGetFuncArgInFrame():

            /*
             * Account for exclusion option if one is active, but advance only
             * abs_pos not mark_pos.  This prevents changes of the current
             * row's peer group from resulting in trying to fetch a row before
             * some previous mark position.
:
:

I guess ignorenulls_getfuncarginframe() was created by modifying
WinGetFuncArgInFrame() so I don't see the reason why
ignorenulls_getfuncarginframe() does not take account EXCLUSION frame
options.

(2) New member ignore_nulls are added to some structs. Its value is 0,
1 or -1. It's better to use a DEFINE for the value of ignore_nulls,
rather than 0, 1, or -1.

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index e1117857dc0..520e7e1bfcb 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2624,7 +2624,10 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
             elog(ERROR, "WindowFunc with winref %u assigned to WindowAgg with winref %u",
                  wfunc->winref, node->winref);
 
-        /* Look for a previous duplicate window function, which needs the same ignore_nulls value */
+        /*
+         * Look for a previous duplicate window function, which needs the same
+         * ignore_nulls value
+         */
         for (i = 0; i <= wfuncno; i++)
         {
             if (equal(wfunc, perfunc[i].wfunc) &&
@@ -3379,8 +3382,8 @@ increment_nonnulls(WindowObject winobj, int64 pos)
         winobj->nonnulls_size *= 2;
         winobj->win_nonnulls =
             repalloc_array(winobj->win_nonnulls,
-                            int64,
-                            winobj->nonnulls_size);
+                           int64,
+                           winobj->nonnulls_size);
     }
     winobj->win_nonnulls[winobj->nonnulls_len] = pos;
     winobj->nonnulls_len++;
@@ -3394,7 +3397,8 @@ increment_nonnulls(WindowObject winobj, int64 pos)
 static Datum
 ignorenulls_getfuncarginpartition(WindowObject winobj, int argno,
                                   int relpos, int seektype, bool set_mark,
-                                  bool *isnull, bool *isout) {
+                                  bool *isnull, bool *isout)
+{
     WindowAggState *winstate;
     ExprContext *econtext;
     TupleTableSlot *slot;
@@ -3416,27 +3420,27 @@ ignorenulls_getfuncarginpartition(WindowObject winobj, int argno,
 
     switch (seektype)
     {
-    case WINDOW_SEEK_CURRENT:
-        abs_pos = winstate->currentpos;
-        break;
-    case WINDOW_SEEK_HEAD:
-        abs_pos = 0;
-        break;
-    case WINDOW_SEEK_TAIL:
-        spool_tuples(winstate, -1);
-        abs_pos = winstate->spooled_rows - 1;
-        break;
-    default:
-        elog(ERROR, "unrecognized window seek type: %d", seektype);
-        abs_pos = 0; /* keep compiler quiet */
-        break;
+        case WINDOW_SEEK_CURRENT:
+            abs_pos = winstate->currentpos;
+            break;
+        case WINDOW_SEEK_HEAD:
+            abs_pos = 0;
+            break;
+        case WINDOW_SEEK_TAIL:
+            spool_tuples(winstate, -1);
+            abs_pos = winstate->spooled_rows - 1;
+            break;
+        default:
+            elog(ERROR, "unrecognized window seek type: %d", seektype);
+            abs_pos = 0;        /* keep compiler quiet */
+            break;
     }
 
     if (forward == -1)
         goto check_partition;
 
     /* if we're moving forward, store previous rows */
-    for (i=0; i < winobj->nonnulls_len; ++i)
+    for (i = 0; i < winobj->nonnulls_len; ++i)
     {
         if (winobj->win_nonnulls[i] > abs_pos)
         {
@@ -3448,7 +3452,7 @@ ignorenulls_getfuncarginpartition(WindowObject winobj, int argno,
                     *isout = false;
                 window_gettupleslot(winobj, abs_pos, slot);
                 econtext->ecxt_outertuple = slot;
-                return ExecEvalExpr((ExprState *)list_nth(winobj->argstates, argno),
+                return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
                                     econtext, isnull);
             }
         }
@@ -3465,13 +3469,13 @@ check_partition:
             if (isout)
                 *isout = true;
             *isnull = true;
-            return (Datum)0;
+            return (Datum) 0;
         }
 
         if (isout)
             *isout = false;
         econtext->ecxt_outertuple = slot;
-        datum = ExecEvalExpr((ExprState *)list_nth(winobj->argstates, argno),
+        datum = ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
                              econtext, isnull);
 
         if (!*isnull)
@@ -3494,7 +3498,8 @@ check_partition:
 static Datum
 ignorenulls_getfuncarginframe(WindowObject winobj, int argno,
                               int relpos, int seektype, bool set_mark,
-                              bool *isnull, bool *isout) {
+                              bool *isnull, bool *isout)
+{
     WindowAggState *winstate;
     ExprContext *econtext;
     TupleTableSlot *slot;
@@ -3511,7 +3516,7 @@ ignorenulls_getfuncarginframe(WindowObject winobj, int argno,
     winstate = winobj->winstate;
     econtext = winstate->ss.ps.ps_ExprContext;
     slot = winstate->temp_slot_1;
-    datum = (Datum)0;
+    datum = (Datum) 0;
     notnull_offset = 0;
     notnull_relpos = abs(relpos);
 
@@ -3549,70 +3554,72 @@ ignorenulls_getfuncarginframe(WindowObject winobj, int argno,
      */
     for (i = 0; i < winobj->nonnulls_len; ++i)
     {
-            int inframe;
-            if (winobj->win_nonnulls[i] < winobj->markpos)
-                continue;
-            if (!window_gettupleslot(winobj, winobj->win_nonnulls[i], slot))
-                continue;
+        int            inframe;
 
-            inframe = row_is_in_frame(winstate, winobj->win_nonnulls[i], slot);
-            if (inframe <= 0)
-            {
-                if (inframe == -1 && set_mark)
-                    WinSetMarkPosition(winobj, winobj->win_nonnulls[i]);
-                continue;
-            }
+        if (winobj->win_nonnulls[i] < winobj->markpos)
+            continue;
+        if (!window_gettupleslot(winobj, winobj->win_nonnulls[i], slot))
+            continue;
 
-            abs_pos = winobj->win_nonnulls[i] + 1;
-            ++notnull_offset;
+        inframe = row_is_in_frame(winstate, winobj->win_nonnulls[i], slot);
+        if (inframe <= 0)
+        {
+            if (inframe == -1 && set_mark)
+                WinSetMarkPosition(winobj, winobj->win_nonnulls[i]);
+            continue;
+        }
 
-            if (notnull_offset > notnull_relpos)
-            {
-                if (isout)
+        abs_pos = winobj->win_nonnulls[i] + 1;
+        ++notnull_offset;
+
+        if (notnull_offset > notnull_relpos)
+        {
+            if (isout)
                 *isout = false;
-                econtext->ecxt_outertuple = slot;
-                return ExecEvalExpr((ExprState *)list_nth(winobj->argstates, argno),
-                                    econtext, isnull);
-            }
+            econtext->ecxt_outertuple = slot;
+            return ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
+                                econtext, isnull);
+        }
     }
 
 check_frame:
     do
     {
-            int inframe;
-            if (!window_gettupleslot(winobj, abs_pos, slot))
-                goto out_of_frame;
+        int            inframe;
 
-            inframe = row_is_in_frame(winstate, abs_pos, slot);
-            if (inframe == -1)
-                goto out_of_frame;
-            else if (inframe == 0)
-                goto advance;
+        if (!window_gettupleslot(winobj, abs_pos, slot))
+            goto out_of_frame;
 
-            gottuple = window_gettupleslot(winobj, abs_pos, slot);
+        inframe = row_is_in_frame(winstate, abs_pos, slot);
+        if (inframe == -1)
+            goto out_of_frame;
+        else if (inframe == 0)
+            goto advance;
 
-            if (!gottuple)
-            {
-                if (isout)
-                    *isout = true;
-                *isnull = true;
-                return (Datum)0;
-            }
+        gottuple = window_gettupleslot(winobj, abs_pos, slot);
 
+        if (!gottuple)
+        {
             if (isout)
-                *isout = false;
-            econtext->ecxt_outertuple = slot;
-            datum = ExecEvalExpr((ExprState *)list_nth(winobj->argstates, argno),
-                                 econtext, isnull);
+                *isout = true;
+            *isnull = true;
+            return (Datum) 0;
+        }
 
-            if (!*isnull)
-            {
-                ++notnull_offset;
-                increment_nonnulls(winobj, abs_pos);
-            }
+        if (isout)
+            *isout = false;
+        econtext->ecxt_outertuple = slot;
+        datum = ExecEvalExpr((ExprState *) list_nth(winobj->argstates, argno),
+                             econtext, isnull);
+
+        if (!*isnull)
+        {
+            ++notnull_offset;
+            increment_nonnulls(winobj, abs_pos);
+        }
 
 advance:
-            abs_pos += forward;
+        abs_pos += forward;
     } while (notnull_offset <= notnull_relpos);
 
     return datum;
@@ -3660,7 +3667,7 @@ WinGetFuncArgInPartition(WindowObject winobj, int argno,
 
     if (winobj->ignore_nulls == 1 && relpos != 0)
         return ignorenulls_getfuncarginpartition(winobj, argno, relpos, seektype,
-                                                    set_mark, isnull, isout);
+                                                 set_mark, isnull, isout);
 
     switch (seektype)
     {
@@ -3752,7 +3759,7 @@ WinGetFuncArgInFrame(WindowObject winobj, int argno,
 
     if (winobj->ignore_nulls == 1)
         return ignorenulls_getfuncarginframe(winobj, argno, relpos, seektype,
-                                                set_mark, isnull, isout);
+                                             set_mark, isnull, isout);
 
     switch (seektype)
     {

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tom Lane
Дата:
Tatsuo Ishii <ishii@postgresql.org> writes:
>> The latest version restricts it to lag, lead, first_value, last_value,
>> and nth_value. We can extend it in a subsequent patch if there's
>> demand?

> The restriction is required by the SQL standard. So I don't think we
> need to extend to other window functions.

The SQL spec does not believe that user-defined window functions are a
thing.  So its opinion on this point is useless.  I would think that
IGNORE NULLS is potentially useful for user-defined window functions,
and we should not be building anything that restricts the feature to
specific functions.

            regards, tom lane



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> Tatsuo Ishii <ishii@postgresql.org> writes:
>>> The latest version restricts it to lag, lead, first_value, last_value,
>>> and nth_value. We can extend it in a subsequent patch if there's
>>> demand?
> 
>> The restriction is required by the SQL standard. So I don't think we
>> need to extend to other window functions.
> 
> The SQL spec does not believe that user-defined window functions are a
> thing.  So its opinion on this point is useless.

Of course the standard does not mention anything about the user
defined window functions and the restriction is not apply to the user
defined window functions.

> I would think that
> IGNORE NULLS is potentially useful for user-defined window functions,
> and we should not be building anything that restricts the feature to
> specific functions.

So you want to allow to use IGNORE NULLS to other built-in window
functions?
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tom Lane
Дата:
Tatsuo Ishii <ishii@postgresql.org> writes:
>> I would think that
>> IGNORE NULLS is potentially useful for user-defined window functions,
>> and we should not be building anything that restricts the feature to
>> specific functions.

> So you want to allow to use IGNORE NULLS to other built-in window
> functions?

No, there needs to be a way for the individual window function to
throw error if that's specified for a function that can't handle it.
I'm just saying I don't want that to be hard-wired in some centralized
spot.

            regards, tom lane



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
>> So you want to allow to use IGNORE NULLS to other built-in window
>> functions?
> 
> No, there needs to be a way for the individual window function to
> throw error if that's specified for a function that can't handle it.
> I'm just saying I don't want that to be hard-wired in some centralized
> spot.

I agree. That's the right direction.
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Oliver Ford
Дата:
On Mon, Jan 20, 2025 at 12:31 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Tatsuo Ishii <ishii@postgresql.org> writes:
> >> I would think that
> >> IGNORE NULLS is potentially useful for user-defined window functions,
> >> and we should not be building anything that restricts the feature to
> >> specific functions.
>
> > So you want to allow to use IGNORE NULLS to other built-in window
> > functions?
>
> No, there needs to be a way for the individual window function to
> throw error if that's specified for a function that can't handle it.
> I'm just saying I don't want that to be hard-wired in some centralized
> spot.

Would it be acceptable to add a bool column to pg_proc, say
"pronulltreatment"? It would default to false, and an error would be
thrown if the null clause is specified for a function where it's set
to false?



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
>> No, there needs to be a way for the individual window function to
>> throw error if that's specified for a function that can't handle it.
>> I'm just saying I don't want that to be hard-wired in some centralized
>> spot.
> 
> Would it be acceptable to add a bool column to pg_proc, say
> "pronulltreatment"? It would default to false, and an error would be
> thrown if the null clause is specified for a function where it's set
> to false?

It needs lots of work including modifying CREATE FUNCTION
command. Instead you could add an API to WinObject access functions to
export ignore_nulls value. Then let each window function check it. If
the window function should not take IGNORE/RESPECT NULLS option, throw
an error.

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tom Lane
Дата:
Tatsuo Ishii <ishii@postgresql.org> writes:
> It needs lots of work including modifying CREATE FUNCTION
> command. Instead you could add an API to WinObject access functions to
> export ignore_nulls value. Then let each window function check it. If
> the window function should not take IGNORE/RESPECT NULLS option, throw
> an error.

Yeah, that would be my first thought too.  The only question is
whether a function that fails to check that could crash.  If it
merely gives surprising answers, I think this way is fine.

            regards, tom lane



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> Attached version moves the setting of IGNORE_NULLS to the window
> function itself, with the functions that don't allow it erroring out.
> This is done with a new api: WinCheckAndInitializeNullTreatment.
> 
> Custom functions that don't call this will simply not have the
> IGNORE_NULLS option set as this api initializes the option and the
> array. As per the previous discussion, it should have correct
> formatting and handle the Exclusion clauses correctly.

I played with the v4 patch. It seems lead() produces incorrect result:

test=# SELECT x,y,lead(y) IGNORE NULLS OVER (ORDER BY x) FROM (VALUES(1,NULL),(2,2),(3,NULL)) AS v(x,y);
 x | y | lead 
---+---+------
 1 |   |    2
 2 | 2 |    2
 3 |   |    2
(3 rows)

I think correct result of "lead" column is 2, NULL, NULL.

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> Hello,
> I also played with the v4 patch and it produces correct result:
> test=# SELECT x,y,lead(y) IGNORE NULLS OVER (ORDER BY x) FROM
> (VALUES(1,NULL),(2,2),(3,NULL)) AS v(x,y);
>  x | y | lead
> ---+---+------
>  1 |   |    2
>  2 | 2 |
>  3 |   |
> (3 rows)
> 
> test=#
> It is from today's git, clean compile and install with only v4 patch
> applied, make check also passes without errors.

I guess you are just lucky. In my case I enabled --enable-cassert to
build PostgreSQL and it automatically turn on CLOBBER_FREED_MEMORY and
freed memory area is scrambled. If I look the patch closer, I found a
problem:

+void
+WinCheckAndInitializeNullTreatment(WindowObject winobj,
:
:
+        winobj->win_nonnulls = palloc_array(int64, 16);

WinCheckAndInitializeNullTreatment is called in each built-in window
function. Window functions are called in the per tuple memory context,
which means win_nonnulls disappears when next tuple is supplied to the
window function. If my understanding is correct, winobj->win_nonnulls
needs to survive across processing tuples.

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Krasiyan Andreev
Дата:
Hi,
I was able to reproduce exactly the problem, with clean compile and --enable-cassert:
test=# SELECT x,y,lead(y) IGNORE NULLS OVER (ORDER BY x) FROM (VALUES(1,NULL),(2,2),(3,NULL)) AS v(x,y);
 x | y | lead
---+---+------
 1 |   |    2
 2 | 2 |    2
 3 |   |    2
(3 rows)

test=#
Also, make check errors out at window test (without --enable-cassert it was passed in previous compile):
krasiyan@fedora:~/pgsql-src/postgresql$ cat /home/krasiyan/pgsql-src/postgresql/src/test/regress/regression.diffs
diff -U3 /home/krasiyan/pgsql-src/postgresql/src/test/regress/expected/window.out /home/krasiyan/pgsql-src/postgresql/src/test/regress/results/window.out
--- /home/krasiyan/pgsql-src/postgresql/src/test/regress/expected/window.out 2025-01-22 21:25:47.114508215 +0200
+++ /home/krasiyan/pgsql-src/postgresql/src/test/regress/results/window.out 2025-01-23 07:58:26.784659592 +0200
@@ -5477,12 +5477,12 @@
   name   | orbit | lead  | lead_respect | lead_ignore
 ---------+-------+-------+--------------+-------------
  earth   |       |  4332 |         4332 |        4332
- jupiter |  4332 |       |              |          88
+ jupiter |  4332 |       |              |            
  mars    |       |    88 |           88 |          88
  mercury |    88 | 60182 |        60182 |       60182
  neptune | 60182 | 90560 |        90560 |       90560
  pluto   | 90560 | 24491 |        24491 |       24491
- saturn  | 24491 |       |              |         224
+ saturn  | 24491 |       |              |            
  uranus  |       |   224 |          224 |         224
  venus   |   224 |       |              |            
  xyzzy   |       |       |              |            
@@ -5577,13 +5577,13 @@
   name   | orbit | first_value | last_value | nth_value | lead_ignore | lag_ignore
 ---------+-------+-------------+------------+-----------+-------------+------------
  earth   |       |        4332 |       4332 |           |        4332 |          
- jupiter |  4332 |          88 |         88 |           |          88 |          
- mars    |       |        4332 |      60182 |        88 |          88 |       4332
- mercury |    88 |        4332 |      90560 |     60182 |       60182 |       4332
+ jupiter |  4332 |          88 |         88 |           |       60182 |          
+ mars    |       |          88 |      60182 |     60182 |       60182 |       4332
+ mercury |    88 |        4332 |      90560 |     90560 |       90560 |       4332
  neptune | 60182 |          88 |      24491 |     90560 |       90560 |         88
- pluto   | 90560 |          88 |      24491 |     60182 |       24491 |      60182
- saturn  | 24491 |       60182 |        224 |     90560 |         224 |      90560
- uranus  |       |       90560 |        224 |     24491 |         224 |      24491
+ pluto   | 90560 |          88 |      24491 |     60182 |       60182 |      60182
+ saturn  | 24491 |       60182 |        224 |     90560 |       90560 |      90560
+ uranus  |       |       90560 |        224 |     24491 |       24491 |      24491
  venus   |   224 |       24491 |      24491 |           |             |      24491
  xyzzy   |       |         224 |        224 |           |             |        224
 (10 rows)
@@ -5646,14 +5646,14 @@
   name   | orbit | first_value | last_value | nth_value | lead_ignore | lag_ignore
 ---------+-------+-------------+------------+-----------+-------------+------------
  earth   |       |             |            |           |          88 |          
- jupiter |       |          88 |         88 |           |          88 |          
- mars    |       |          88 |      60182 |     60182 |          88 |          
+ jupiter |       |          88 |         88 |           |       60182 |          
+ mars    |       |          88 |      60182 |     60182 |       60182 |          
  mercury |    88 |          88 |      90560 |     60182 |       60182 |          
- neptune | 60182 |          88 |      24491 |     60182 |       90560 |         88
- pluto   | 90560 |          88 |      24491 |     60182 |       24491 |      60182
- saturn  | 24491 |       60182 |        224 |     90560 |         224 |      90560
- uranus  |       |       90560 |        224 |     24491 |         224 |      24491
- venus   |   224 |       24491 |        224 |       224 |             |      24491
+ neptune | 60182 |          88 |      24491 |     60182 |       60182 |         88
+ pluto   | 90560 |          88 |      24491 |     60182 |       60182 |      60182
+ saturn  | 24491 |       60182 |        224 |     90560 |       90560 |      90560
+ uranus  |       |       90560 |        224 |     24491 |       24491 |      24491
+ venus   |   224 |       24491 |        224 |       224 |         224 |      24491
  xyzzy   |       |         224 |        224 |           |             |        224
 (10 rows)
 

На чт, 23.01.2025 г. в 6:25 Tatsuo Ishii <ishii@postgresql.org> написа:
> Hello,
> I also played with the v4 patch and it produces correct result:
> test=# SELECT x,y,lead(y) IGNORE NULLS OVER (ORDER BY x) FROM
> (VALUES(1,NULL),(2,2),(3,NULL)) AS v(x,y);
>  x | y | lead
> ---+---+------
>  1 |   |    2
>  2 | 2 |
>  3 |   |
> (3 rows)
>
> test=#
> It is from today's git, clean compile and install with only v4 patch
> applied, make check also passes without errors.

I guess you are just lucky. In my case I enabled --enable-cassert to
build PostgreSQL and it automatically turn on CLOBBER_FREED_MEMORY and
freed memory area is scrambled. If I look the patch closer, I found a
problem:

+void
+WinCheckAndInitializeNullTreatment(WindowObject winobj,
:
:
+               winobj->win_nonnulls = palloc_array(int64, 16);

WinCheckAndInitializeNullTreatment is called in each built-in window
function. Window functions are called in the per tuple memory context,
which means win_nonnulls disappears when next tuple is supplied to the
window function. If my understanding is correct, winobj->win_nonnulls
needs to survive across processing tuples.

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> Hi,
> I was able to reproduce exactly the problem, with clean compile
> and --enable-cassert:
> test=# SELECT x,y,lead(y) IGNORE NULLS OVER (ORDER BY x) FROM
> (VALUES(1,NULL),(2,2),(3,NULL)) AS v(x,y);
>  x | y | lead
> ---+---+------
>  1 |   |    2
>  2 | 2 |    2
>  3 |   |    2
> (3 rows)
> 
> test=#
> Also, make check errors out at window test (without --enable-cassert it was
> passed in previous compile):

Yeah, same here.

Another possible problem is, probably the code does not work well if
there are multiple partitions. Since win_nonnulls stores currentpos in
a partition, when the partition ends, win_nonnulls needs to be
reset. Otherwise, it mistakenly represents currentpos in the previous
partition.

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> The attached patch should fix both of these. I've added extra tests
> with a PARTITION BY in the window clause to test for multiple
> partitions.

I have looked through the v5 patch. Here are review comments.

From 5268754b33103fefc511b57ec546103899f70dbe Mon Sep 17 00:00:00 2001
From: Oliver Ford <ojford@gmail.com>
Date: Thu, 23 Jan 2025 20:11:17 +0000
Subject: [PATCH] :ignore nulls

---
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 9a1acce2b5..d93a44633e 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -69,6 +69,10 @@ typedef struct WindowObjectData
     int            readptr;        /* tuplestore read pointer for this fn */
     int64        markpos;        /* row that markptr is positioned on */
     int64        seekpos;        /* row that readptr is positioned on */
+    int            ignore_nulls;    /* ignore nulls */
+    int64       *win_nonnulls;    /* tracks non-nulls in ignore nulls mode */

After ignore_nulls, there will be a 4-byte hole because win_nonnulls
is an 8-byte variable. It would be better to swap them.
 
@@ -1263,6 +1268,15 @@ begin_partition(WindowAggState *winstate)
 
             winobj->markpos = -1;
             winobj->seekpos = -1;
+
+
+            /* reallocate null check */
+            if (perfuncstate->winobj->ignore_nulls == IGNORE_NULLS)
+            {
+                perfuncstate->winobj->win_nonnulls = palloc_array(int64, 16);
+                perfuncstate->winobj->nonnulls_size = 16;

Those 2 lines above are not necessary. Since win_nonnulls are
allocated in ExecInitWindowAgg() in the per query query context, it
survives across partitions. You only need initialize nonnulls_len to
0.

+                perfuncstate->winobj->nonnulls_len = 0;
+            }
         }
     }
 
@@ -1383,7 +1397,9 @@ release_partition(WindowAggState *winstate)
 
         /* Release any partition-local state of this window function */
         if (perfuncstate->winobj)
+        {
             perfuncstate->winobj->localmem = NULL;
+        }

You accidentally added unnecessary curly braces.

@@ -2679,6 +2698,13 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
             winobj->argstates = wfuncstate->args;
             winobj->localmem = NULL;
             perfuncstate->winobj = winobj;
+            winobj->ignore_nulls = wfunc->ignore_nulls;
+            if (winobj->ignore_nulls == PARSER_IGNORE_NULLS)
+            {
+                winobj->win_nonnulls = palloc_array(int64, 16);
+                winobj->nonnulls_size = 16;
+                winobj->nonnulls_len = 0;
+            }

I don't like to see two "16" here. It would better to use #define or
something.

It will be better to declare the prototype of increment_nonnulls,
ignorenulls_getfuncarginpartition, and ignorenulls_getfuncarginframe
in the begging of the file as other static functions already do.

+/*
+ * ignorenulls_getfuncarginframe
+ * For IGNORE NULLS, get the next nonnull value in the frame, moving forward or backward
+ * until we find a value or reach the frame's end.
+ */
+static Datum
+ignorenulls_getfuncarginframe(WindowObject winobj, int argno,

Do you assume that win_nonnulls is sorted by pos? I think it's
necessarily true that pos in win_nonnulls array is sorted. Is that ok?

+    /*
+     * Store previous rows. Only possible in SEEK_HEAD mode
+     */
+    for (i = 0; i < winobj->nonnulls_len; ++i)
+    {
+        int            inframe;
+
+        if (winobj->win_nonnulls[i] < winobj->markpos)

There are too many "winobj->win_nonnulls[i]". You could assign to a
variable "winobj->win_nonnulls[i]" and use the variable.

+            continue;
+        if (!window_gettupleslot(winobj, winobj->win_nonnulls[i], slot))
+            continue;
+
+        inframe = row_is_in_frame(winstate, winobj->win_nonnulls[i], slot);
+        if (inframe <= 0)
+        {
+            if (inframe == -1 && set_mark)
+                WinSetMarkPosition(winobj, winobj->win_nonnulls[i]);

I think in most cases inframe returns 0 and WinSetMarkPosition is not
called. What use case do you have in your mind when inframe is -1?

+check_frame:
+    do
+    {
+        int            inframe;
+
+        if (!window_gettupleslot(winobj, abs_pos, slot))
+            goto out_of_frame;
+
+        inframe = row_is_in_frame(winstate, abs_pos, slot);
+        if (inframe == -1)
+            goto out_of_frame;
+        else if (inframe == 0)
+            goto advance;
+
+        gottuple = window_gettupleslot(winobj, abs_pos, slot);

Do you really need to call window_gettupleslot here? It's already
called above.

--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -576,6 +576,18 @@ typedef struct GroupingFunc
  * Collation information is irrelevant for the query jumbling, as is the
  * internal state information of the node like "winstar" and "winagg".
  */
+
+/*
+ * Null Treatment options. If specified, initially set to PARSER_IGNORE
+ * or PARSER_RESPECT. PARSER_IGNORE_NULLS is then converted to IGNORE_NULLS
+ * if the window function allows the null treatment clause.
+ */
+#define IGNORE_NULLS 4
+#define RESPECT_NULLS 3
+#define PARSER_IGNORE_NULLS 2
+#define PARSER_RESPECT_NULLS 1
+#define NO_NULLTREATMENT 0

This looks strange to me. Why do you start the define value from 4
down to 0? Also there is no place to use RESPECT_NULLS. Do we need it?

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
>> +/*
>> + * ignorenulls_getfuncarginframe
>> + * For IGNORE NULLS, get the next nonnull value in the frame, moving forward or backward
>> + * until we find a value or reach the frame's end.
>> + */
>> +static Datum
>> +ignorenulls_getfuncarginframe(WindowObject winobj, int argno,
>>
>> Do you assume that win_nonnulls is sorted by pos? I think it's
>> necessarily true that pos in win_nonnulls array is sorted. Is that ok?
> 
> Yes it must be sorted on my understanding of the code.

Then the patch has a problem. I ran a query below and examined
win_nonnulls.  It seems it was not sorted out.

SELECT
  x,y,
  nth_value(y,1) IGNORE NULLS OVER w
FROM (VALUES (1,1), (2,2), (3,NULL), (4,4), (5,NULL), (6,6), (7,7)) AS t(x,y)
WINDOW w AS (ORDER BY x ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING EXCLUDE CURRENT ROW);


(gdb) p *winobj->win_nonnulls @ winobj->nonnulls_len
$8 = {1, 0, 3, 6, 5}

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Oliver Ford
Дата:
On Tue, Jan 28, 2025 at 9:02 AM Tatsuo Ishii <ishii@postgresql.org> wrote:
>
> >> +/*
> >> + * ignorenulls_getfuncarginframe
> >> + * For IGNORE NULLS, get the next nonnull value in the frame, moving forward or backward
> >> + * until we find a value or reach the frame's end.
> >> + */
> >> +static Datum
> >> +ignorenulls_getfuncarginframe(WindowObject winobj, int argno,
> >>
> >> Do you assume that win_nonnulls is sorted by pos? I think it's
> >> necessarily true that pos in win_nonnulls array is sorted. Is that ok?
> >
> > Yes it must be sorted on my understanding of the code.
>
> Then the patch has a problem. I ran a query below and examined
> win_nonnulls.  It seems it was not sorted out.
>
> SELECT
>   x,y,
>   nth_value(y,1) IGNORE NULLS OVER w
> FROM (VALUES (1,1), (2,2), (3,NULL), (4,4), (5,NULL), (6,6), (7,7)) AS t(x,y)
> WINDOW w AS (ORDER BY x ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING EXCLUDE CURRENT ROW);
>
>
> (gdb) p *winobj->win_nonnulls @ winobj->nonnulls_len
> $8 = {1, 0, 3, 6, 5}
>

I've looked at it again and I think the code is correct, but I
miswrote that the array needs to be sorted. The above query returns:
 x | y | nth_value
---+---+-----------
 1 | 1 |         2
 2 | 2 |         1
 3 |   |         2
 4 | 4 |
 5 |   |         4
 6 | 6 |         7
 7 | 7 |         6
(7 rows)

This is correct, for values of x:

1: The first non-null value of y is at position 0, however we have
EXCLUDE CURRENT ROW so it picks the next non-null value at position 1
and stores it in the array, returning 2.
2: We can now take the first non-null value of y at position 0 and
store it in the array, returning 1.
3. We take 1 preceding, using the position stored in the array, returning 2.
4. 1 preceding and 1 following are both null, and we exclude the
current row, so returning null.
5. 1 preceding is at position 3, store it in the array, returning 4.
6. 1 preceding is null and we exclude the current row, so store
position 6 in the array, returning 7.
7. 1 preceding is at position 5, store it in the array and return 6.

It will be unordered when the EXCLUDE clause is used but the code
should handle this correctly.



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> I've looked at it again and I think the code is correct,

Good news! I will look into your explanation.

> but I
> miswrote that the array needs to be sorted. The above query returns:
>  x | y | nth_value
> ---+---+-----------
>  1 | 1 |         2
>  2 | 2 |         1
>  3 |   |         2
>  4 | 4 |
>  5 |   |         4
>  6 | 6 |         7
>  7 | 7 |         6
> (7 rows)
> 
> This is correct, for values of x:
> 
> 1: The first non-null value of y is at position 0, however we have
> EXCLUDE CURRENT ROW so it picks the next non-null value at position 1
> and stores it in the array, returning 2.
> 2: We can now take the first non-null value of y at position 0 and
> store it in the array, returning 1.
> 3. We take 1 preceding, using the position stored in the array, returning 2.
> 4. 1 preceding and 1 following are both null, and we exclude the
> current row, so returning null.
> 5. 1 preceding is at position 3, store it in the array, returning 4.
> 6. 1 preceding is null and we exclude the current row, so store
> position 6 in the array, returning 7.
> 7. 1 preceding is at position 5, store it in the array and return 6.
> 
> It will be unordered when the EXCLUDE clause is used but the code
> should handle this correctly.

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> I've looked at it again and I think the code is correct, but I
> miswrote that the array needs to be sorted. The above query returns:
>  x | y | nth_value
> ---+---+-----------
>  1 | 1 |         2
>  2 | 2 |         1
>  3 |   |         2
>  4 | 4 |
>  5 |   |         4
>  6 | 6 |         7
>  7 | 7 |         6
> (7 rows)
> 
> This is correct, for values of x:
> 
> 1: The first non-null value of y is at position 0, however we have
> EXCLUDE CURRENT ROW so it picks the next non-null value at position 1
> and stores it in the array, returning 2.
> 2: We can now take the first non-null value of y at position 0 and
> store it in the array, returning 1.
> 3. We take 1 preceding, using the position stored in the array, returning 2.
> 4. 1 preceding and 1 following are both null, and we exclude the
> current row, so returning null.
> 5. 1 preceding is at position 3, store it in the array, returning 4.
> 6. 1 preceding is null and we exclude the current row, so store
> position 6 in the array, returning 7.
> 7. 1 preceding is at position 5, store it in the array and return 6.
> 
> It will be unordered when the EXCLUDE clause is used but the code
> should handle this correctly.

I ran this query (not using IGNORE NULLS) and get a result.

SELECT
  x,
  nth_value(x,2) OVER w
FROM generate_series(1,5) g(x)
WINDOW w AS (ORDER BY x ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING EXCLUDE CURRENT ROW);
 x | nth_value 
---+-----------
 1 |         3
 2 |         3
 3 |         2
 4 |         3
 5 |         4
(5 rows)

Since there's no NULL in x column, I expected the same result using
IGNORE NULLS, but it was not:

SELECT
  x,
nth_value(x,2) IGNORE NULLS OVER w
FROM generate_series(1,5) g(x)
WINDOW w AS (ORDER BY x ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING EXCLUDE CURRENT ROW);
 x | nth_value 
---+-----------
 1 |         3
 2 |         4
 3 |         4
 4 |         3
 5 |         4
(5 rows)

I suspect the difference is in the code path of
ignorenulls_getfuncarginframe and the code path in
WinGetFuncArgInFrame, which takes care of EXCLUDE like this.

                case FRAMEOPTION_EXCLUDE_CURRENT_ROW:
                    if (abs_pos >= winstate->currentpos &&
                        winstate->currentpos >= winstate->frameheadpos)
                        abs_pos++;

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> Attached version doesn't use the nonnulls array if an Exclude is
> specified, as I think it's not going to work with exclusions (as it's
> only an optimization, this is ok and can be taken out entirely if you
> prefer). I've also added your tests above to the tests.

I applied the v7 patch and ran regression and tap test. There was no
errors. Great!

BTW, I noticed that in the code path where
ignorenulls_getfuncarginframe() is called, WinSetMarkPosition() is
never called?

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
>> BTW, I noticed that in the code path where
>> ignorenulls_getfuncarginframe() is called, WinSetMarkPosition() is
>> never called?
>>
>> Attached version uses the mark_pos at the end.

I did simple performance test against v8.

EXPLAIN ANALYZE
SELECT
  x,
nth_value(x,2) IGNORE NULLS OVER w
FROM generate_series(1,$i) g(x)
WINDOW w AS (ORDER BY x ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING);

I changed $i = 1k, 2k, 3k, 4k, 5k... 10k and got this:

Number    Time (ms)
of rows    
----------------
1000    28.977
2000    96.556
3000    212.019
4000    383.615
5000    587.05
6000    843.23
7000    1196.177
8000    1508.52
9000    1920.593
10000    2514.069

As you can see, when the number of rows = 1k, it took 28 ms. For 10k
rows, it took 2514 ms, which is 86 times slower than the 1k case. Can
we enhance this?

Graph attached.

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Oliver Ford
Дата:
On Fri, Feb 28, 2025 at 11:49 AM Tatsuo Ishii <ishii@postgresql.org> wrote:
>> BTW, I noticed that in the code path where
>> ignorenulls_getfuncarginframe() is called, WinSetMarkPosition() is
>> never called?
>>
>> Attached version uses the mark_pos at the end.

I did simple performance test against v8.

EXPLAIN ANALYZE
SELECT
  x,
nth_value(x,2) IGNORE NULLS OVER w
FROM generate_series(1,$i) g(x)
WINDOW w AS (ORDER BY x ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING);

I changed $i = 1k, 2k, 3k, 4k, 5k... 10k and got this:

Number  Time (ms)
of rows
----------------
1000    28.977
2000    96.556
3000    212.019
4000    383.615
5000    587.05
6000    843.23
7000    1196.177
8000    1508.52
9000    1920.593
10000   2514.069

As you can see, when the number of rows = 1k, it took 28 ms. For 10k
rows, it took 2514 ms, which is 86 times slower than the 1k case. Can
we enhance this?


Attached version removes the non-nulls array. That seems to speed everything up.  Running the above query with 1 million rows averages 450ms, similar when using lead/lag.
Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> Attached version removes the non-nulls array. That seems to speed
> everything up.  Running the above query with 1 million rows averages 450ms,
> similar when using lead/lag.

Great. However, CFbot complains about the patch:

https://cirrus-ci.com/task/6364194477441024

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Oliver Ford
Дата:
On Sun, Mar 9, 2025 at 6:40 AM Tatsuo Ishii <ishii@postgresql.org> wrote:
> Attached version removes the non-nulls array. That seems to speed
> everything up.  Running the above query with 1 million rows averages 450ms,
> similar when using lead/lag.

Great. However, CFbot complains about the patch:

https://cirrus-ci.com/task/6364194477441024


Attached fixes the headerscheck locally. 
Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Oliver Ford
Дата:


On Sun, 9 Mar 2025, 20:07 Oliver Ford, <ojford@gmail.com> wrote:
On Sun, Mar 9, 2025 at 6:40 AM Tatsuo Ishii <ishii@postgresql.org> wrote:
> Attached version removes the non-nulls array. That seems to speed
> everything up.  Running the above query with 1 million rows averages 450ms,
> similar when using lead/lag.

Great. However, CFbot complains about the patch:

https://cirrus-ci.com/task/6364194477441024


Attached fixes the headerscheck locally.

v11 attached because the previous version was broken by commit 8b1b342
Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Krasiyan Andreev
Дата:
Hi,
Patch applies and compiles, all included tests passed and after the latest fixes for non-nulls array, performance is near to lead/lag without support of "ignore nulls".
I have been using the last version for more than one month in a production environment with real data and didn't find any bugs, so It is ready for committer status.

На чт, 13.03.2025 г. в 9:49 Oliver Ford <ojford@gmail.com> написа:


On Sun, 9 Mar 2025, 20:07 Oliver Ford, <ojford@gmail.com> wrote:
On Sun, Mar 9, 2025 at 6:40 AM Tatsuo Ishii <ishii@postgresql.org> wrote:
> Attached version removes the non-nulls array. That seems to speed
> everything up.  Running the above query with 1 million rows averages 450ms,
> similar when using lead/lag.

Great. However, CFbot complains about the patch:

https://cirrus-ci.com/task/6364194477441024


Attached fixes the headerscheck locally.

v11 attached because the previous version was broken by commit 8b1b342

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> Hi,
> Patch applies and compiles, all included tests passed and after the latest
> fixes for non-nulls array, performance is near to lead/lag without support
> of "ignore nulls".
> I have been using the last version for more than one month in a production
> environment with real data and didn't find any bugs, so It is ready for
> committer status.

One thing I worry about the patch is, now the non-nulls array
optimization was removed.  Since then I have been thinking about if
there could be other way to optimize searching for non null rows.

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> One thing I worry about the patch is, now the non-nulls array
> optimization was removed.  Since then I have been thinking about if
> there could be other way to optimize searching for non null rows.

Here is the v12 patch to implement the optimization on top of Oliver's
v11 patch. Only src/backend/executor/nodeWindowAgg.c was modified
(especially ignorenulls_getfuncarginframe). In the patch I created
2-bit not null information array, representing following status for
each row:

UNKNOWN: the row is not determined whether it's NULL or NOT yet.
     This is the initial value.
NULL: the row has been determined to be NULL.
NOT NULL: the row has been determined to be NOT NULL.

In ignorenulls_getfuncarginframe:

For the first time window function visits a row in a frame, the row is
fetched using window_gettupleslot() and it is checked whether it is in
the frame using row_is_in_frame(). If it's in the frame and the
information in the array is UNKNOWN, ExecEvalExpr() is executed to
find out if the expression on the function argument is NULL or
not. And the result (NULL or NOT NULL) is stored in the array.

If the information in the array is not UNKNOWN, we can skip calling
ExecEvalExpr() because the information is already in the array.

Note that I do not skip calling window_gettupleslot() and
row_is_in_frame(), skip only calling ExecEvalExpr(), because whether a
row is in a frame or not could be changing as the current row position
moves while processing window functions.

With this technique I observed around 40% speed up in my environment
using the script attached, comparing with Oliver's v11 patch.

v11:
rows    duration (msec)
1000    41.019 
2000    148.957 
3000    248.291 
4000    442.478 
5000    687.395

v12:
rows    duration (msec)
1000    27.515 
2000    78.913 
3000    174.737 
4000    311.412 
5000    482.156 

The patch is now generated using the standard git format-patch.  Also
I have slightly adjusted the coding style so that it aligns with the
one used in nodeWindowAgg.c, and ran pgindent.

Note that I have not modified ignorenulls_getfuncarginpartition yet. I
think we could optimize it using the not null info infrastructure as
well. Will come up with it.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
for i in 1000 2000 3000 4000 5000
do
    echo "$i rows: "
    pos=`expr $i / 2`
    psql -a test <<EOF
\timing
explain analyze
SELECT
  x,
nth_value(x,$pos) IGNORE NULLS OVER w
FROM generate_series(1,$i) g(x)
WINDOW w AS (ORDER BY x ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING);
EOF

done | egrep "rows:|Time:" | egrep -v "Planning|Execution"|
    sed -e 's/rows: *//' -e 's/Time: //' -e 's/ms//'






Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> Here is the v12 patch to implement the optimization on top of Oliver's
> v11 patch. Only src/backend/executor/nodeWindowAgg.c was modified
> (especially ignorenulls_getfuncarginframe). In the patch I created
> 2-bit not null information array, representing following status for
> each row:
> 
> UNKNOWN: the row is not determined whether it's NULL or NOT yet.
>      This is the initial value.
> NULL: the row has been determined to be NULL.
> NOT NULL: the row has been determined to be NOT NULL.
> 
> In ignorenulls_getfuncarginframe:
> 
> For the first time window function visits a row in a frame, the row is
> fetched using window_gettupleslot() and it is checked whether it is in
> the frame using row_is_in_frame(). If it's in the frame and the
> information in the array is UNKNOWN, ExecEvalExpr() is executed to
> find out if the expression on the function argument is NULL or
> not. And the result (NULL or NOT NULL) is stored in the array.
> 
> If the information in the array is not UNKNOWN, we can skip calling
> ExecEvalExpr() because the information is already in the array.
> 
> Note that I do not skip calling window_gettupleslot() and
> row_is_in_frame(), skip only calling ExecEvalExpr(), because whether a
> row is in a frame or not could be changing as the current row position
> moves while processing window functions.
> 
> With this technique I observed around 40% speed up in my environment
> using the script attached, comparing with Oliver's v11 patch.
> 
> v11:
> rows    duration (msec)
> 1000    41.019 
> 2000    148.957 
> 3000    248.291 
> 4000    442.478 
> 5000    687.395
> 
> v12:
> rows    duration (msec)
> 1000    27.515 
> 2000    78.913 
> 3000    174.737 
> 4000    311.412 
> 5000    482.156 
> 
> The patch is now generated using the standard git format-patch.  Also
> I have slightly adjusted the coding style so that it aligns with the
> one used in nodeWindowAgg.c, and ran pgindent.
> 
> Note that I have not modified ignorenulls_getfuncarginpartition yet. I
> think we could optimize it using the not null info infrastructure as
> well. Will come up with it.

Attached is the v13 patch to address this: i.e. optimize window
functions (lead/lag) working in a partition. In summary I get 40x
speed up comparing with v12 patch. Here is the test script.

EXPLAIN ANALYZE
SELECT lead(x, 5000) IGNORE NULLS OVER ()
              FROM generate_series(1,10000) g(x);

This looks for the 5k th row in a partition including 10k rows.
The average duration of 3 trials are:

v12: 2563.665 ms
v13: 126.259 ms

So I got 40x speed up with the v13 patch. In v12, we needed to scan
10k row partition over and over again. In v13 I used the same NULL/NOT
NULL cache infrastructure created in v12, and we need to scan the
partition only once. This is the reason for the speed up.

Also I removed ignorenulls_getfuncarginpartition(), which was the work
horse for null treatment for window functions working for partitions
in v12 patch. Basically it was a copy and modified version of
WinGetFuncArgInPartition(), thus had quite a few code duplication. In
v13, instead I modified WinGetFuncArgInPartition() so that it can
handle directly null treatment procedures.

BTW I am still not satisfied by the performance improvement for window
functions for frames, that was only 40%. I will study the code to look
for more optimization.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> BTW I am still not satisfied by the performance improvement for window
> functions for frames, that was only 40%. I will study the code to look
> for more optimization.

So I come up with more optimization for window functions working on
frames (i.e. first_value, last_value and nth_value). Attached v14
patch does it.

There are 3 major functions used here.

1) window_gettupleslot (get a row)
2) row_is_in_frame (check whether row is in frame or not)
3) ExecEvalExpr (evaluate arg on the row)

In v12 (and v13), we eliminate #3 in some cases but the saving was
only 40%. In v14, I found some cases where we don't need to call
#1. row_is_in_frame requires a row ("tuple" argument), which is
provided by #1. However row_is_in_frame actually uses the row argument
only when frame clause is "RANGE" or "GROUPS" and frame end is
"CURRENT ROW". In other cases it does not use "tuple" argument at
all. So I check the frame clause and the frame end, and if they are
not the case, I can omit #1. Plus if the not null cache for the row
has been already created, we can omit #3 as well. The optimization
contributes to the performance. I observe 2.7x (1k rows case) to 5.2x
(3k rows case) speed up when I compare the performance of v13 patch
and v14 patch using the same script (see attached).

v13:
rows    duration (msec)
1000    34.740
2000    91.169
3000    205.847
4000    356.142
5000    557.063 

v14:
rows    duration (msec)
1000    12.807
2000    21.782
3000    39.248
4000     69.123
5000     101.220

I am not sure how the case where frame clause is "RANGE" or "GROUPS"
and frame end is "CURRENT ROW" is majority of window function use
cases. If it's majority, the optimization in v14 does not help much
because v14 does not optimize the case.  However if it's not, the v14
patch is close to commitable form, I think. Comments are welcome.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
for i in 1000 2000 3000 4000 5000
do
    echo "$i rows: "
    pos=`expr $i / 2`
    psql -a test <<EOF
\timing
explain analyze
SELECT
  x,
nth_value(x,$pos) IGNORE NULLS OVER w
FROM generate_series(1,$i) g(x)
WINDOW w AS (ORDER BY x ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING);
EOF

done | egrep "rows:|Time:" | egrep -v "Planning|Execution"|
    sed -e 's/rows: *//' -e 's/Time: //' -e 's/ms//'






Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Krasiyan Andreev
Дата:
Hi,
Patch applies and compiles, all included tests passed and performance gain is really impressive. I have been using the latest versions for months with real data and didn't find any bugs, so It is definitely ready for committer status.

На пн, 30.06.2025 г. в 8:26 Tatsuo Ishii <ishii@postgresql.org> написа:
Attached is the v15 patch to fix CFbot complains.
Other than that, nothing has been changed since v14.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
Krasiyan,

> Hi,
> Patch applies and compiles, all included tests passed and performance gain
> is really impressive. I have been using the latest versions for months with
> real data and didn't find any bugs, so It is definitely ready for committer
> status.

Thanks for testing the patch. The CF status has been already set to
"ready for committer". I just changed the target version from 18 to
19.

> I am not sure how the case where frame clause is "RANGE" or "GROUPS"
> and frame end is "CURRENT ROW" is majority of window function use
> cases. If it's majority, the optimization in v14 does not help much
> because v14 does not optimize the case.  However if it's not, the v14
> patch is close to commitable form, I think. Comments are welcome.

Have you tested cases where the frame option is "RANGE" or "GROUPS"
and the frame end is "CURRENT ROW"? I am asking because in these cases
the optimization in the v14 (and v15) patches do not apply and you may
not be satisfied by the performance.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp




Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
Attached is the v16 patch.

In this patch I have changed row_is_in_frame() API from:

static int    row_is_in_frame(WindowAggState *winstate, int64 pos,
                            TupleTableSlot *slot);

to:

static int    row_is_in_frame(WindowObject winobj, int64 pos,
                            TupleTableSlot *slot, bool fetch_tuple);

The function is used to decide whether a row specified by pos is in a
frame or not. Previously we needed to always pass "slot" parameter
which is the row in question, fetched by window_gettupleslot. If
IGNORE NULLS option is not passed to window functions, this is fine
because they need to return the row anyway.

However if IGNORE NULLS specified, we need to throw away null rows
until we find the non null row requested by the caller. In reality,
not in all window frames it is required to pass a row to
row_is_in_frame: only when specific frame options are specified. For
example RANGE or GROUP options plus CURRENT ROW frame end option. In
previous patch, I explicitly checked these frame options before
calling row_is_in_frame. However I dislike the way because it's a
layer abstraction violation.

So in this patch I added "fetch_tuple" option to row_is_in_frame so
that it fetches row itself when necessary. A caller now don't need to
fetch the row to pass if fetch_tuple is false.

This way, not only we can avoid the layer violation problem, but
performance is enhanced because tuple is fetched only when it's
necessary.

Note that now the first argument of row_is_in_frame has been changed
from WindowAggState to WindowObject so that row_is_in_frame can call
window_gettupleslot inside.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
Currently the patch set include some regression test additions. I
wanted to expand the test coverage and ended up an idea: generate new
test cases from the existing window function regression test
(window.sql).

Attached "insert_ignore_nulls.sh" script reads window.sql and inserts
"ignore nulls" before "over" clause of each window functions that
accept IGNORE NULLS option. This way, although the generated test
cases do not cover the case where NULL is included, at least covers
all non NULL cases, which is better than nothing, I think.

I replaced the existing window.sql with the modified one, and ran the
regression test. Indeed the test failed because expected file is for
non IGNORE NULLS options. However, the differences should be just for
SQL statements, not the output of the SQL statements since the data
set used does not include NULLs. I did an eyeball check the diff and
the result was what I expected.

For those who are interested this test, I attached some files.

insert_ignore_nulls.sh: shell script to insert "ignore nulls"
window.sql: modified regression script by insert_ignore_nulls.sh
window.diff: diff of original window.out and modified window.out

Question is, how could we put this kind of test into core if it worth
the effort? The simplest idea is just adding the modified window.sql
to the end of existing window.sql and update window.out. Thoughts?

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
#! /bin/sh

# Script to generate test cases for IGNORE NULLS option of some window
# functions (lead, lag, first_value, last_value, and nth_value) in the
# regression test SQL (window.sql).  This script supposed to read
# window.sql from stdin and inserts "ignore null" before "over" on all
# occurrences of the window functions that accept IGNORE NULLS, then
# print it to stdout.  Since currently the data set used for the
# existing regression test (window.sql) does not include NULLs, the
# test result using the results of this script is expected be
# identical to the existing results of existing window.sql (of course
# except the SQL statements).
sed -r \
    -e "s/lead\(([^)]*)\) over/lead(\1) ignore nulls over/g" \
    -e "s/lag\(([^)]*)\) over/lag(\1) ignore nulls over/g" \
    -e "s/first_value\(([^)]*)\) over/first_value(\1) ignore nulls over/g" \
    -e "s/last_value\(([^)]*)\) over/last_value(\1) ignore nulls over/g" \
    -e "s/nth_value\(([^)]*)\) over/nth_value(\1) ignore nulls over/g"
--
-- WINDOW FUNCTIONS
--

CREATE TEMPORARY TABLE empsalary (
    depname varchar,
    empno bigint,
    salary int,
    enroll_date date
);

INSERT INTO empsalary VALUES
('develop', 10, 5200, '2007-08-01'),
('sales', 1, 5000, '2006-10-01'),
('personnel', 5, 3500, '2007-12-10'),
('sales', 4, 4800, '2007-08-08'),
('personnel', 2, 3900, '2006-12-23'),
('develop', 7, 4200, '2008-01-01'),
('develop', 9, 4500, '2008-01-01'),
('sales', 3, 4800, '2007-08-01'),
('develop', 8, 6000, '2006-10-01'),
('develop', 11, 5200, '2007-08-15');

SELECT depname, empno, salary, sum(salary) OVER (PARTITION BY depname) FROM empsalary ORDER BY depname, salary;

SELECT depname, empno, salary, rank() OVER (PARTITION BY depname ORDER BY salary) FROM empsalary;

-- with GROUP BY
SELECT four, ten, SUM(SUM(four)) OVER (PARTITION BY four), AVG(ten) FROM tenk1
GROUP BY four, ten ORDER BY four, ten;

SELECT depname, empno, salary, sum(salary) OVER w FROM empsalary WINDOW w AS (PARTITION BY depname);

SELECT depname, empno, salary, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary) ORDER BY
rank()OVER w;
 

-- empty window specification
SELECT COUNT(*) OVER () FROM tenk1 WHERE unique2 < 10;

SELECT COUNT(*) OVER w FROM tenk1 WHERE unique2 < 10 WINDOW w AS ();

-- no window operation
SELECT four FROM tenk1 WHERE FALSE WINDOW w AS (PARTITION BY ten);

-- cumulative aggregate
SELECT sum(four) OVER (PARTITION BY ten ORDER BY unique2) AS sum_1, ten, four FROM tenk1 WHERE unique2 < 10;

SELECT row_number() OVER (ORDER BY unique2) FROM tenk1 WHERE unique2 < 10;

SELECT rank() OVER (PARTITION BY four ORDER BY ten) AS rank_1, ten, four FROM tenk1 WHERE unique2 < 10;

SELECT dense_rank() OVER (PARTITION BY four ORDER BY ten), ten, four FROM tenk1 WHERE unique2 < 10;

SELECT percent_rank() OVER (PARTITION BY four ORDER BY ten), ten, four FROM tenk1 WHERE unique2 < 10;

SELECT cume_dist() OVER (PARTITION BY four ORDER BY ten), ten, four FROM tenk1 WHERE unique2 < 10;

SELECT ntile(3) OVER (ORDER BY ten, four), ten, four FROM tenk1 WHERE unique2 < 10;

SELECT ntile(NULL) OVER (ORDER BY ten, four), ten, four FROM tenk1 LIMIT 2;

SELECT lag(ten) OVER (PARTITION BY four ORDER BY ten), ten, four FROM tenk1 WHERE unique2 < 10;

SELECT lag(ten, four) OVER (PARTITION BY four ORDER BY ten), ten, four FROM tenk1 WHERE unique2 < 10;

SELECT lag(ten, four, 0) OVER (PARTITION BY four ORDER BY ten), ten, four FROM tenk1 WHERE unique2 < 10;
SELECT lag(ten, four, 0.7) OVER (PARTITION BY four ORDER BY ten), ten, four FROM tenk1 WHERE unique2 < 10 ORDER BY
four,ten;
 

SELECT lead(ten) OVER (PARTITION BY four ORDER BY ten), ten, four FROM tenk1 WHERE unique2 < 10;

SELECT lead(ten * 2, 1) OVER (PARTITION BY four ORDER BY ten), ten, four FROM tenk1 WHERE unique2 < 10;

SELECT lead(ten * 2, 1, -1) OVER (PARTITION BY four ORDER BY ten), ten, four FROM tenk1 WHERE unique2 < 10;
SELECT lead(ten * 2, 1, -1.4) OVER (PARTITION BY four ORDER BY ten), ten, four FROM tenk1 WHERE unique2 < 10 ORDER BY
four,ten;
 

SELECT first_value(ten) OVER (PARTITION BY four ORDER BY ten), ten, four FROM tenk1 WHERE unique2 < 10;

-- last_value returns the last row of the frame, which is CURRENT ROW in ORDER BY window.
SELECT last_value(four) OVER (ORDER BY ten), ten, four FROM tenk1 WHERE unique2 < 10;

SELECT last_value(ten) OVER (PARTITION BY four), ten, four FROM
    (SELECT * FROM tenk1 WHERE unique2 < 10 ORDER BY four, ten)s
    ORDER BY four, ten;

SELECT nth_value(ten, four + 1) OVER (PARTITION BY four), ten, four
    FROM (SELECT * FROM tenk1 WHERE unique2 < 10 ORDER BY four, ten)s;

SELECT ten, two, sum(hundred) AS gsum, sum(sum(hundred)) OVER (PARTITION BY two ORDER BY ten) AS wsum
FROM tenk1 GROUP BY ten, two;

SELECT count(*) OVER (PARTITION BY four), four FROM (SELECT * FROM tenk1 WHERE two = 1)s WHERE unique2 < 10;

SELECT (count(*) OVER (PARTITION BY four ORDER BY ten) +
  sum(hundred) OVER (PARTITION BY four ORDER BY ten))::varchar AS cntsum
  FROM tenk1 WHERE unique2 < 10;

-- opexpr with different windows evaluation.
SELECT * FROM(
  SELECT count(*) OVER (PARTITION BY four ORDER BY ten) +
    sum(hundred) OVER (PARTITION BY two ORDER BY ten) AS total,
    count(*) OVER (PARTITION BY four ORDER BY ten) AS fourcount,
    sum(hundred) OVER (PARTITION BY two ORDER BY ten) AS twosum
    FROM tenk1
)sub
WHERE total <> fourcount + twosum;

SELECT avg(four) OVER (PARTITION BY four ORDER BY thousand / 100) FROM tenk1 WHERE unique2 < 10;

SELECT ten, two, sum(hundred) AS gsum, sum(sum(hundred)) OVER win AS wsum
FROM tenk1 GROUP BY ten, two WINDOW win AS (PARTITION BY two ORDER BY ten);

-- more than one window with GROUP BY
SELECT sum(salary),
    row_number() OVER (ORDER BY depname),
    sum(sum(salary)) OVER (ORDER BY depname DESC)
FROM empsalary GROUP BY depname;

-- identical windows with different names
SELECT sum(salary) OVER w1, count(*) OVER w2
FROM empsalary WINDOW w1 AS (ORDER BY salary), w2 AS (ORDER BY salary);

-- subplan
SELECT lead(ten, (SELECT two FROM tenk1 WHERE s.unique2 = unique2)) OVER (PARTITION BY four ORDER BY ten)
FROM tenk1 s WHERE unique2 < 10;

-- empty table
SELECT count(*) OVER (PARTITION BY four) FROM (SELECT * FROM tenk1 WHERE FALSE)s;

-- mixture of agg/wfunc in the same window
SELECT sum(salary) OVER w, rank() OVER w FROM empsalary WINDOW w AS (PARTITION BY depname ORDER BY salary DESC);

-- strict aggs
SELECT empno, depname, salary, bonus, depadj, MIN(bonus) OVER (ORDER BY empno), MAX(depadj) OVER () FROM(
    SELECT *,
        CASE WHEN enroll_date < '2008-01-01' THEN 2008 - extract(YEAR FROM enroll_date) END * 500 AS bonus,
        CASE WHEN
            AVG(salary) OVER (PARTITION BY depname) < salary
        THEN 200 END AS depadj FROM empsalary
)s;

-- window function over ungrouped agg over empty row set (bug before 9.1)
SELECT SUM(COUNT(f1)) OVER () FROM int4_tbl WHERE f1=42;

-- window function with ORDER BY an expression involving aggregates (9.1 bug)
select ten,
  sum(unique1) + sum(unique2) as res,
  rank() over (order by sum(unique1) + sum(unique2)) as rank
from tenk1
group by ten order by ten;

-- window and aggregate with GROUP BY expression (9.2 bug)
explain (costs off)
select first_value(max(x)) over (), y
  from (select unique1 as x, ten+four as y from tenk1) ss
  group by y;

-- window functions returning pass-by-ref values from different rows
select x, lag(x, 1) ignore nulls over (order by x), lead(x, 3) ignore nulls over (order by x)
from (select x::numeric as x from generate_series(1,10) x);

-- test non-default frame specifications
SELECT four, ten,
    sum(ten) over (partition by four order by ten),
    last_value(ten) ignore nulls over (partition by four order by ten)
FROM (select distinct ten, four from tenk1) ss;

SELECT four, ten,
    sum(ten) over (partition by four order by ten range between unbounded preceding and current row),
    last_value(ten) ignore nulls over (partition by four order by ten range between unbounded preceding and current
row)
FROM (select distinct ten, four from tenk1) ss;

SELECT four, ten,
    sum(ten) over (partition by four order by ten range between unbounded preceding and unbounded following),
    last_value(ten) ignore nulls over (partition by four order by ten range between unbounded preceding and unbounded
following)
FROM (select distinct ten, four from tenk1) ss;

SELECT four, ten/4 as two,
    sum(ten/4) over (partition by four order by ten/4 range between unbounded preceding and current row),
    last_value(ten/4) ignore nulls over (partition by four order by ten/4 range between unbounded preceding and current
row)
FROM (select distinct ten, four from tenk1) ss;

SELECT four, ten/4 as two,
    sum(ten/4) over (partition by four order by ten/4 rows between unbounded preceding and current row),
    last_value(ten/4) ignore nulls over (partition by four order by ten/4 rows between unbounded preceding and current
row)
FROM (select distinct ten, four from tenk1) ss;

SELECT sum(unique1) over (order by four range between current row and unbounded following),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (rows between current row and unbounded following),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (rows between 2 preceding and 2 following),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (rows between 2 preceding and 2 following exclude no others),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (rows between 2 preceding and 2 following exclude current row),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (rows between 2 preceding and 2 following exclude group),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (rows between 2 preceding and 2 following exclude ties),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT first_value(unique1) ignore nulls over (ORDER BY four rows between current row and 2 following exclude current
row),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT first_value(unique1) ignore nulls over (ORDER BY four rows between current row and 2 following exclude group),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT first_value(unique1) ignore nulls over (ORDER BY four rows between current row and 2 following exclude ties),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT last_value(unique1) ignore nulls over (ORDER BY four rows between current row and 2 following exclude current
row),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT last_value(unique1) ignore nulls over (ORDER BY four rows between current row and 2 following exclude group),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT last_value(unique1) ignore nulls over (ORDER BY four rows between current row and 2 following exclude ties),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (rows between 2 preceding and 1 preceding),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (rows between 1 following and 3 following),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (rows between unbounded preceding and 1 following),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (w range between current row and unbounded following),
    unique1, four
FROM tenk1 WHERE unique1 < 10 WINDOW w AS (order by four);

SELECT sum(unique1) over (w range between unbounded preceding and current row exclude current row),
    unique1, four
FROM tenk1 WHERE unique1 < 10 WINDOW w AS (order by four);

SELECT sum(unique1) over (w range between unbounded preceding and current row exclude group),
    unique1, four
FROM tenk1 WHERE unique1 < 10 WINDOW w AS (order by four);

SELECT sum(unique1) over (w range between unbounded preceding and current row exclude ties),
    unique1, four
FROM tenk1 WHERE unique1 < 10 WINDOW w AS (order by four);

SELECT first_value(unique1) ignore nulls over w,
    nth_value(unique1, 2) ignore nulls over w AS nth_2,
    last_value(unique1) ignore nulls over w, unique1, four
FROM tenk1 WHERE unique1 < 10
WINDOW w AS (order by four range between current row and unbounded following);

SELECT sum(unique1) over
    (order by unique1
     rows (SELECT unique1 FROM tenk1 ORDER BY unique1 LIMIT 1) + 1 PRECEDING),
    unique1
FROM tenk1 WHERE unique1 < 10;

CREATE TEMP VIEW v_window AS
    SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following) as sum_rows
    FROM generate_series(1, 10) i;

SELECT * FROM v_window;

SELECT pg_get_viewdef('v_window');

CREATE OR REPLACE TEMP VIEW v_window AS
    SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following
    exclude current row) as sum_rows FROM generate_series(1, 10) i;

SELECT * FROM v_window;

SELECT pg_get_viewdef('v_window');

CREATE OR REPLACE TEMP VIEW v_window AS
    SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following
    exclude group) as sum_rows FROM generate_series(1, 10) i;

SELECT * FROM v_window;

SELECT pg_get_viewdef('v_window');

CREATE OR REPLACE TEMP VIEW v_window AS
    SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following
    exclude ties) as sum_rows FROM generate_series(1, 10) i;

SELECT * FROM v_window;

SELECT pg_get_viewdef('v_window');

CREATE OR REPLACE TEMP VIEW v_window AS
    SELECT i, sum(i) over (order by i rows between 1 preceding and 1 following
    exclude no others) as sum_rows FROM generate_series(1, 10) i;

SELECT * FROM v_window;

SELECT pg_get_viewdef('v_window');

CREATE OR REPLACE TEMP VIEW v_window AS
    SELECT i, sum(i) over (order by i groups between 1 preceding and 1 following) as sum_rows FROM generate_series(1,
10)i;
 

SELECT * FROM v_window;

SELECT pg_get_viewdef('v_window');

DROP VIEW v_window;

CREATE TEMP VIEW v_window AS
    SELECT i, min(i) over (order by i range between '1 day' preceding and '10 days' following) as min_i
  FROM generate_series(now(), now()+'100 days'::interval, '1 hour') i;

SELECT pg_get_viewdef('v_window');

-- RANGE offset PRECEDING/FOLLOWING tests

SELECT sum(unique1) over (order by four range between 2::int8 preceding and 1::int2 preceding),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four desc range between 2::int8 preceding and 1::int2 preceding),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four range between 2::int8 preceding and 1::int2 preceding exclude no others),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four range between 2::int8 preceding and 1::int2 preceding exclude current row),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four range between 2::int8 preceding and 1::int2 preceding exclude group),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four range between 2::int8 preceding and 1::int2 preceding exclude ties),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four range between 2::int8 preceding and 6::int2 following exclude ties),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four range between 2::int8 preceding and 6::int2 following exclude group),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (partition by four order by unique1 range between 5::int8 preceding and 6::int2 following),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (partition by four order by unique1 range between 5::int8 preceding and 6::int2 following
    exclude current row),unique1, four
FROM tenk1 WHERE unique1 < 10;

select sum(salary) over (order by enroll_date range between '1 year'::interval preceding and '1 year'::interval
following),
    salary, enroll_date from empsalary;

select sum(salary) over (order by enroll_date desc range between '1 year'::interval preceding and '1 year'::interval
following),
    salary, enroll_date from empsalary;

select sum(salary) over (order by enroll_date desc range between '1 year'::interval following and '1 year'::interval
following),
    salary, enroll_date from empsalary;

select sum(salary) over (order by enroll_date range between '1 year'::interval preceding and '1 year'::interval
following
    exclude current row), salary, enroll_date from empsalary;

select sum(salary) over (order by enroll_date range between '1 year'::interval preceding and '1 year'::interval
following
    exclude group), salary, enroll_date from empsalary;

select sum(salary) over (order by enroll_date range between '1 year'::interval preceding and '1 year'::interval
following
    exclude ties), salary, enroll_date from empsalary;

select first_value(salary) ignore nulls over(order by salary range between 1000 preceding and 1000 following),
    lead(salary) ignore nulls over(order by salary range between 1000 preceding and 1000 following),
    nth_value(salary, 1) ignore nulls over(order by salary range between 1000 preceding and 1000 following),
    salary from empsalary;

select last_value(salary) ignore nulls over(order by salary range between 1000 preceding and 1000 following),
    lag(salary) ignore nulls over(order by salary range between 1000 preceding and 1000 following),
    salary from empsalary;

select first_value(salary) ignore nulls over(order by salary range between 1000 following and 3000 following
    exclude current row),
    lead(salary) ignore nulls over(order by salary range between 1000 following and 3000 following exclude ties),
    nth_value(salary, 1) ignore nulls over(order by salary range between 1000 following and 3000 following
    exclude ties),
    salary from empsalary;

select last_value(salary) ignore nulls over(order by salary range between 1000 following and 3000 following
    exclude group),
    lag(salary) ignore nulls over(order by salary range between 1000 following and 3000 following exclude group),
    salary from empsalary;

select first_value(salary) ignore nulls over(order by enroll_date range between unbounded preceding and '1
year'::intervalfollowing
 
    exclude ties),
    last_value(salary) ignore nulls over(order by enroll_date range between unbounded preceding and '1 year'::interval
following),
    salary, enroll_date from empsalary;

select first_value(salary) ignore nulls over(order by enroll_date range between unbounded preceding and '1
year'::intervalfollowing
 
    exclude ties),
    last_value(salary) ignore nulls over(order by enroll_date range between unbounded preceding and '1 year'::interval
following
    exclude ties),
    salary, enroll_date from empsalary;

select first_value(salary) ignore nulls over(order by enroll_date range between unbounded preceding and '1
year'::intervalfollowing
 
    exclude group),
    last_value(salary) ignore nulls over(order by enroll_date range between unbounded preceding and '1 year'::interval
following
    exclude group),
    salary, enroll_date from empsalary;

select first_value(salary) ignore nulls over(order by enroll_date range between unbounded preceding and '1
year'::intervalfollowing
 
    exclude current row),
    last_value(salary) ignore nulls over(order by enroll_date range between unbounded preceding and '1 year'::interval
following
    exclude current row),
    salary, enroll_date from empsalary;

-- RANGE offset PRECEDING/FOLLOWING with null values
select x, y,
       first_value(y) ignore nulls over w,
       last_value(y) ignore nulls over w
from
  (select x, x as y from generate_series(1,5) as x
   union all select null, 42
   union all select null, 43) ss
window w as
  (order by x asc nulls first range between 2 preceding and 2 following);

select x, y,
       first_value(y) ignore nulls over w,
       last_value(y) ignore nulls over w
from
  (select x, x as y from generate_series(1,5) as x
   union all select null, 42
   union all select null, 43) ss
window w as
  (order by x asc nulls last range between 2 preceding and 2 following);

select x, y,
       first_value(y) ignore nulls over w,
       last_value(y) ignore nulls over w
from
  (select x, x as y from generate_series(1,5) as x
   union all select null, 42
   union all select null, 43) ss
window w as
  (order by x desc nulls first range between 2 preceding and 2 following);

select x, y,
       first_value(y) ignore nulls over w,
       last_value(y) ignore nulls over w
from
  (select x, x as y from generate_series(1,5) as x
   union all select null, 42
   union all select null, 43) ss
window w as
  (order by x desc nulls last range between 2 preceding and 2 following);

-- There is a syntactic ambiguity in the SQL standard.  Since
-- UNBOUNDED is a non-reserved word, it could be the name of a
-- function parameter and be used as an expression.  There is a
-- grammar hack to resolve such cases as the keyword.  The following
-- tests record this behavior.

CREATE FUNCTION unbounded_syntax_test1a(x int) RETURNS TABLE (a int, b int, c int)
LANGUAGE SQL
BEGIN ATOMIC
  SELECT sum(unique1) over (rows between x preceding and x following),
         unique1, four
  FROM tenk1 WHERE unique1 < 10;
END;

CREATE FUNCTION unbounded_syntax_test1b(x int) RETURNS TABLE (a int, b int, c int)
LANGUAGE SQL
AS $$
  SELECT sum(unique1) over (rows between x preceding and x following),
         unique1, four
  FROM tenk1 WHERE unique1 < 10;
$$;

-- These will apply the argument to the window specification inside the function.
SELECT * FROM unbounded_syntax_test1a(2);
SELECT * FROM unbounded_syntax_test1b(2);

CREATE FUNCTION unbounded_syntax_test2a(unbounded int) RETURNS TABLE (a int, b int, c int)
LANGUAGE SQL
BEGIN ATOMIC
  SELECT sum(unique1) over (rows between unbounded preceding and unbounded following),
         unique1, four
  FROM tenk1 WHERE unique1 < 10;
END;

CREATE FUNCTION unbounded_syntax_test2b(unbounded int) RETURNS TABLE (a int, b int, c int)
LANGUAGE SQL
AS $$
  SELECT sum(unique1) over (rows between unbounded preceding and unbounded following),
         unique1, four
  FROM tenk1 WHERE unique1 < 10;
$$;

-- These will not apply the argument but instead treat UNBOUNDED as a keyword.
SELECT * FROM unbounded_syntax_test2a(2);
SELECT * FROM unbounded_syntax_test2b(2);

DROP FUNCTION unbounded_syntax_test1a, unbounded_syntax_test1b,
              unbounded_syntax_test2a, unbounded_syntax_test2b;

-- Other tests with token UNBOUNDED in potentially problematic position
CREATE FUNCTION unbounded(x int) RETURNS int LANGUAGE SQL IMMUTABLE RETURN x;

SELECT sum(unique1) over (rows between 1 preceding and 1 following),
       unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (rows between unbounded(1) preceding and unbounded(1) following),
       unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (rows between unbounded.x preceding and unbounded.x following),
       unique1, four
FROM tenk1, (values (1)) as unbounded(x) WHERE unique1 < 10;

DROP FUNCTION unbounded;

-- Check overflow behavior for various integer sizes

select x, last_value(x) ignore nulls over (order by x::smallint range between current row and 2147450884 following)
from generate_series(32764, 32766) x;

select x, last_value(x) ignore nulls over (order by x::smallint desc range between current row and 2147450885
following)
from generate_series(-32766, -32764) x;

select x, last_value(x) ignore nulls over (order by x range between current row and 4 following)
from generate_series(2147483644, 2147483646) x;

select x, last_value(x) ignore nulls over (order by x desc range between current row and 5 following)
from generate_series(-2147483646, -2147483644) x;

select x, last_value(x) ignore nulls over (order by x range between current row and 4 following)
from generate_series(9223372036854775804, 9223372036854775806) x;

select x, last_value(x) ignore nulls over (order by x desc range between current row and 5 following)
from generate_series(-9223372036854775806, -9223372036854775804) x;

-- Test in_range for other numeric datatypes

create temp table numerics(
    id int,
    f_float4 float4,
    f_float8 float8,
    f_numeric numeric
);

insert into numerics values
(0, '-infinity', '-infinity', '-infinity'),
(1, -3, -3, -3),
(2, -1, -1, -1),
(3, 0, 0, 0),
(4, 1.1, 1.1, 1.1),
(5, 1.12, 1.12, 1.12),
(6, 2, 2, 2),
(7, 100, 100, 100),
(8, 'infinity', 'infinity', 'infinity'),
(9, 'NaN', 'NaN', 'NaN');

select id, f_float4, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_float4 range between
             1 preceding and 1 following);
select id, f_float4, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_float4 range between
             1 preceding and 1.1::float4 following);
select id, f_float4, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_float4 range between
             'inf' preceding and 'inf' following);
select id, f_float4, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_float4 range between
             'inf' preceding and 'inf' preceding);
select id, f_float4, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_float4 range between
             'inf' following and 'inf' following);
select id, f_float4, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_float4 range between
             1.1 preceding and 'NaN' following);  -- error, NaN disallowed

select id, f_float8, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_float8 range between
             1 preceding and 1 following);
select id, f_float8, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_float8 range between
             1 preceding and 1.1::float8 following);
select id, f_float8, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_float8 range between
             'inf' preceding and 'inf' following);
select id, f_float8, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_float8 range between
             'inf' preceding and 'inf' preceding);
select id, f_float8, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_float8 range between
             'inf' following and 'inf' following);
select id, f_float8, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_float8 range between
             1.1 preceding and 'NaN' following);  -- error, NaN disallowed

select id, f_numeric, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_numeric range between
             1 preceding and 1 following);
select id, f_numeric, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_numeric range between
             1 preceding and 1.1::numeric following);
select id, f_numeric, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_numeric range between
             1 preceding and 1.1::float8 following);  -- currently unsupported
select id, f_numeric, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_numeric range between
             'inf' preceding and 'inf' following);
select id, f_numeric, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_numeric range between
             'inf' preceding and 'inf' preceding);
select id, f_numeric, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_numeric range between
             'inf' following and 'inf' following);
select id, f_numeric, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from numerics
window w as (order by f_numeric range between
             1.1 preceding and 'NaN' following);  -- error, NaN disallowed

-- Test in_range for other datetime datatypes

create temp table datetimes(
    id int,
    f_time time,
    f_timetz timetz,
    f_interval interval,
    f_timestamptz timestamptz,
    f_timestamp timestamp
);

insert into datetimes values
(0, '10:00', '10:00 BST', '-infinity', '-infinity', '-infinity'),
(1, '11:00', '11:00 BST', '1 year', '2000-10-19 10:23:54+01', '2000-10-19 10:23:54'),
(2, '12:00', '12:00 BST', '2 years', '2001-10-19 10:23:54+01', '2001-10-19 10:23:54'),
(3, '13:00', '13:00 BST', '3 years', '2001-10-19 10:23:54+01', '2001-10-19 10:23:54'),
(4, '14:00', '14:00 BST', '4 years', '2002-10-19 10:23:54+01', '2002-10-19 10:23:54'),
(5, '15:00', '15:00 BST', '5 years', '2003-10-19 10:23:54+01', '2003-10-19 10:23:54'),
(6, '15:00', '15:00 BST', '5 years', '2004-10-19 10:23:54+01', '2004-10-19 10:23:54'),
(7, '17:00', '17:00 BST', '7 years', '2005-10-19 10:23:54+01', '2005-10-19 10:23:54'),
(8, '18:00', '18:00 BST', '8 years', '2006-10-19 10:23:54+01', '2006-10-19 10:23:54'),
(9, '19:00', '19:00 BST', '9 years', '2007-10-19 10:23:54+01', '2007-10-19 10:23:54'),
(10, '20:00', '20:00 BST', '10 years', '2008-10-19 10:23:54+01', '2008-10-19 10:23:54'),
(11, '21:00', '21:00 BST', 'infinity', 'infinity', 'infinity');

select id, f_time, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_time range between
             '70 min'::interval preceding and '2 hours'::interval following);

select id, f_time, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_time desc range between
             '70 min' preceding and '2 hours' following);

select id, f_time, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_time desc range between
             '-70 min' preceding and '2 hours' following);  -- error, negative offset disallowed

select id, f_time, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_time range between
             'infinity'::interval preceding and 'infinity'::interval following);

select id, f_time, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_time range between
             'infinity'::interval preceding and 'infinity'::interval preceding);

select id, f_time, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_time range between
             'infinity'::interval following and 'infinity'::interval following);

select id, f_time, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_time range between
             '-infinity'::interval following and
             'infinity'::interval following);  -- error, negative offset disallowed

select id, f_timetz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timetz range between
             '70 min'::interval preceding and '2 hours'::interval following);

select id, f_timetz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timetz desc range between
             '70 min' preceding and '2 hours' following);

select id, f_timetz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timetz desc range between
             '70 min' preceding and '-2 hours' following);  -- error, negative offset disallowed

select id, f_timetz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timetz range between
             'infinity'::interval preceding and 'infinity'::interval following);

select id, f_timetz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timetz range between
             'infinity'::interval preceding and 'infinity'::interval preceding);

select id, f_timetz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timetz range between
             'infinity'::interval following and 'infinity'::interval following);

select id, f_timetz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timetz range between
             'infinity'::interval following and
             '-infinity'::interval following);  -- error, negative offset disallowed

select id, f_interval, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_interval range between
             '1 year'::interval preceding and '1 year'::interval following);

select id, f_interval, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_interval desc range between
             '1 year' preceding and '1 year' following);

select id, f_interval, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_interval desc range between
             '-1 year' preceding and '1 year' following);  -- error, negative offset disallowed

select id, f_interval, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_interval range between
             'infinity'::interval preceding and 'infinity'::interval following);

select id, f_interval, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_interval range between
             'infinity'::interval preceding and 'infinity'::interval preceding);

select id, f_interval, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_interval range between
             'infinity'::interval following and 'infinity'::interval following);

select id, f_interval, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_interval range between
             '-infinity'::interval following and
             'infinity'::interval following);  -- error, negative offset disallowed

select id, f_timestamptz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timestamptz range between
             '1 year'::interval preceding and '1 year'::interval following);

select id, f_timestamptz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timestamptz desc range between
             '1 year' preceding and '1 year' following);

select id, f_timestamptz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timestamptz desc range between
             '1 year' preceding and '-1 year' following);  -- error, negative offset disallowed

select id, f_timestamptz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timestamptz range between
             'infinity'::interval preceding and 'infinity'::interval following);

select id, f_timestamptz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timestamptz range between
             'infinity'::interval preceding and 'infinity'::interval preceding);

select id, f_timestamptz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timestamptz range between
             'infinity'::interval following and 'infinity'::interval following);

select id, f_timestamptz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timestamptz range between
             '-infinity'::interval following and
             'infinity'::interval following);  -- error, negative offset disallowed

select id, f_timestamp, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timestamp range between
             '1 year'::interval preceding and '1 year'::interval following);

select id, f_timestamp, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timestamp desc range between
             '1 year' preceding and '1 year' following);

select id, f_timestamp, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timestamp desc range between
             '-1 year' preceding and '1 year' following);  -- error, negative offset disallowed

select id, f_timestamp, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timestamp range between
             'infinity'::interval preceding and 'infinity'::interval following);

select id, f_timestamp, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timestamp range between
             'infinity'::interval preceding and 'infinity'::interval preceding);

select id, f_timestamp, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timestamp range between
             'infinity'::interval following and 'infinity'::interval following);

select id, f_timestamp, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
from datetimes
window w as (order by f_timestamp range between
             '-infinity'::interval following and
             'infinity'::interval following);  -- error, negative offset disallowed

-- RANGE offset PRECEDING/FOLLOWING error cases
select sum(salary) over (order by enroll_date, salary range between '1 year'::interval preceding and '2
years'::intervalfollowing
 
    exclude ties), salary, enroll_date from empsalary;

select sum(salary) over (range between '1 year'::interval preceding and '2 years'::interval following
    exclude ties), salary, enroll_date from empsalary;

select sum(salary) over (order by depname range between '1 year'::interval preceding and '2 years'::interval following
    exclude ties), salary, enroll_date from empsalary;

select max(enroll_date) over (order by enroll_date range between 1 preceding and 2 following
    exclude ties), salary, enroll_date from empsalary;

select max(enroll_date) over (order by salary range between -1 preceding and 2 following
    exclude ties), salary, enroll_date from empsalary;

select max(enroll_date) over (order by salary range between 1 preceding and -2 following
    exclude ties), salary, enroll_date from empsalary;

select max(enroll_date) over (order by salary range between '1 year'::interval preceding and '2 years'::interval
following
    exclude ties), salary, enroll_date from empsalary;

select max(enroll_date) over (order by enroll_date range between '1 year'::interval preceding and '-2 years'::interval
following
    exclude ties), salary, enroll_date from empsalary;

-- GROUPS tests

SELECT sum(unique1) over (order by four groups between unbounded preceding and current row),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four groups between unbounded preceding and unbounded following),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four groups between current row and unbounded following),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four groups between 1 preceding and unbounded following),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four groups between 1 following and unbounded following),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four groups between unbounded preceding and 2 following),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four groups between 2 preceding and 1 preceding),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four groups between 2 preceding and 1 following),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four groups between 0 preceding and 0 following),
    unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four groups between 2 preceding and 1 following
    exclude current row), unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four groups between 2 preceding and 1 following
    exclude group), unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (order by four groups between 2 preceding and 1 following
    exclude ties), unique1, four
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (partition by ten
    order by four groups between 0 preceding and 0 following),unique1, four, ten
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (partition by ten
    order by four groups between 0 preceding and 0 following exclude current row), unique1, four, ten
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (partition by ten
    order by four groups between 0 preceding and 0 following exclude group), unique1, four, ten
FROM tenk1 WHERE unique1 < 10;

SELECT sum(unique1) over (partition by ten
    order by four groups between 0 preceding and 0 following exclude ties), unique1, four, ten
FROM tenk1 WHERE unique1 < 10;

select first_value(salary) ignore nulls over(order by enroll_date groups between 1 preceding and 1 following),
    lead(salary) ignore nulls over(order by enroll_date groups between 1 preceding and 1 following),
    nth_value(salary, 1) ignore nulls over(order by enroll_date groups between 1 preceding and 1 following),
    salary, enroll_date from empsalary;

select last_value(salary) ignore nulls over(order by enroll_date groups between 1 preceding and 1 following),
    lag(salary) ignore nulls over(order by enroll_date groups between 1 preceding and 1 following),
    salary, enroll_date from empsalary;

select first_value(salary) ignore nulls over(order by enroll_date groups between 1 following and 3 following
    exclude current row),
    lead(salary) ignore nulls over(order by enroll_date groups between 1 following and 3 following exclude ties),
    nth_value(salary, 1) ignore nulls over(order by enroll_date groups between 1 following and 3 following
    exclude ties),
    salary, enroll_date from empsalary;

select last_value(salary) ignore nulls over(order by enroll_date groups between 1 following and 3 following
    exclude group),
    lag(salary) ignore nulls over(order by enroll_date groups between 1 following and 3 following exclude group),
    salary, enroll_date from empsalary;

-- Show differences in offset interpretation between ROWS, RANGE, and GROUPS
WITH cte (x) AS (
        SELECT * FROM generate_series(1, 35, 2)
)
SELECT x, (sum(x) over w)
FROM cte
WINDOW w AS (ORDER BY x rows between 1 preceding and 1 following);

WITH cte (x) AS (
        SELECT * FROM generate_series(1, 35, 2)
)
SELECT x, (sum(x) over w)
FROM cte
WINDOW w AS (ORDER BY x range between 1 preceding and 1 following);

WITH cte (x) AS (
        SELECT * FROM generate_series(1, 35, 2)
)
SELECT x, (sum(x) over w)
FROM cte
WINDOW w AS (ORDER BY x groups between 1 preceding and 1 following);

WITH cte (x) AS (
        select 1 union all select 1 union all select 1 union all
        SELECT * FROM generate_series(5, 49, 2)
)
SELECT x, (sum(x) over w)
FROM cte
WINDOW w AS (ORDER BY x rows between 1 preceding and 1 following);

WITH cte (x) AS (
        select 1 union all select 1 union all select 1 union all
        SELECT * FROM generate_series(5, 49, 2)
)
SELECT x, (sum(x) over w)
FROM cte
WINDOW w AS (ORDER BY x range between 1 preceding and 1 following);

WITH cte (x) AS (
        select 1 union all select 1 union all select 1 union all
        SELECT * FROM generate_series(5, 49, 2)
)
SELECT x, (sum(x) over w)
FROM cte
WINDOW w AS (ORDER BY x groups between 1 preceding and 1 following);

-- with UNION
SELECT count(*) OVER (PARTITION BY four) FROM (SELECT * FROM tenk1 UNION ALL SELECT * FROM tenk2)s LIMIT 0;

-- check some degenerate cases
create temp table t1 (f1 int, f2 int8);
insert into t1 values (1,1),(1,2),(2,2);

select f1, sum(f1) over (partition by f1
                         range between 1 preceding and 1 following)
from t1 where f1 = f2;  -- error, must have order by
explain (costs off)
select f1, sum(f1) over (partition by f1 order by f2
                         range between 1 preceding and 1 following)
from t1 where f1 = f2;
select f1, sum(f1) over (partition by f1 order by f2
                         range between 1 preceding and 1 following)
from t1 where f1 = f2;
select f1, sum(f1) over (partition by f1, f1 order by f2
                         range between 2 preceding and 1 preceding)
from t1 where f1 = f2;
select f1, sum(f1) over (partition by f1, f2 order by f2
                         range between 1 following and 2 following)
from t1 where f1 = f2;

select f1, sum(f1) over (partition by f1
                         groups between 1 preceding and 1 following)
from t1 where f1 = f2;  -- error, must have order by
explain (costs off)
select f1, sum(f1) over (partition by f1 order by f2
                         groups between 1 preceding and 1 following)
from t1 where f1 = f2;
select f1, sum(f1) over (partition by f1 order by f2
                         groups between 1 preceding and 1 following)
from t1 where f1 = f2;
select f1, sum(f1) over (partition by f1, f1 order by f2
                         groups between 2 preceding and 1 preceding)
from t1 where f1 = f2;
select f1, sum(f1) over (partition by f1, f2 order by f2
                         groups between 1 following and 2 following)
from t1 where f1 = f2;

-- ordering by a non-integer constant is allowed
SELECT rank() OVER (ORDER BY length('abc'));

-- can't order by another window function
SELECT rank() OVER (ORDER BY rank() OVER (ORDER BY random()));

-- some other errors
SELECT * FROM empsalary WHERE row_number() OVER (ORDER BY salary) < 10;

SELECT * FROM empsalary INNER JOIN tenk1 ON row_number() OVER (ORDER BY salary) < 10;

SELECT rank() OVER (ORDER BY 1), count(*) FROM empsalary GROUP BY 1;

SELECT * FROM rank() OVER (ORDER BY random());

DELETE FROM empsalary WHERE (rank() OVER (ORDER BY random())) > 10;

DELETE FROM empsalary RETURNING rank() OVER (ORDER BY random());

SELECT count(*) OVER w FROM tenk1 WINDOW w AS (ORDER BY unique1), w AS (ORDER BY unique1);

SELECT rank() OVER (PARTITION BY four, ORDER BY ten) FROM tenk1;

SELECT count() OVER () FROM tenk1;

SELECT generate_series(1, 100) OVER () FROM empsalary;

SELECT ntile(0) OVER (ORDER BY ten), ten, four FROM tenk1;

SELECT nth_value(four, 0) OVER (ORDER BY ten), ten, four FROM tenk1;

-- filter

SELECT sum(salary), row_number() OVER (ORDER BY depname), sum(
    sum(salary) FILTER (WHERE enroll_date > '2007-01-01')
) FILTER (WHERE depname <> 'sales') OVER (ORDER BY depname DESC) AS "filtered_sum",
    depname
FROM empsalary GROUP BY depname;

--
-- Test SupportRequestOptimizeWindowClause's ability to de-duplicate
-- WindowClauses
--

-- Ensure WindowClause frameOptions are changed so that only a single
-- WindowAgg exists in the plan.
EXPLAIN (COSTS OFF)
SELECT
    empno,
    depname,
    row_number() OVER (PARTITION BY depname ORDER BY enroll_date) rn,
    rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
                 UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) rnk,
    dense_rank() OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
                       CURRENT ROW AND CURRENT ROW) drnk,
    ntile(10) OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
                    CURRENT ROW AND UNBOUNDED FOLLOWING) nt,
    percent_rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
                         CURRENT ROW AND UNBOUNDED FOLLOWING) pr,
    cume_dist() OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
                      CURRENT ROW AND UNBOUNDED FOLLOWING) cd
FROM empsalary;

-- Ensure WindowFuncs which cannot support their WindowClause's frameOptions
-- being changed are untouched
EXPLAIN (COSTS OFF, VERBOSE)
SELECT
    empno,
    depname,
    row_number() OVER (PARTITION BY depname ORDER BY enroll_date) rn,
    rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
                 UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) rnk,
    count(*) OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
                   CURRENT ROW AND CURRENT ROW) cnt
FROM empsalary;

-- Ensure the above query gives us the expected results
SELECT
    empno,
    depname,
    row_number() OVER (PARTITION BY depname ORDER BY enroll_date) rn,
    rank() OVER (PARTITION BY depname ORDER BY enroll_date ROWS BETWEEN
                 UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) rnk,
    count(*) OVER (PARTITION BY depname ORDER BY enroll_date RANGE BETWEEN
                   CURRENT ROW AND CURRENT ROW) cnt
FROM empsalary;

-- Test pushdown of quals into a subquery containing window functions

-- pushdown is safe because all PARTITION BY clauses include depname:
EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT depname,
          sum(salary) OVER (PARTITION BY depname) depsalary,
          min(salary) OVER (PARTITION BY depname || 'A', depname) depminsalary
   FROM empsalary) emp
WHERE depname = 'sales';

-- pushdown is unsafe because there's a PARTITION BY clause without depname:
EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT depname,
          sum(salary) OVER (PARTITION BY enroll_date) enroll_salary,
          min(salary) OVER (PARTITION BY depname) depminsalary
   FROM empsalary) emp
WHERE depname = 'sales';

-- Test window function run conditions are properly pushed down into the
-- WindowAgg
EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT empno,
          row_number() OVER (ORDER BY empno) rn
   FROM empsalary) emp
WHERE rn < 3;

-- The following 3 statements should result the same result.
SELECT * FROM
  (SELECT empno,
          row_number() OVER (ORDER BY empno) rn
   FROM empsalary) emp
WHERE rn < 3;

SELECT * FROM
  (SELECT empno,
          row_number() OVER (ORDER BY empno) rn
   FROM empsalary) emp
WHERE 3 > rn;

SELECT * FROM
  (SELECT empno,
          row_number() OVER (ORDER BY empno) rn
   FROM empsalary) emp
WHERE 2 >= rn;

-- Ensure r <= 3 is pushed down into the run condition of the window agg
EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT empno,
          salary,
          rank() OVER (ORDER BY salary DESC) r
   FROM empsalary) emp
WHERE r <= 3;

SELECT * FROM
  (SELECT empno,
          salary,
          rank() OVER (ORDER BY salary DESC) r
   FROM empsalary) emp
WHERE r <= 3;

-- Ensure dr = 1 is converted to dr <= 1 to get all rows leading up to dr = 1
EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT empno,
          salary,
          dense_rank() OVER (ORDER BY salary DESC) dr
   FROM empsalary) emp
WHERE dr = 1;

SELECT * FROM
  (SELECT empno,
          salary,
          dense_rank() OVER (ORDER BY salary DESC) dr
   FROM empsalary) emp
WHERE dr = 1;

-- Check COUNT() and COUNT(*)
EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT empno,
          salary,
          count(*) OVER (ORDER BY salary DESC) c
   FROM empsalary) emp
WHERE c <= 3;

SELECT * FROM
  (SELECT empno,
          salary,
          count(*) OVER (ORDER BY salary DESC) c
   FROM empsalary) emp
WHERE c <= 3;

EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT empno,
          salary,
          count(empno) OVER (ORDER BY salary DESC) c
   FROM empsalary) emp
WHERE c <= 3;

SELECT * FROM
  (SELECT empno,
          salary,
          count(empno) OVER (ORDER BY salary DESC) c
   FROM empsalary) emp
WHERE c <= 3;

EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT empno,
          salary,
          count(*) OVER (ORDER BY salary DESC ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) c
   FROM empsalary) emp
WHERE c >= 3;

EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT empno,
          salary,
          count(*) OVER () c
   FROM empsalary) emp
WHERE 11 <= c;

EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT empno,
          salary,
          count(*) OVER (ORDER BY salary DESC) c,
          dense_rank() OVER (ORDER BY salary DESC) dr
   FROM empsalary) emp
WHERE dr = 1;

-- Ensure we get a run condition when there's a PARTITION BY clause
EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT empno,
          depname,
          row_number() OVER (PARTITION BY depname ORDER BY empno) rn
   FROM empsalary) emp
WHERE rn < 3;

-- and ensure we get the correct results from the above plan
SELECT * FROM
  (SELECT empno,
          depname,
          row_number() OVER (PARTITION BY depname ORDER BY empno) rn
   FROM empsalary) emp
WHERE rn < 3;

-- ensure that "unused" subquery columns are not removed when the column only
-- exists in the run condition
EXPLAIN (COSTS OFF)
SELECT empno, depname FROM
  (SELECT empno,
          depname,
          row_number() OVER (PARTITION BY depname ORDER BY empno) rn
   FROM empsalary) emp
WHERE rn < 3;

-- likewise with count(empno) instead of row_number()
EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT empno,
          depname,
          salary,
          count(empno) OVER (PARTITION BY depname ORDER BY salary DESC) c
   FROM empsalary) emp
WHERE c <= 3;

-- and again, check the results are what we expect.
SELECT * FROM
  (SELECT empno,
          depname,
          salary,
          count(empno) OVER (PARTITION BY depname ORDER BY salary DESC) c
   FROM empsalary) emp
WHERE c <= 3;

-- Ensure we get the correct run condition when the window function is both
-- monotonically increasing and decreasing.
EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT empno,
          depname,
          salary,
          count(empno) OVER () c
   FROM empsalary) emp
WHERE c = 1;

-- Try another case with a WindowFunc with a byref return type
SELECT * FROM
  (SELECT row_number() OVER (PARTITION BY salary) AS rn,
          lead(depname) OVER (PARTITION BY salary) || ' Department' AS n_dep
   FROM empsalary) emp
WHERE rn < 1;

-- Some more complex cases with multiple window clauses
EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT *,
          count(salary) OVER (PARTITION BY depname || '') c1, -- w1
          row_number() OVER (PARTITION BY depname) rn, -- w2
          count(*) OVER (PARTITION BY depname) c2, -- w2
          count(*) OVER (PARTITION BY '' || depname) c3, -- w3
          ntile(2) OVER (PARTITION BY depname) nt -- w2
   FROM empsalary
) e WHERE rn <= 1 AND c1 <= 3 AND nt < 2;

-- Ensure we correctly filter out all of the run conditions from each window
SELECT * FROM
  (SELECT *,
          count(salary) OVER (PARTITION BY depname || '') c1, -- w1
          row_number() OVER (PARTITION BY depname) rn, -- w2
          count(*) OVER (PARTITION BY depname) c2, -- w2
          count(*) OVER (PARTITION BY '' || depname) c3, -- w3
          ntile(2) OVER (PARTITION BY depname) nt -- w2
   FROM empsalary
) e WHERE rn <= 1 AND c1 <= 3 AND nt < 2;

-- Ensure we remove references to reduced outer joins as nulling rels in run
-- conditions
EXPLAIN (COSTS OFF)
SELECT 1 FROM
  (SELECT ntile(e2.salary) OVER (PARTITION BY e1.depname) AS c
   FROM empsalary e1 LEFT JOIN empsalary e2 ON TRUE
   WHERE e1.empno = e2.empno) s
WHERE s.c = 1;

-- Ensure the run condition optimization is used in cases where the WindowFunc
-- has a Var from another query level
EXPLAIN (COSTS OFF)
SELECT 1 FROM
  (SELECT ntile(s1.x) OVER () AS c
   FROM (SELECT (SELECT 1) AS x) AS s1) s
WHERE s.c = 1;

-- Tests to ensure we don't push down the run condition when it's not valid to
-- do so.

-- Ensure we don't push down when the frame options show that the window
-- function is not monotonically increasing
EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT empno,
          salary,
          count(*) OVER (ORDER BY salary DESC ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) c
   FROM empsalary) emp
WHERE c <= 3;

-- Ensure we don't push down when the window function's monotonic properties
-- don't match that of the clauses.
EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT empno,
          salary,
          count(*) OVER (ORDER BY salary) c
   FROM empsalary) emp
WHERE 3 <= c;

-- Ensure we don't use a run condition when there's a volatile function in the
-- WindowFunc
EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT empno,
          salary,
          count(random()) OVER (ORDER BY empno DESC) c
   FROM empsalary) emp
WHERE c = 1;

-- Ensure we don't use a run condition when the WindowFunc contains subplans
EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT empno,
          salary,
          count((SELECT 1)) OVER (ORDER BY empno DESC) c
   FROM empsalary) emp
WHERE c = 1;

-- Test Sort node collapsing
EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT depname,
          sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
          min(salary) OVER (PARTITION BY depname, empno order by enroll_date) depminsalary
   FROM empsalary) emp
WHERE depname = 'sales';

-- Ensure that the evaluation order of the WindowAggs results in the WindowAgg
-- with the same sort order that's required by the ORDER BY is evaluated last.
EXPLAIN (COSTS OFF)
SELECT empno,
       enroll_date,
       depname,
       sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
       min(salary) OVER (PARTITION BY depname order by enroll_date) depminsalary
FROM empsalary
ORDER BY depname, empno;

-- As above, but with an adjusted ORDER BY to ensure the above plan didn't
-- perform only 2 sorts by accident.
EXPLAIN (COSTS OFF)
SELECT empno,
       enroll_date,
       depname,
       sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
       min(salary) OVER (PARTITION BY depname order by enroll_date) depminsalary
FROM empsalary
ORDER BY depname, enroll_date;

SET enable_hashagg TO off;

-- Ensure we don't get a sort for both DISTINCT and ORDER BY.  We expect the
-- sort for the DISTINCT to provide presorted input for the ORDER BY.
EXPLAIN (COSTS OFF)
SELECT DISTINCT
       empno,
       enroll_date,
       depname,
       sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
       min(salary) OVER (PARTITION BY depname order by enroll_date) depminsalary
FROM empsalary
ORDER BY depname, enroll_date;

-- As above but adjust the ORDER BY clause to help ensure the plan with the
-- minimum amount of sorting wasn't a fluke.
EXPLAIN (COSTS OFF)
SELECT DISTINCT
       empno,
       enroll_date,
       depname,
       sum(salary) OVER (PARTITION BY depname order by empno) depsalary,
       min(salary) OVER (PARTITION BY depname order by enroll_date) depminsalary
FROM empsalary
ORDER BY depname, empno;

RESET enable_hashagg;

-- Test Sort node reordering
EXPLAIN (COSTS OFF)
SELECT
  lead(1) OVER (PARTITION BY depname ORDER BY salary, enroll_date),
  lag(1) OVER (PARTITION BY depname ORDER BY salary,enroll_date,empno)
FROM empsalary;

-- Test incremental sorting
EXPLAIN (COSTS OFF)
SELECT * FROM
  (SELECT depname,
          empno,
          salary,
          enroll_date,
          row_number() OVER (PARTITION BY depname ORDER BY enroll_date) AS first_emp,
          row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC) AS last_emp
   FROM empsalary) emp
WHERE first_emp = 1 OR last_emp = 1;

SELECT * FROM
  (SELECT depname,
          empno,
          salary,
          enroll_date,
          row_number() OVER (PARTITION BY depname ORDER BY enroll_date) AS first_emp,
          row_number() OVER (PARTITION BY depname ORDER BY enroll_date DESC) AS last_emp
   FROM empsalary) emp
WHERE first_emp = 1 OR last_emp = 1;

-- cleanup
DROP TABLE empsalary;

-- test user-defined window function with named args and default args
CREATE FUNCTION nth_value_def(val anyelement, n integer = 1) RETURNS anyelement
  LANGUAGE internal WINDOW IMMUTABLE STRICT AS 'window_nth_value';

SELECT nth_value_def(n := 2, val := ten) OVER (PARTITION BY four), ten, four
  FROM (SELECT * FROM tenk1 WHERE unique2 < 10 ORDER BY four, ten) s;

SELECT nth_value_def(ten) OVER (PARTITION BY four), ten, four
  FROM (SELECT * FROM tenk1 WHERE unique2 < 10 ORDER BY four, ten) s;

--
-- Test the basic moving-aggregate machinery
--

-- create aggregates that record the series of transform calls (these are
-- intentionally not true inverses)

CREATE FUNCTION logging_sfunc_nonstrict(text, anyelement) RETURNS text AS
$$ SELECT COALESCE($1, '') || '*' || quote_nullable($2) $$
LANGUAGE SQL IMMUTABLE;

CREATE FUNCTION logging_msfunc_nonstrict(text, anyelement) RETURNS text AS
$$ SELECT COALESCE($1, '') || '+' || quote_nullable($2) $$
LANGUAGE SQL IMMUTABLE;

CREATE FUNCTION logging_minvfunc_nonstrict(text, anyelement) RETURNS text AS
$$ SELECT $1 || '-' || quote_nullable($2) $$
LANGUAGE SQL IMMUTABLE;

CREATE AGGREGATE logging_agg_nonstrict (anyelement)
(
    stype = text,
    sfunc = logging_sfunc_nonstrict,
    mstype = text,
    msfunc = logging_msfunc_nonstrict,
    minvfunc = logging_minvfunc_nonstrict
);

CREATE AGGREGATE logging_agg_nonstrict_initcond (anyelement)
(
    stype = text,
    sfunc = logging_sfunc_nonstrict,
    mstype = text,
    msfunc = logging_msfunc_nonstrict,
    minvfunc = logging_minvfunc_nonstrict,
    initcond = 'I',
    minitcond = 'MI'
);

CREATE FUNCTION logging_sfunc_strict(text, anyelement) RETURNS text AS
$$ SELECT $1 || '*' || quote_nullable($2) $$
LANGUAGE SQL STRICT IMMUTABLE;

CREATE FUNCTION logging_msfunc_strict(text, anyelement) RETURNS text AS
$$ SELECT $1 || '+' || quote_nullable($2) $$
LANGUAGE SQL STRICT IMMUTABLE;

CREATE FUNCTION logging_minvfunc_strict(text, anyelement) RETURNS text AS
$$ SELECT $1 || '-' || quote_nullable($2) $$
LANGUAGE SQL STRICT IMMUTABLE;

CREATE AGGREGATE logging_agg_strict (text)
(
    stype = text,
    sfunc = logging_sfunc_strict,
    mstype = text,
    msfunc = logging_msfunc_strict,
    minvfunc = logging_minvfunc_strict
);

CREATE AGGREGATE logging_agg_strict_initcond (anyelement)
(
    stype = text,
    sfunc = logging_sfunc_strict,
    mstype = text,
    msfunc = logging_msfunc_strict,
    minvfunc = logging_minvfunc_strict,
    initcond = 'I',
    minitcond = 'MI'
);

-- test strict and non-strict cases
SELECT
    p::text || ',' || i::text || ':' || COALESCE(v::text, 'NULL') AS row,
    logging_agg_nonstrict(v) over wnd as nstrict,
    logging_agg_nonstrict_initcond(v) over wnd as nstrict_init,
    logging_agg_strict(v::text) over wnd as strict,
    logging_agg_strict_initcond(v) over wnd as strict_init
FROM (VALUES
    (1, 1, NULL),
    (1, 2, 'a'),
    (1, 3, 'b'),
    (1, 4, NULL),
    (1, 5, NULL),
    (1, 6, 'c'),
    (2, 1, NULL),
    (2, 2, 'x'),
    (3, 1, 'z')
) AS t(p, i, v)
WINDOW wnd AS (PARTITION BY P ORDER BY i ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
ORDER BY p, i;

-- and again, but with filter
SELECT
    p::text || ',' || i::text || ':' ||
        CASE WHEN f THEN COALESCE(v::text, 'NULL') ELSE '-' END as row,
    logging_agg_nonstrict(v) filter(where f) over wnd as nstrict_filt,
    logging_agg_nonstrict_initcond(v) filter(where f) over wnd as nstrict_init_filt,
    logging_agg_strict(v::text) filter(where f) over wnd as strict_filt,
    logging_agg_strict_initcond(v) filter(where f) over wnd as strict_init_filt
FROM (VALUES
    (1, 1, true,  NULL),
    (1, 2, false, 'a'),
    (1, 3, true,  'b'),
    (1, 4, false, NULL),
    (1, 5, false, NULL),
    (1, 6, false, 'c'),
    (2, 1, false, NULL),
    (2, 2, true,  'x'),
    (3, 1, true,  'z')
) AS t(p, i, f, v)
WINDOW wnd AS (PARTITION BY p ORDER BY i ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
ORDER BY p, i;

-- test that volatile arguments disable moving-aggregate mode
SELECT
    i::text || ':' || COALESCE(v::text, 'NULL') as row,
    logging_agg_strict(v::text)
        over wnd as inverse,
    logging_agg_strict(v::text || CASE WHEN random() < 0 then '?' ELSE '' END)
        over wnd as noinverse
FROM (VALUES
    (1, 'a'),
    (2, 'b'),
    (3, 'c')
) AS t(i, v)
WINDOW wnd AS (ORDER BY i ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
ORDER BY i;

SELECT
    i::text || ':' || COALESCE(v::text, 'NULL') as row,
    logging_agg_strict(v::text) filter(where true)
        over wnd as inverse,
    logging_agg_strict(v::text) filter(where random() >= 0)
        over wnd as noinverse
FROM (VALUES
    (1, 'a'),
    (2, 'b'),
    (3, 'c')
) AS t(i, v)
WINDOW wnd AS (ORDER BY i ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
ORDER BY i;

-- test that non-overlapping windows don't use inverse transitions
SELECT
    logging_agg_strict(v::text) OVER wnd
FROM (VALUES
    (1, 'a'),
    (2, 'b'),
    (3, 'c')
) AS t(i, v)
WINDOW wnd AS (ORDER BY i ROWS BETWEEN CURRENT ROW AND CURRENT ROW)
ORDER BY i;

-- test that returning NULL from the inverse transition functions
-- restarts the aggregation from scratch. The second aggregate is supposed
-- to test cases where only some aggregates restart, the third one checks
-- that one aggregate restarting doesn't cause others to restart.

CREATE FUNCTION sum_int_randrestart_minvfunc(int4, int4) RETURNS int4 AS
$$ SELECT CASE WHEN random() < 0.2 THEN NULL ELSE $1 - $2 END $$
LANGUAGE SQL STRICT;

CREATE AGGREGATE sum_int_randomrestart (int4)
(
    stype = int4,
    sfunc = int4pl,
    mstype = int4,
    msfunc = int4pl,
    minvfunc = sum_int_randrestart_minvfunc
);

WITH
vs AS (
    SELECT i, (random() * 100)::int4 AS v
    FROM generate_series(1, 100) AS i
),
sum_following AS (
    SELECT i, SUM(v) OVER
        (ORDER BY i DESC ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS s
    FROM vs
)
SELECT DISTINCT
    sum_following.s = sum_int_randomrestart(v) OVER fwd AS eq1,
    -sum_following.s = sum_int_randomrestart(-v) OVER fwd AS eq2,
    100*3+(vs.i-1)*3 = length(logging_agg_nonstrict(''::text) OVER fwd) AS eq3
FROM vs
JOIN sum_following ON sum_following.i = vs.i
WINDOW fwd AS (
    ORDER BY vs.i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
);

--
-- Test various built-in aggregates that have moving-aggregate support
--

-- test inverse transition functions handle NULLs properly
SELECT i,AVG(v::bigint) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,1),(2,2),(3,NULL),(4,NULL)) t(i,v);

SELECT i,AVG(v::int) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,1),(2,2),(3,NULL),(4,NULL)) t(i,v);

SELECT i,AVG(v::smallint) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,1),(2,2),(3,NULL),(4,NULL)) t(i,v);

SELECT i,AVG(v::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,1.5),(2,2.5),(3,NULL),(4,NULL)) t(i,v);

SELECT i,AVG(v::interval) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,'1 sec'),(2,'2 sec'),(3,NULL),(4,NULL)) t(i,v);

-- moving aggregates over infinite intervals
SELECT  x
        ,avg(x) OVER(ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING ) as curr_next_avg
        ,avg(x) OVER(ROWS BETWEEN 1 PRECEDING AND CURRENT ROW ) as prev_curr_avg
        ,sum(x) OVER(ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING ) as curr_next_sum
        ,sum(x) OVER(ROWS BETWEEN 1 PRECEDING AND CURRENT ROW ) as prev_curr_sum
FROM (VALUES (NULL::interval),
               ('infinity'::interval),
               ('-2147483648 days -2147483648 months -9223372036854775807 usecs'), -- extreme interval value
               ('-infinity'::interval),
               ('2147483647 days 2147483647 months 9223372036854775806 usecs'), -- extreme interval value
               ('infinity'::interval),
               ('6 days'::interval),
               ('7 days'::interval),
               (NULL::interval),
               ('-infinity'::interval)) v(x);

--should fail.
SELECT x, avg(x) OVER(ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)
FROM (VALUES (NULL::interval),
               ('3 days'::interval),
               ('infinity'::timestamptz - now()),
               ('6 days'::interval),
               ('-infinity'::interval)) v(x);

--should fail.
SELECT x, sum(x) OVER(ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)
FROM (VALUES (NULL::interval),
               ('3 days'::interval),
               ('infinity'::timestamptz - now()),
               ('6 days'::interval),
               ('-infinity'::interval)) v(x);

SELECT i,SUM(v::smallint) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,1),(2,2),(3,NULL),(4,NULL)) t(i,v);

SELECT i,SUM(v::int) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,1),(2,2),(3,NULL),(4,NULL)) t(i,v);

SELECT i,SUM(v::bigint) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,1),(2,2),(3,NULL),(4,NULL)) t(i,v);

SELECT i,SUM(v::money) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,'1.10'),(2,'2.20'),(3,NULL),(4,NULL)) t(i,v);

SELECT i,SUM(v::interval) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,'1 sec'),(2,'2 sec'),(3,NULL),(4,NULL)) t(i,v);

SELECT i,SUM(v::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,1.1),(2,2.2),(3,NULL),(4,NULL)) t(i,v);

SELECT SUM(n::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,1.01),(2,2),(3,3)) v(i,n);

SELECT i,COUNT(v) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,1),(2,2),(3,NULL),(4,NULL)) t(i,v);

SELECT i,COUNT(*) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,1),(2,2),(3,NULL),(4,NULL)) t(i,v);

SELECT VAR_POP(n::bigint) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,600),(2,470),(3,170),(4,430),(5,300)) r(i,n);

SELECT VAR_POP(n::int) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,600),(2,470),(3,170),(4,430),(5,300)) r(i,n);

SELECT VAR_POP(n::smallint) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,600),(2,470),(3,170),(4,430),(5,300)) r(i,n);

SELECT VAR_POP(n::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,600),(2,470),(3,170),(4,430),(5,300)) r(i,n);

SELECT VAR_SAMP(n::bigint) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,600),(2,470),(3,170),(4,430),(5,300)) r(i,n);

SELECT VAR_SAMP(n::int) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,600),(2,470),(3,170),(4,430),(5,300)) r(i,n);

SELECT VAR_SAMP(n::smallint) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,600),(2,470),(3,170),(4,430),(5,300)) r(i,n);

SELECT VAR_SAMP(n::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,600),(2,470),(3,170),(4,430),(5,300)) r(i,n);

SELECT VARIANCE(n::bigint) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,600),(2,470),(3,170),(4,430),(5,300)) r(i,n);

SELECT VARIANCE(n::int) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,600),(2,470),(3,170),(4,430),(5,300)) r(i,n);

SELECT VARIANCE(n::smallint) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,600),(2,470),(3,170),(4,430),(5,300)) r(i,n);

SELECT VARIANCE(n::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,600),(2,470),(3,170),(4,430),(5,300)) r(i,n);

SELECT STDDEV_POP(n::bigint) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,NULL),(2,600),(3,470),(4,170),(5,430),(6,300)) r(i,n);

SELECT STDDEV_POP(n::int) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,NULL),(2,600),(3,470),(4,170),(5,430),(6,300)) r(i,n);

SELECT STDDEV_POP(n::smallint) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,NULL),(2,600),(3,470),(4,170),(5,430),(6,300)) r(i,n);

SELECT STDDEV_POP(n::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,NULL),(2,600),(3,470),(4,170),(5,430),(6,300)) r(i,n);

SELECT STDDEV_SAMP(n::bigint) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,NULL),(2,600),(3,470),(4,170),(5,430),(6,300)) r(i,n);

SELECT STDDEV_SAMP(n::int) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,NULL),(2,600),(3,470),(4,170),(5,430),(6,300)) r(i,n);

SELECT STDDEV_SAMP(n::smallint) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,NULL),(2,600),(3,470),(4,170),(5,430),(6,300)) r(i,n);

SELECT STDDEV_SAMP(n::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(1,NULL),(2,600),(3,470),(4,170),(5,430),(6,300)) r(i,n);

SELECT STDDEV(n::bigint) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(0,NULL),(1,600),(2,470),(3,170),(4,430),(5,300)) r(i,n);

SELECT STDDEV(n::int) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(0,NULL),(1,600),(2,470),(3,170),(4,430),(5,300)) r(i,n);

SELECT STDDEV(n::smallint) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(0,NULL),(1,600),(2,470),(3,170),(4,430),(5,300)) r(i,n);

SELECT STDDEV(n::numeric) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
  FROM (VALUES(0,NULL),(1,600),(2,470),(3,170),(4,430),(5,300)) r(i,n);

-- test that inverse transition functions work with various frame options
SELECT i,SUM(v::int) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND CURRENT ROW)
  FROM (VALUES(1,1),(2,2),(3,NULL),(4,NULL)) t(i,v);

SELECT i,SUM(v::int) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING)
  FROM (VALUES(1,1),(2,2),(3,NULL),(4,NULL)) t(i,v);

SELECT i,SUM(v::int) OVER (ORDER BY i ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
  FROM (VALUES(1,1),(2,2),(3,3),(4,4)) t(i,v);

-- ensure aggregate over numeric properly recovers from NaN values
SELECT a, b,
       SUM(b) OVER(ORDER BY A ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
FROM (VALUES(1,1::numeric),(2,2),(3,'NaN'),(4,3),(5,4)) t(a,b);

-- It might be tempting for someone to add an inverse trans function for
-- float and double precision. This should not be done as it can give incorrect
-- results. This test should fail if anyone ever does this without thinking too
-- hard about it.
SELECT to_char(SUM(n::float8) OVER (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING),'999999999999999999999D9')
  FROM (VALUES(1,1e20),(2,1)) n(i,n);

SELECT i, b, bool_and(b) OVER w, bool_or(b) OVER w
  FROM (VALUES (1,true), (2,true), (3,false), (4,false), (5,true)) v(i,b)
  WINDOW w AS (ORDER BY i ROWS BETWEEN CURRENT ROW AND 1 FOLLOWING);

--
-- Test WindowAgg costing takes into account the number of rows that need to
-- be fetched before the first row can be output.
--

-- Ensure we get a cheap start up plan as the WindowAgg can output the first
-- row after reading 1 row from the join.
EXPLAIN (COSTS OFF)
SELECT COUNT(*) OVER (ORDER BY t1.unique1)
FROM tenk1 t1 INNER JOIN tenk1 t2 ON t1.unique1 = t2.tenthous
LIMIT 1;

-- Ensure we get a cheap total plan.  Lack of ORDER BY in the WindowClause
-- means that all rows must be read from the join, so a cheap startup plan
-- isn't a good choice.
EXPLAIN (COSTS OFF)
SELECT COUNT(*) OVER ()
FROM tenk1 t1 INNER JOIN tenk1 t2 ON t1.unique1 = t2.tenthous
WHERE t2.two = 1
LIMIT 1;

-- Ensure we get a cheap total plan.  This time use UNBOUNDED FOLLOWING, which
-- needs to read all join rows to output the first WindowAgg row.
EXPLAIN (COSTS OFF)
SELECT COUNT(*) OVER (ORDER BY t1.unique1 ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
FROM tenk1 t1 INNER JOIN tenk1 t2 ON t1.unique1 = t2.tenthous
LIMIT 1;

-- Ensure we get a cheap total plan.  This time use 10000 FOLLOWING so we need
-- to read all join rows.
EXPLAIN (COSTS OFF)
SELECT COUNT(*) OVER (ORDER BY t1.unique1 ROWS BETWEEN UNBOUNDED PRECEDING AND 10000 FOLLOWING)
FROM tenk1 t1 INNER JOIN tenk1 t2 ON t1.unique1 = t2.tenthous
LIMIT 1;

-- Tests for problems with failure to walk or mutate expressions
-- within window frame clauses.

-- test walker (fails with collation error if expressions are not walked)
SELECT array_agg(i) OVER w
  FROM generate_series(1,5) i
WINDOW w AS (ORDER BY i ROWS BETWEEN (('foo' < 'foobar')::integer) PRECEDING AND CURRENT ROW);

-- test mutator (fails when inlined if expressions are not mutated)
CREATE FUNCTION pg_temp.f(group_size BIGINT) RETURNS SETOF integer[]
AS $$
    SELECT array_agg(s) OVER w
      FROM generate_series(1,5) s
    WINDOW w AS (ORDER BY s ROWS BETWEEN CURRENT ROW AND GROUP_SIZE FOLLOWING)
$$ LANGUAGE SQL STABLE;

EXPLAIN (costs off) SELECT * FROM pg_temp.f(2);
SELECT * FROM pg_temp.f(2);

-- IGNORE NULLS tests

CREATE TEMPORARY TABLE planets (
    name text,
    distance text,
    orbit integer
);

INSERT INTO planets VALUES
  ('mercury', 'close', 88),
  ('venus', 'close', 224),
  ('earth', 'close', NULL),
  ('mars', 'close', NULL),
  ('jupiter', 'close', 4332),
  ('saturn', 'far', 24491),
  ('uranus', 'far', NULL),
  ('neptune', 'far', 60182),
  ('pluto', 'far', 90560),
  ('xyzzy', 'far', NULL);

-- test ruleutils
CREATE VIEW planets_view AS
SELECT name,
       orbit,
       lag(orbit) OVER w AS lag,
       lag(orbit) RESPECT NULLS OVER w AS lag_respect,
       lag(orbit) IGNORE NULLS OVER w AS lag_ignore
FROM planets
WINDOW w AS (ORDER BY name)
;
SELECT pg_get_viewdef('planets_view');

-- lag
SELECT name,
       orbit,
       lag(orbit) OVER w AS lag,
       lag(orbit) RESPECT NULLS OVER w AS lag_respect,
       lag(orbit) IGNORE NULLS OVER w AS lag_ignore
FROM planets
WINDOW w AS (ORDER BY name)
;

-- lead
SELECT name,
       orbit,
       lead(orbit) OVER w AS lead,
       lead(orbit) RESPECT NULLS OVER w AS lead_respect,
       lead(orbit) IGNORE NULLS OVER w AS lead_ignore
FROM planets
WINDOW w AS (ORDER BY name)
;

-- first_value
SELECT name,
       orbit,
       first_value(orbit) RESPECT NULLS OVER w1,
       first_value(orbit) IGNORE NULLS OVER w1,
       first_value(orbit) RESPECT NULLS OVER w2,
       first_value(orbit) IGNORE NULLS OVER w2
FROM planets
WINDOW w1 AS (ORDER BY name ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING),
       w2 AS (ORDER BY name ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)
;

-- nth_value
SELECT name,
       orbit,
       nth_value(orbit, 2) RESPECT NULLS OVER w1,
       nth_value(orbit, 2) IGNORE NULLS OVER w1,
       nth_value(orbit, 2) RESPECT NULLS OVER w2,
       nth_value(orbit, 2) IGNORE NULLS OVER w2
FROM planets
WINDOW w1 AS (ORDER BY name ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING),
       w2 AS (ORDER BY name ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)
;

-- last_value
SELECT name,
       orbit,
       last_value(orbit) RESPECT NULLS OVER w1,
       last_value(orbit) IGNORE NULLS OVER w1,
       last_value(orbit) RESPECT NULLS OVER w2,
       last_value(orbit) IGNORE NULLS OVER w2
FROM planets
WINDOW w1 AS (ORDER BY name ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING),
       w2 AS (ORDER BY name ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)
;

-- exclude current row
SELECT name,
       orbit,
       first_value(orbit) IGNORE NULLS OVER w,
       last_value(orbit) IGNORE NULLS OVER w,
       nth_value(orbit, 2) IGNORE NULLS OVER w,
       lead(orbit, 1) IGNORE NULLS OVER w AS lead_ignore,
       lag(orbit, 1) IGNORE NULLS OVER w AS lag_ignore
FROM planets
WINDOW w AS (ORDER BY name ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING EXCLUDE CURRENT ROW)
;

-- valid and invalid functions
SELECT sum(orbit) OVER () FROM planets; -- succeeds
SELECT sum(orbit) RESPECT NULLS OVER () FROM planets; -- fails
SELECT sum(orbit) IGNORE NULLS OVER () FROM planets; -- fails
SELECT row_number() OVER () FROM planets; -- succeeds
SELECT row_number() RESPECT NULLS OVER () FROM planets; -- fails
SELECT row_number() IGNORE NULLS OVER () FROM planets; -- fails

-- test two consecutive nulls
update planets set orbit=null where name='jupiter';
SELECT name,
       orbit,
       first_value(orbit) IGNORE NULLS OVER w,
       last_value(orbit) IGNORE NULLS OVER w,
       nth_value(orbit, 2) IGNORE NULLS OVER w,
       lead(orbit, 1) IGNORE NULLS OVER w AS lead_ignore,
       lag(orbit, 1) IGNORE NULLS OVER w AS lag_ignore
FROM planets
WINDOW w AS (ORDER BY name ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)
;

-- test partitions
SELECT name,
       distance,
       orbit,
       first_value(orbit) IGNORE NULLS OVER w,
       last_value(orbit) IGNORE NULLS OVER w,
       nth_value(orbit, 2) IGNORE NULLS OVER w,
       lead(orbit, 1) IGNORE NULLS OVER w AS lead_ignore,
       lag(orbit, 1) IGNORE NULLS OVER w AS lag_ignore
FROM planets
WINDOW w AS (PARTITION BY distance ORDER BY name ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING)
;

-- nth_value without nulls
SELECT x,
       nth_value(x,2) IGNORE NULLS OVER w
FROM generate_series(1,5) g(x)
WINDOW w AS (ORDER BY x ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING EXCLUDE CURRENT ROW);
SELECT x,
       nth_value(x,2) IGNORE NULLS OVER w
FROM generate_series(1,5) g(x)
WINDOW w AS (ORDER BY x ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING);

--cleanup
DROP TABLE planets CASCADE;
662c662
< select x, lag(x, 1) over (order by x), lead(x, 3) over (order by x)
---
> select x, lag(x, 1) ignore nulls over (order by x), lead(x, 3) ignore nulls over (order by x)
681c681
<     last_value(ten) over (partition by four order by ten)
---
>     last_value(ten) ignore nulls over (partition by four order by ten)
709c709
<     last_value(ten) over (partition by four order by ten range between unbounded preceding and current row)
---
>     last_value(ten) ignore nulls over (partition by four order by ten range between unbounded preceding and current
row)
737c737
<     last_value(ten) over (partition by four order by ten range between unbounded preceding and unbounded following)
---
>     last_value(ten) ignore nulls over (partition by four order by ten range between unbounded preceding and unbounded
following)
765c765
<     last_value(ten/4) over (partition by four order by ten/4 range between unbounded preceding and current row)
---
>     last_value(ten/4) ignore nulls over (partition by four order by ten/4 range between unbounded preceding and
currentrow)
 
793c793
<     last_value(ten/4) over (partition by four order by ten/4 rows between unbounded preceding and current row)
---
>     last_value(ten/4) ignore nulls over (partition by four order by ten/4 rows between unbounded preceding and
currentrow)
 
938c938
< SELECT first_value(unique1) over (ORDER BY four rows between current row and 2 following exclude current row),
---
> SELECT first_value(unique1) ignore nulls over (ORDER BY four rows between current row and 2 following exclude current
row),
955c955
< SELECT first_value(unique1) over (ORDER BY four rows between current row and 2 following exclude group),
---
> SELECT first_value(unique1) ignore nulls over (ORDER BY four rows between current row and 2 following exclude
group),
972c972
< SELECT first_value(unique1) over (ORDER BY four rows between current row and 2 following exclude ties),
---
> SELECT first_value(unique1) ignore nulls over (ORDER BY four rows between current row and 2 following exclude ties),
989c989
< SELECT last_value(unique1) over (ORDER BY four rows between current row and 2 following exclude current row),
---
> SELECT last_value(unique1) ignore nulls over (ORDER BY four rows between current row and 2 following exclude current
row),
1006c1006
< SELECT last_value(unique1) over (ORDER BY four rows between current row and 2 following exclude group),
---
> SELECT last_value(unique1) ignore nulls over (ORDER BY four rows between current row and 2 following exclude group),
1023c1023
< SELECT last_value(unique1) over (ORDER BY four rows between current row and 2 following exclude ties),
---
> SELECT last_value(unique1) ignore nulls over (ORDER BY four rows between current row and 2 following exclude ties),
1159,1161c1159,1161
< SELECT first_value(unique1) over w,
<     nth_value(unique1, 2) over w AS nth_2,
<     last_value(unique1) over w, unique1, four
---
> SELECT first_value(unique1) ignore nulls over w,
>     nth_value(unique1, 2) ignore nulls over w AS nth_2,
>     last_value(unique1) ignore nulls over w, unique1, four
1631,1633c1631,1633
< select first_value(salary) over(order by salary range between 1000 preceding and 1000 following),
<     lead(salary) over(order by salary range between 1000 preceding and 1000 following),
<     nth_value(salary, 1) over(order by salary range between 1000 preceding and 1000 following),
---
> select first_value(salary) ignore nulls over(order by salary range between 1000 preceding and 1000 following),
>     lead(salary) ignore nulls over(order by salary range between 1000 preceding and 1000 following),
>     nth_value(salary, 1) ignore nulls over(order by salary range between 1000 preceding and 1000 following),
1649,1650c1649,1650
< select last_value(salary) over(order by salary range between 1000 preceding and 1000 following),
<     lag(salary) over(order by salary range between 1000 preceding and 1000 following),
---
> select last_value(salary) ignore nulls over(order by salary range between 1000 preceding and 1000 following),
>     lag(salary) ignore nulls over(order by salary range between 1000 preceding and 1000 following),
1666c1666
< select first_value(salary) over(order by salary range between 1000 following and 3000 following
---
> select first_value(salary) ignore nulls over(order by salary range between 1000 following and 3000 following
1668,1669c1668,1669
<     lead(salary) over(order by salary range between 1000 following and 3000 following exclude ties),
<     nth_value(salary, 1) over(order by salary range between 1000 following and 3000 following
---
>     lead(salary) ignore nulls over(order by salary range between 1000 following and 3000 following exclude ties),
>     nth_value(salary, 1) ignore nulls over(order by salary range between 1000 following and 3000 following
1686c1686
< select last_value(salary) over(order by salary range between 1000 following and 3000 following
---
> select last_value(salary) ignore nulls over(order by salary range between 1000 following and 3000 following
1688c1688
<     lag(salary) over(order by salary range between 1000 following and 3000 following exclude group),
---
>     lag(salary) ignore nulls over(order by salary range between 1000 following and 3000 following exclude group),
1704c1704
< select first_value(salary) over(order by enroll_date range between unbounded preceding and '1 year'::interval
following
---
> select first_value(salary) ignore nulls over(order by enroll_date range between unbounded preceding and '1
year'::intervalfollowing
 
1706c1706
<     last_value(salary) over(order by enroll_date range between unbounded preceding and '1 year'::interval
following),
---
>     last_value(salary) ignore nulls over(order by enroll_date range between unbounded preceding and '1
year'::intervalfollowing),
 
1722c1722
< select first_value(salary) over(order by enroll_date range between unbounded preceding and '1 year'::interval
following
---
> select first_value(salary) ignore nulls over(order by enroll_date range between unbounded preceding and '1
year'::intervalfollowing
 
1724c1724
<     last_value(salary) over(order by enroll_date range between unbounded preceding and '1 year'::interval following
---
>     last_value(salary) ignore nulls over(order by enroll_date range between unbounded preceding and '1
year'::intervalfollowing
 
1741c1741
< select first_value(salary) over(order by enroll_date range between unbounded preceding and '1 year'::interval
following
---
> select first_value(salary) ignore nulls over(order by enroll_date range between unbounded preceding and '1
year'::intervalfollowing
 
1743c1743
<     last_value(salary) over(order by enroll_date range between unbounded preceding and '1 year'::interval following
---
>     last_value(salary) ignore nulls over(order by enroll_date range between unbounded preceding and '1
year'::intervalfollowing
 
1760c1760
< select first_value(salary) over(order by enroll_date range between unbounded preceding and '1 year'::interval
following
---
> select first_value(salary) ignore nulls over(order by enroll_date range between unbounded preceding and '1
year'::intervalfollowing
 
1762c1762
<     last_value(salary) over(order by enroll_date range between unbounded preceding and '1 year'::interval following
---
>     last_value(salary) ignore nulls over(order by enroll_date range between unbounded preceding and '1
year'::intervalfollowing
 
1781,1782c1781,1782
<        first_value(y) over w,
<        last_value(y) over w
---
>        first_value(y) ignore nulls over w,
>        last_value(y) ignore nulls over w
1801,1802c1801,1802
<        first_value(y) over w,
<        last_value(y) over w
---
>        first_value(y) ignore nulls over w,
>        last_value(y) ignore nulls over w
1821,1822c1821,1822
<        first_value(y) over w,
<        last_value(y) over w
---
>        first_value(y) ignore nulls over w,
>        last_value(y) ignore nulls over w
1841,1842c1841,1842
<        first_value(y) over w,
<        last_value(y) over w
---
>        first_value(y) ignore nulls over w,
>        last_value(y) ignore nulls over w
2001c2001
< select x, last_value(x) over (order by x::smallint range between current row and 2147450884 following)
---
> select x, last_value(x) ignore nulls over (order by x::smallint range between current row and 2147450884 following)
2010c2010
< select x, last_value(x) over (order by x::smallint desc range between current row and 2147450885 following)
---
> select x, last_value(x) ignore nulls over (order by x::smallint desc range between current row and 2147450885
following)
2019c2019
< select x, last_value(x) over (order by x range between current row and 4 following)
---
> select x, last_value(x) ignore nulls over (order by x range between current row and 4 following)
2028c2028
< select x, last_value(x) over (order by x desc range between current row and 5 following)
---
> select x, last_value(x) ignore nulls over (order by x desc range between current row and 5 following)
2037c2037
< select x, last_value(x) over (order by x range between current row and 4 following)
---
> select x, last_value(x) ignore nulls over (order by x range between current row and 4 following)
2046c2046
< select x, last_value(x) over (order by x desc range between current row and 5 following)
---
> select x, last_value(x) ignore nulls over (order by x desc range between current row and 5 following)
2073c2073
< select id, f_float4, first_value(id) over w, last_value(id) over w
---
> select id, f_float4, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2091c2091
< select id, f_float4, first_value(id) over w, last_value(id) over w
---
> select id, f_float4, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2109c2109
< select id, f_float4, first_value(id) over w, last_value(id) over w
---
> select id, f_float4, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2127c2127
< select id, f_float4, first_value(id) over w, last_value(id) over w
---
> select id, f_float4, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2145c2145
< select id, f_float4, first_value(id) over w, last_value(id) over w
---
> select id, f_float4, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2163c2163
< select id, f_float4, first_value(id) over w, last_value(id) over w
---
> select id, f_float4, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2168c2168
< select id, f_float8, first_value(id) over w, last_value(id) over w
---
> select id, f_float8, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2186c2186
< select id, f_float8, first_value(id) over w, last_value(id) over w
---
> select id, f_float8, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2204c2204
< select id, f_float8, first_value(id) over w, last_value(id) over w
---
> select id, f_float8, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2222c2222
< select id, f_float8, first_value(id) over w, last_value(id) over w
---
> select id, f_float8, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2240c2240
< select id, f_float8, first_value(id) over w, last_value(id) over w
---
> select id, f_float8, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2258c2258
< select id, f_float8, first_value(id) over w, last_value(id) over w
---
> select id, f_float8, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2263c2263
< select id, f_numeric, first_value(id) over w, last_value(id) over w
---
> select id, f_numeric, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2281c2281
< select id, f_numeric, first_value(id) over w, last_value(id) over w
---
> select id, f_numeric, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2299c2299
< select id, f_numeric, first_value(id) over w, last_value(id) over w
---
> select id, f_numeric, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2307c2307
< select id, f_numeric, first_value(id) over w, last_value(id) over w
---
> select id, f_numeric, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2325c2325
< select id, f_numeric, first_value(id) over w, last_value(id) over w
---
> select id, f_numeric, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2343c2343
< select id, f_numeric, first_value(id) over w, last_value(id) over w
---
> select id, f_numeric, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2361c2361
< select id, f_numeric, first_value(id) over w, last_value(id) over w
---
> select id, f_numeric, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2388c2388
< select id, f_time, first_value(id) over w, last_value(id) over w
---
> select id, f_time, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2408c2408
< select id, f_time, first_value(id) over w, last_value(id) over w
---
> select id, f_time, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2428c2428
< select id, f_time, first_value(id) over w, last_value(id) over w
---
> select id, f_time, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2433c2433
< select id, f_time, first_value(id) over w, last_value(id) over w
---
> select id, f_time, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2453c2453
< select id, f_time, first_value(id) over w, last_value(id) over w
---
> select id, f_time, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2473c2473
< select id, f_time, first_value(id) over w, last_value(id) over w
---
> select id, f_time, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2493c2493
< select id, f_time, first_value(id) over w, last_value(id) over w
---
> select id, f_time, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2499c2499
< select id, f_timetz, first_value(id) over w, last_value(id) over w
---
> select id, f_timetz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2519c2519
< select id, f_timetz, first_value(id) over w, last_value(id) over w
---
> select id, f_timetz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2539c2539
< select id, f_timetz, first_value(id) over w, last_value(id) over w
---
> select id, f_timetz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2544c2544
< select id, f_timetz, first_value(id) over w, last_value(id) over w
---
> select id, f_timetz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2564c2564
< select id, f_timetz, first_value(id) over w, last_value(id) over w
---
> select id, f_timetz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2584c2584
< select id, f_timetz, first_value(id) over w, last_value(id) over w
---
> select id, f_timetz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2604c2604
< select id, f_timetz, first_value(id) over w, last_value(id) over w
---
> select id, f_timetz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2610c2610
< select id, f_interval, first_value(id) over w, last_value(id) over w
---
> select id, f_interval, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2630c2630
< select id, f_interval, first_value(id) over w, last_value(id) over w
---
> select id, f_interval, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2650c2650
< select id, f_interval, first_value(id) over w, last_value(id) over w
---
> select id, f_interval, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2655c2655
< select id, f_interval, first_value(id) over w, last_value(id) over w
---
> select id, f_interval, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2675c2675
< select id, f_interval, first_value(id) over w, last_value(id) over w
---
> select id, f_interval, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2695c2695
< select id, f_interval, first_value(id) over w, last_value(id) over w
---
> select id, f_interval, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2715c2715
< select id, f_interval, first_value(id) over w, last_value(id) over w
---
> select id, f_interval, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2721c2721
< select id, f_timestamptz, first_value(id) over w, last_value(id) over w
---
> select id, f_timestamptz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2741c2741
< select id, f_timestamptz, first_value(id) over w, last_value(id) over w
---
> select id, f_timestamptz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2761c2761
< select id, f_timestamptz, first_value(id) over w, last_value(id) over w
---
> select id, f_timestamptz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2766c2766
< select id, f_timestamptz, first_value(id) over w, last_value(id) over w
---
> select id, f_timestamptz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2786c2786
< select id, f_timestamptz, first_value(id) over w, last_value(id) over w
---
> select id, f_timestamptz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2806c2806
< select id, f_timestamptz, first_value(id) over w, last_value(id) over w
---
> select id, f_timestamptz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2826c2826
< select id, f_timestamptz, first_value(id) over w, last_value(id) over w
---
> select id, f_timestamptz, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2832c2832
< select id, f_timestamp, first_value(id) over w, last_value(id) over w
---
> select id, f_timestamp, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2852c2852
< select id, f_timestamp, first_value(id) over w, last_value(id) over w
---
> select id, f_timestamp, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2872c2872
< select id, f_timestamp, first_value(id) over w, last_value(id) over w
---
> select id, f_timestamp, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2877c2877
< select id, f_timestamp, first_value(id) over w, last_value(id) over w
---
> select id, f_timestamp, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2897c2897
< select id, f_timestamp, first_value(id) over w, last_value(id) over w
---
> select id, f_timestamp, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2917c2917
< select id, f_timestamp, first_value(id) over w, last_value(id) over w
---
> select id, f_timestamp, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
2937c2937
< select id, f_timestamp, first_value(id) over w, last_value(id) over w
---
> select id, f_timestamp, first_value(id) ignore nulls over w, last_value(id) ignore nulls over w
3253,3255c3253,3255
< select first_value(salary) over(order by enroll_date groups between 1 preceding and 1 following),
<     lead(salary) over(order by enroll_date groups between 1 preceding and 1 following),
<     nth_value(salary, 1) over(order by enroll_date groups between 1 preceding and 1 following),
---
> select first_value(salary) ignore nulls over(order by enroll_date groups between 1 preceding and 1 following),
>     lead(salary) ignore nulls over(order by enroll_date groups between 1 preceding and 1 following),
>     nth_value(salary, 1) ignore nulls over(order by enroll_date groups between 1 preceding and 1 following),
3271,3272c3271,3272
< select last_value(salary) over(order by enroll_date groups between 1 preceding and 1 following),
<     lag(salary) over(order by enroll_date groups between 1 preceding and 1 following),
---
> select last_value(salary) ignore nulls over(order by enroll_date groups between 1 preceding and 1 following),
>     lag(salary) ignore nulls over(order by enroll_date groups between 1 preceding and 1 following),
3288c3288
< select first_value(salary) over(order by enroll_date groups between 1 following and 3 following
---
> select first_value(salary) ignore nulls over(order by enroll_date groups between 1 following and 3 following
3290,3291c3290,3291
<     lead(salary) over(order by enroll_date groups between 1 following and 3 following exclude ties),
<     nth_value(salary, 1) over(order by enroll_date groups between 1 following and 3 following
---
>     lead(salary) ignore nulls over(order by enroll_date groups between 1 following and 3 following exclude ties),
>     nth_value(salary, 1) ignore nulls over(order by enroll_date groups between 1 following and 3 following
3308c3308
< select last_value(salary) over(order by enroll_date groups between 1 following and 3 following
---
> select last_value(salary) ignore nulls over(order by enroll_date groups between 1 following and 3 following
3310c3310
<     lag(salary) over(order by enroll_date groups between 1 following and 3 following exclude group),
---
>     lag(salary) ignore nulls over(order by enroll_date groups between 1 following and 3 following exclude group),

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
Attached are the v17 patches for adding RESPECT/IGNORE NULLS options
defined in the standard to some window functions. FROM FIRST/LAST
options are not considered in the patch (yet).

This time I split the patch into 6
patches for reviewer's convenience. Also each patch has a short commit
message to explain the patch.

0001: parse and analysis
0002: rewriter
0003: planner
0004: executor
0005: documents
0006: tests

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp


Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
Attached are the v18 patches for adding RESPECT/IGNORE NULLS options
to some window functions. Recent changes to doc/src/sgml/func.sgml
required v17 to be rebased. Other than that, nothing has been changed.

Oliver, do you have any comments on the patches?

> Attached are the v17 patches for adding RESPECT/IGNORE NULLS options
> defined in the standard to some window functions. FROM FIRST/LAST
> options are not considered in the patch (yet).
> 
> This time I split the patch into 6
> patches for reviewer's convenience. Also each patch has a short commit
> message to explain the patch.
> 
> 0001: parse and analysis
> 0002: rewriter
> 0003: planner
> 0004: executor
> 0005: documents
> 0006: tests

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Oliver Ford
Дата:
On Sat, Aug 16, 2025 at 10:33 AM Tatsuo Ishii <ishii@postgresql.org> wrote:
Attached are the v18 patches for adding RESPECT/IGNORE NULLS options
to some window functions. Recent changes to doc/src/sgml/func.sgml
required v17 to be rebased. Other than that, nothing has been changed.

Oliver, do you have any comments on the patches?

Looks good, tried it on the nth_value test script from a bit ago - I added a 1 million rows test and it takes an average of 12 seconds on my i7.
 

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
>> Attached are the v18 patches for adding RESPECT/IGNORE NULLS options
>> to some window functions. Recent changes to doc/src/sgml/func.sgml
>> required v17 to be rebased. Other than that, nothing has been changed.
>>
>> Oliver, do you have any comments on the patches?
>>
> 
> Looks good, tried it on the nth_value test script from a bit ago - I added
> a 1 million rows test and it takes an average of 12 seconds on my i7.

Thanks.

I have moved the CF entry from PG19-1 to PG19-2 as PG19-1 has been
already closed on July 31. Hope this help CF bot to catch the v18
patches.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
>> Attached are the v18 patches for adding RESPECT/IGNORE NULLS options
>> to some window functions. Recent changes to doc/src/sgml/func.sgml
>> required v17 to be rebased. Other than that, nothing has been changed.
>>
>> Oliver, do you have any comments on the patches?
>>
> 
> Looks good, tried it on the nth_value test script from a bit ago - I added
> a 1 million rows test and it takes an average of 12 seconds on my i7.

I would like to push the patch by the end of this month or early in
October if there's no objection.

Comments/suggestions are welcome.
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Chao Li
Дата:
Overall LGTM. Just a few small comments:

On Sep 12, 2025, at 17:53, Tatsuo Ishii <ishii@postgresql.org> wrote:


Comments/suggestions are welcome.
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp


1 - 0001
```
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -98,6 +98,7 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
  bool agg_star = (fn ? fn->agg_star : false);
  bool agg_distinct = (fn ? fn->agg_distinct : false);
  bool func_variadic = (fn ? fn->func_variadic : false);
+ int ignore_nulls = (fn ? fn->ignore_nulls : 0);
```

Should we use the constant NO_NULLTREATMENT here for 0?

2 - 0001
```
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -579,6 +579,17 @@ typedef struct GroupingFunc
  * Collation information is irrelevant for the query jumbling, as is the
  * internal state information of the node like "winstar" and "winagg".
  */
+
+/*
+ * Null Treatment options. If specified, initially set to PARSER_IGNORE_NULLS
+ * which is then converted to IGNORE_NULLS if the window function allows the
+ * null treatment clause.
+ */
+#define NO_NULLTREATMENT 0
+#define PARSER_IGNORE_NULLS 1
+#define PARSER_RESPECT_NULLS 2
+#define IGNORE_NULLS 3
+
 typedef struct WindowFunc
 {
  Expr xpr;
@@ -602,6 +613,8 @@ typedef struct WindowFunc
  bool winstar pg_node_attr(query_jumble_ignore);
  /* is function a simple aggregate? */
  bool winagg pg_node_attr(query_jumble_ignore);
+ /* ignore nulls. One of the Null Treatment options */
+ int ignore_nulls;
```

Maybe we can use “uint8” type for “ignore_nulls”. Because the previous two are both of type “bool”, an uint8 will just fit to the padding bytes, so that new field won’t add extra memory to the structure.

3 - 0004
```
  winobj->markpos = -1;
  winobj->seekpos = -1;
+
+ /* reset null map */
+ if (perfuncstate->winobj->ignore_nulls == IGNORE_NULLS)
+ memset(perfuncstate->winobj->notnull_info, 0,
+   NN_POS_TO_BYTES(perfuncstate->winobj->num_notnull_info));
  }
```
Where in “if” and “memset()”, we can just use “winobj”.

4 - 0004
```
+ if (!HeapTupleIsValid(proctup))
+ elog(ERROR, "cache lookup failed for function %u", funcid);
+ procform = (Form_pg_proc) GETSTRUCT(proctup);
+ elog(ERROR, "function %s does not allow RESPECT/IGNORE NULLS",
+ NameStr(procform->proname));
```

“Procform” is assigned but not used.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/




Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> Overall LGTM. Just a few small comments:

> 1 - 0001
> ```
> --- a/src/backend/parser/parse_func.c
> +++ b/src/backend/parser/parse_func.c
> @@ -98,6 +98,7 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
>      bool        agg_star = (fn ? fn->agg_star : false);
>      bool        agg_distinct = (fn ? fn->agg_distinct : false);
>      bool        func_variadic = (fn ? fn->func_variadic : false);
> +    int            ignore_nulls = (fn ? fn->ignore_nulls : 0);
> ```
> 
> Should we use the constant NO_NULLTREATMENT here for 0?

Good suggestion. Will fix.

> 2 - 0001
> ```
> --- a/src/include/nodes/primnodes.h
> +++ b/src/include/nodes/primnodes.h
> @@ -579,6 +579,17 @@ typedef struct GroupingFunc
>   * Collation information is irrelevant for the query jumbling, as is the
>   * internal state information of the node like "winstar" and "winagg".
>   */
> +
> +/*
> + * Null Treatment options. If specified, initially set to PARSER_IGNORE_NULLS
> + * which is then converted to IGNORE_NULLS if the window function allows the
> + * null treatment clause.
> + */
> +#define NO_NULLTREATMENT 0
> +#define PARSER_IGNORE_NULLS 1
> +#define PARSER_RESPECT_NULLS 2
> +#define IGNORE_NULLS 3
> +
>  typedef struct WindowFunc
>  {
>      Expr        xpr;
> @@ -602,6 +613,8 @@ typedef struct WindowFunc
>      bool        winstar pg_node_attr(query_jumble_ignore);
>      /* is function a simple aggregate? */
>      bool        winagg pg_node_attr(query_jumble_ignore);
> +    /* ignore nulls. One of the Null Treatment options */
> +    int            ignore_nulls;
> ```
> 
> Maybe we can use “uint8” type for “ignore_nulls”. Because the previous two are both of type “bool”, an uint8 will
justfit to the padding bytes, so that new field won’t add extra memory to the structure.
 

If we change the data type for ignore_nulls in WindowFunc, we may also
want to change it elsewhere (FuncCall, WindowObjectData,
WindowStatePerFuncData) for consistency?

> 3 - 0004
> ```
>              winobj->markpos = -1;
>              winobj->seekpos = -1;
> +
> +            /* reset null map */
> +            if (perfuncstate->winobj->ignore_nulls == IGNORE_NULLS)
> +                memset(perfuncstate->winobj->notnull_info, 0,
> +                       NN_POS_TO_BYTES(perfuncstate->winobj->num_notnull_info));
>          }
> ```
> Where in “if” and “memset()”, we can just use “winobj”.

Good catch. Will fix.

> 4 - 0004
> ```
> +        if (!HeapTupleIsValid(proctup))
> +            elog(ERROR, "cache lookup failed for function %u", funcid);
> +        procform = (Form_pg_proc) GETSTRUCT(proctup);
> +        elog(ERROR, "function %s does not allow RESPECT/IGNORE NULLS",
> +             NameStr(procform->proname));
> ```
> 
> “Procform” is assigned but not used.

I think procform is used in the following elog(ERROR, ...).

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
Attached is the updated v19 patches. Mostly applied changes suggested
by Chao.

>> Overall LGTM. Just a few small comments:
> 
>> 1 - 0001
>> ```
>> --- a/src/backend/parser/parse_func.c
>> +++ b/src/backend/parser/parse_func.c
>> @@ -98,6 +98,7 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
>>      bool        agg_star = (fn ? fn->agg_star : false);
>>      bool        agg_distinct = (fn ? fn->agg_distinct : false);
>>      bool        func_variadic = (fn ? fn->func_variadic : false);
>> +    int            ignore_nulls = (fn ? fn->ignore_nulls : 0);
>> ```
>> 
>> Should we use the constant NO_NULLTREATMENT here for 0?
> 
> Good suggestion. Will fix.

Done.

>> 2 - 0001
>> ```
>> --- a/src/include/nodes/primnodes.h
>> +++ b/src/include/nodes/primnodes.h
>> @@ -579,6 +579,17 @@ typedef struct GroupingFunc
>>   * Collation information is irrelevant for the query jumbling, as is the
>>   * internal state information of the node like "winstar" and "winagg".
>>   */
>> +
>> +/*
>> + * Null Treatment options. If specified, initially set to PARSER_IGNORE_NULLS
>> + * which is then converted to IGNORE_NULLS if the window function allows the
>> + * null treatment clause.
>> + */
>> +#define NO_NULLTREATMENT 0
>> +#define PARSER_IGNORE_NULLS 1
>> +#define PARSER_RESPECT_NULLS 2
>> +#define IGNORE_NULLS 3
>> +
>>  typedef struct WindowFunc
>>  {
>>      Expr        xpr;
>> @@ -602,6 +613,8 @@ typedef struct WindowFunc
>>      bool        winstar pg_node_attr(query_jumble_ignore);
>>      /* is function a simple aggregate? */
>>      bool        winagg pg_node_attr(query_jumble_ignore);
>> +    /* ignore nulls. One of the Null Treatment options */
>> +    int            ignore_nulls;
>> ```
>> 
>> Maybe we can use “uint8” type for “ignore_nulls”. Because the previous two are both of type “bool”, an uint8 will
justfit to the padding bytes, so that new field won’t add extra memory to the structure.
 
> 
> If we change the data type for ignore_nulls in WindowFunc, we may also
> want to change it elsewhere (FuncCall, WindowObjectData,
> WindowStatePerFuncData) for consistency?

I tried to change all "int ignore_nulls;" to "uint8 ignore_nulls;" but
gen_node_support.pl dislikes it and complains like:

  could not handle type "uint8" in struct "FuncCall" field "ignore_nulls"

>> 3 - 0004
>> ```
>>              winobj->markpos = -1;
>>              winobj->seekpos = -1;
>> +
>> +            /* reset null map */
>> +            if (perfuncstate->winobj->ignore_nulls == IGNORE_NULLS)
>> +                memset(perfuncstate->winobj->notnull_info, 0,
>> +                       NN_POS_TO_BYTES(perfuncstate->winobj->num_notnull_info));
>>          }
>> ```
>> Where in “if” and “memset()”, we can just use “winobj”.
> 
> Good catch. Will fix.

Done.

>> 4 - 0004
>> ```
>> +        if (!HeapTupleIsValid(proctup))
>> +            elog(ERROR, "cache lookup failed for function %u", funcid);
>> +        procform = (Form_pg_proc) GETSTRUCT(proctup);
>> +        elog(ERROR, "function %s does not allow RESPECT/IGNORE NULLS",
>> +             NameStr(procform->proname));
>> ```
>> 
>> “Procform” is assigned but not used.
> 
> I think procform is used in the following elog(ERROR, ...).

I added more tests for functions (rank(), dense_rank(),
percent_rank(), cume_dist() and ntile()) that do not support
RESPECT/IGNORE NULLS options to confirm that they throw errors if the
options are given. Previously there was only test cases for
row_number().

Also I have made small cosmetic changes to executor/nodeWindowAgg.c to
make too long lines shorter.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Oliver Ford
Дата:
On Wed, Sep 24, 2025 at 6:39 AM Tatsuo Ishii <ishii@postgresql.org> wrote:
I tried to change all "int ignore_nulls;" to "uint8 ignore_nulls;" but
gen_node_support.pl dislikes it and complains like:

  could not handle type "uint8" in struct "FuncCall" field "ignore_nulls"


uint8 was missing in one place in that perl script. The attached patch silences it for uint8/uint16. 
Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
Hi Oliver,
> On Wed, Sep 24, 2025 at 6:39 AM Tatsuo Ishii <ishii@postgresql.org> wrote:
> 
>> I tried to change all "int ignore_nulls;" to "uint8 ignore_nulls;" but
>> gen_node_support.pl dislikes it and complains like:
>>
>>   could not handle type "uint8" in struct "FuncCall" field "ignore_nulls"
>>
>>
> uint8 was missing in one place in that perl script. The attached patch
> silences it for uint8/uint16.

Thank you for the patch. (I noticed int8 is also missing).

I have looked into the commit 964d01ae90c3 which was made by Peter.  I
have quick read through the discussion to know why uint8/uint16 (and
int8) are missing in gen_node_support.pl. Unfortunately I have no
clear idea why these data types are missing in the script.

Peter,
Maybe you wanted to limit the data types that are actually used at
that point? If so, probably we should only add uint8 support this time
(uint8 is only needed to implement $Subject for now).  What do you
think?

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Tatsuo Ishii
Дата:
> Thank you for the patch. (I noticed int8 is also missing).
> 
> I have looked into the commit 964d01ae90c3 which was made by Peter.  I
> have quick read through the discussion to know why uint8/uint16 (and
> int8) are missing in gen_node_support.pl. Unfortunately I have no
> clear idea why these data types are missing in the script.
> 
> Peter,
> Maybe you wanted to limit the data types that are actually used at
> that point? If so, probably we should only add uint8 support this time
> (uint8 is only needed to implement $Subject for now).  What do you
> think?

I decided not to include the fix to gen_node_support.pl for now and
commit the patch without it. We could revisit it later on.

So here is the commit message I would like to propose.

For the technical part please look at the message.

Non technical part:
First of all the author is Oliver (no doubt).  I would like to be
listed as a co-author since I wrote the not null cache part.

Next is reviewers. Actually the first effor to implement null
treatment clause was back to 9.3 era (2013) at least. After that
multiple trials to implemnt the feature happend but they had faded
away. I think we don't need to include all of those who joined the old
discussions as reviewers. So I started to check from the discussion:
https://postgr.es/m/flat/CAGMVOdsbtRwE_4+v8zjH1d9xfovDeQAGLkP_B6k69_VoFEgX-A@mail.gmail.com
because it's refered to by the commit fest entry.

Oliver and others, I love to hear your comment!

BTW, Oliver's last patch made the CF bot to misunderstand the patch
contents, which was not actually the main patch. So I attach the same
patch as v20.

----------------------------------------------------------------------
Add IGNORE NULLS/RESPECT NULLS option to Window functions.

Add IGNORE NULLS/RESPECT NULLS option (null treatment clause) to lead,
lag, first_value, last_value and nth_value window functions.  If
unspecified, the default is RESPECT NULLS which includes NULL values
in any result calculation. IGNORE NULLS ignores NULL values.

Built-in window functions are modified to call new API
WinCheckAndInitializeNullTreatment() to indicate whether they accept
IGNORE NULLS/RESPECT NULLS option or not (the API can be called by
user defined window functions as well).  If WinGetFuncArgInPartition's
allowNullTreatment argument is true and IGNORE NULLS option is given,
WinGetFuncArgInPartition() or WinGetFuncArgInFrame() will return
evaluated function's argument expression on specified non NULL row (if
it exists) in the partition or the frame.

When IGNORE NULLS option is given, window functions need to visit and
evaluate same rows over and over again to look for non null rows. To
mitigate the issue, 2-bit not null information array is created while
executing window functions to remember whether the row has been
already evaluated to NULL or NOT NULL. If already evaluated, we could
skip some the evaluation work, thus we could get better performance.

Author: Oliver Ford <ojford@gmail.com>
Co-authored-by: Tatsuo Ishii <ishii@postgresql.org>
Reviewed-by: Andrew Gierth <andrew@tao11.riddles.org.uk>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Fetter <david@fetter.org>
Reviewed-by: Vik Fearing <vik@postgresfriends.org>
Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com>
Reviewed-by: Krasiyan Andreev <krasiyan@gmail.com>
Reviewed-by: Chao Li <lic@highgo.com>
Discussion: https://postgr.es/m/flat/CAGMVOdsbtRwE_4+v8zjH1d9xfovDeQAGLkP_B6k69_VoFEgX-A@mail.gmail.com
----------------------------------------------------------------------

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

Re: Add RESPECT/IGNORE NULLS and FROM FIRST/LAST options

От
Oliver Ford
Дата:
On Thu, 2 Oct 2025, 13:16 Tatsuo Ishii, <ishii@postgresql.org> wrote:
> Thank you for the patch. (I noticed int8 is also missing).
>
> I have looked into the commit 964d01ae90c3 which was made by Peter.  I
> have quick read through the discussion to know why uint8/uint16 (and
> int8) are missing in gen_node_support.pl. Unfortunately I have no
> clear idea why these data types are missing in the script.
>
> Peter,
> Maybe you wanted to limit the data types that are actually used at
> that point? If so, probably we should only add uint8 support this time
> (uint8 is only needed to implement $Subject for now).  What do you
> think?

I decided not to include the fix to gen_node_support.pl for now and
commit the patch without it. We could revisit it later on.

So here is the commit message I would like to propose.

For the technical part please look at the message.

Non technical part:
First of all the author is Oliver (no doubt).  I would like to be
listed as a co-author since I wrote the not null cache part.

Next is reviewers. Actually the first effor to implement null
treatment clause was back to 9.3 era (2013) at least. After that
multiple trials to implemnt the feature happend but they had faded
away. I think we don't need to include all of those who joined the old
discussions as reviewers. So I started to check from the discussion:
https://postgr.es/m/flat/CAGMVOdsbtRwE_4+v8zjH1d9xfovDeQAGLkP_B6k69_VoFEgX-A@mail.gmail.com
because it's refered to by the commit fest entry.

Oliver and others, I love to hear your comment!

Looks great, so glad this is finally going in.