Обсуждение: Proposal: QUALIFY clause
Hi all, I'm sending a proof-of-concept patch to add support for the QUALIFY clause in Postgres. This feature allows filtering rows after window functions are computed, using a syntax similar to the WHERE or HAVING clauses. The idea for this came from a discussion and suggestion by Peter Eisentraut (thanks, Peter!). The `QUALIFY` clause is not part of the SQL standard, but it is implemented by some major DBMSs, including Snowflake [1], BigQuery [2] and DuckDB [3]. The goal is to provide a more ergonomic way to filter on window function results without needing to nest subqueries or CTEs. Simple example (see window.sql for more): SELECT depname, empno, salary, RANK() OVER (PARTITION BY depname ORDER BY salary DESC) AS rnk FROM empsalary QUALIFY rnk = 1; Please note that this is a proof-of-concept patch, I’m still working on determining the best locations in the code to implement each part of the logic for QUALIFY. I'm just sending this WIP to collect feedback and then continue to work on the feature. Additionally, the current patch does not handle yet expressions using AND/OR when referencing multiple window function aliases (e.g., QUALIFY rnk = 1 AND rnk2 = 2). Thoughts? [1] https://docs.snowflake.com/en/sql-reference/constructs/qualify [2] https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#qualify_clause [3] https://duckdb.org/docs/stable/sql/query_syntax/qualify.html -- Matheus Alcantara
Вложения
Many times I have thought it would be nice if there was a QUALIFY clause in Postgres!
Just would like to add that including your list, Teradata, Redshift, SAP HANA, HP Vertica, and Trino all support the QUALIFY clause.
Also it seems Postgres would be the first leading RDBMS - meaning like traditional, multipurpose RDMBS - to support QUALIFY, which would be pretty cool.
On Mon, Jul 21, 2025 at 7:47 AM Matheus Alcantara <matheusssilv97@gmail.com> wrote:
Hi all,
I'm sending a proof-of-concept patch to add support for the QUALIFY
clause in Postgres. This feature allows filtering rows after window
functions are computed, using a syntax similar to the WHERE or HAVING
clauses.
The idea for this came from a discussion and suggestion by Peter
Eisentraut (thanks, Peter!).
The `QUALIFY` clause is not part of the SQL standard, but it is
implemented by some major DBMSs, including Snowflake [1], BigQuery
[2] and DuckDB [3].
The goal is to provide a more ergonomic way to filter on window function
results without needing to nest subqueries or CTEs.
Simple example (see window.sql for more):
SELECT depname,
empno,
salary,
RANK() OVER (PARTITION BY depname ORDER BY salary DESC) AS rnk
FROM empsalary
QUALIFY rnk = 1;
Please note that this is a proof-of-concept patch, I’m still working on
determining the best locations in the code to implement each part of the
logic for QUALIFY. I'm just sending this WIP to collect feedback and then
continue to work on the feature. Additionally, the current patch does not
handle yet expressions using AND/OR when referencing multiple window
function aliases (e.g., QUALIFY rnk = 1 AND rnk2 = 2).
Thoughts?
[1] https://docs.snowflake.com/en/sql-reference/constructs/qualify
[2] https://cloud.google.com/bigquery/docs/reference/standard-sql/query-syntax#qualify_clause
[3] https://duckdb.org/docs/stable/sql/query_syntax/qualify.html
--
Matheus Alcantara
On Mon, 21 Jul 2025 at 10:19, Mike Artz <michaeleartz@gmail.com> wrote:
Many times I have thought it would be nice if there was a QUALIFY clause in Postgres!Just would like to add that including your list, Teradata, Redshift, SAP HANA, HP Vertica, and Trino all support the QUALIFY clause.Also it seems Postgres would be the first leading RDBMS - meaning like traditional, multipurpose RDMBS - to support QUALIFY, which would be pretty cool.
Is this different from using the window functions in a subquery and then applying a WHERE clause on the outer query?
SELECT … FROM (SELECT … [including window functions] FROM …) WHERE [stuff that would be in QUALIFY]
I'll be honest, I'm skeptical that we need another keyword that basically means “WHERE, but applied at a different point in the query processing”. I'm not even convinced that HAVING was a good idea (although obviously I would not propose removal).
Isaac Morland <isaac.morland@gmail.com> writes: > I'll be honest, I'm skeptical that we need another keyword that basically > means “WHERE, but applied at a different point in the query processing”. That was my reaction too. I'm especially skeptical that getting out front of the SQL standards committee is a good thing to do. If and when this shows up in the standard, then sure. regards, tom lane
On Mon Jul 21, 2025 at 11:29 AM -03, Isaac Morland wrote: > Is this different from using the window functions in a subquery and then > applying a WHERE clause on the outer query? > > SELECT … FROM (SELECT … [including window functions] FROM …) WHERE [stuff > that would be in QUALIFY] > > I'll be honest, I'm skeptical that we need another keyword that basically > means “WHERE, but applied at a different point in the query processing”. > I'm not even convinced that HAVING was a good idea (although obviously I > would not propose removal). > Thanks for sharing your thoughts! You're right — semantically, using QUALIFY is equivalent to wrapping the query in a subquery and applying a WHERE clause to the result. The main motivation here is to provide a more ergonomic and readable syntax. While I understand the hesitation around introducing another keyword that effectively acts like WHERE at a different stage, I believe QUALIFY improves clarity in many use cases, by avoiding the boilerplate and visual noise of nested subqueries making it easier to write and reason about. -- Matheus Alcantara
"Matheus Alcantara" <matheusssilv97@gmail.com> writes: > You're right — semantically, using QUALIFY is equivalent to wrapping the > query in a subquery and applying a WHERE clause to the result. The main > motivation here is to provide a more ergonomic and readable syntax. > While I understand the hesitation around introducing another keyword > that effectively acts like WHERE at a different stage, I believe QUALIFY > improves clarity in many use cases, by avoiding the boilerplate and > visual noise of nested subqueries making it easier to write and reason > about. There are concrete reasons not to do this until/unless it becomes standardized: * If the syntax is like WHERE, there will be no way to do it without making QUALIFY a fully-reserved word. That will inevitably break more than zero applications. It's a lot easier to justify that sort of breakage if we can say "QUALIFY is reserved according to SQL:20xx, so don't blame us". * I'm not exactly convinced that the committee would standardize it just like this. For one thing, QUALIFY is not even the right part of speech: it's a verb, and thus more fit to be a primary statement keyword. What you need here is an adverb (I think ... been a long time since high school English, but my dictionary says WHERE is an adverb). Maybe they'd be persuaded to do what the existing implementations did, but I wouldn't be at all surprised if they choose a different keyword. regards, tom lane
On Mon, 21 Jul 2025, 18:31 Tom Lane, <tgl@sss.pgh.pa.us> wrote:
"Matheus Alcantara" <matheusssilv97@gmail.com> writes:
> You're right — semantically, using QUALIFY is equivalent to wrapping the
> query in a subquery and applying a WHERE clause to the result. The main
> motivation here is to provide a more ergonomic and readable syntax.
> While I understand the hesitation around introducing another keyword
> that effectively acts like WHERE at a different stage, I believe QUALIFY
> improves clarity in many use cases, by avoiding the boilerplate and
> visual noise of nested subqueries making it easier to write and reason
> about.
There are concrete reasons not to do this until/unless it becomes
standardized:
* If the syntax is like WHERE, there will be no way to do it without
making QUALIFY a fully-reserved word. That will inevitably break
more than zero applications. It's a lot easier to justify that
sort of breakage if we can say "QUALIFY is reserved according to
SQL:20xx, so don't blame us".
* I'm not exactly convinced that the committee would standardize
it just like this. For one thing, QUALIFY is not even the right
part of speech: it's a verb, and thus more fit to be a primary
statement keyword. What you need here is an adverb (I think ...
been a long time since high school English, but my dictionary
says WHERE is an adverb). Maybe they'd be persuaded to do what
the existing implementations did, but I wouldn't be at all surprised
if they choose a different keyword.
I know we're not bikeshedding, but the word REFINE might be more appropriate.
Thom
On Mon, Jul 21, 2025 at 9:32 PM Thom Brown <thom@linux.com> wrote: >> * I'm not exactly convinced that the committee would standardize >> it just like this. For one thing, QUALIFY is not even the right >> part of speech: it's a verb, and thus more fit to be a primary >> statement keyword. What you need here is an adverb (I think ... >> been a long time since high school English, but my dictionary >> says WHERE is an adverb). Maybe they'd be persuaded to do what >> the existing implementations did, but I wouldn't be at all surprised >> if they choose a different keyword. > > > I know we're not bikeshedding, but the word REFINE might be more appropriate. I vote for DONTGIMMEDAT. .m
On 21/07/2025 16:41, Tom Lane wrote: > Isaac Morland <isaac.morland@gmail.com> writes: >> I'll be honest, I'm skeptical that we need another keyword that basically >> means “WHERE, but applied at a different point in the query processing”. > That was my reaction too. I'm especially skeptical that getting out > front of the SQL standards committee is a good thing to do. If and > when this shows up in the standard, then sure. It's "when", not "if". I submitted a paper for this to the committee two years ago, but it was just a discussion paper and not an actual change proposal. I have recently revived that paper so hopefully it will be accepted within the next year. I would even like to push so that we have it in 19. -- Vik Fearing
On 21/07/2025 14:47, Matheus Alcantara wrote: > Hi all, > > I'm sending a proof-of-concept patch to add support for the QUALIFY > clause in Postgres. This feature allows filtering rows after window > functions are computed, using a syntax similar to the WHERE or HAVING > clauses. I took a very brief look at this, and I think your grammar is wrong. The QUALIFY clause should go after the WINDOW clause, just like FROM/WHERE and GROUP BY/HAVING. That is what I am proposing to the standards committee, and I already have some buy-in for that. -- Vik Fearing
On 21/07/2025 19:30, Tom Lane wrote: > "Matheus Alcantara" <matheusssilv97@gmail.com> writes: >> You're right — semantically, using QUALIFY is equivalent to wrapping the >> query in a subquery and applying a WHERE clause to the result. The main >> motivation here is to provide a more ergonomic and readable syntax. >> While I understand the hesitation around introducing another keyword >> that effectively acts like WHERE at a different stage, I believe QUALIFY >> improves clarity in many use cases, by avoiding the boilerplate and >> visual noise of nested subqueries making it easier to write and reason >> about. > There are concrete reasons not to do this until/unless it becomes > standardized: > > * If the syntax is like WHERE, there will be no way to do it without > making QUALIFY a fully-reserved word. That will inevitably break > more than zero applications. It's a lot easier to justify that > sort of breakage if we can say "QUALIFY is reserved according to > SQL:20xx, so don't blame us". Yes, it will need to be reserved. > * I'm not exactly convinced that the committee would standardize > it just like this. For one thing, QUALIFY is not even the right > part of speech: it's a verb, and thus more fit to be a primary > statement keyword. What you need here is an adverb (I think ... > been a long time since high school English, but my dictionary > says WHERE is an adverb). Maybe they'd be persuaded to do what > the existing implementations did, but I wouldn't be at all surprised > if they choose a different keyword. I am pretty sure that the keyword will be QUALIFY. There are just too many existing implementations for the standard to go against them all. (Also, another rdbms just implemented it that way in their upcoming product.) I agree that we should hold back until the standard accepts it, but having a working patch ready to go seems like a good idea. -- Vik Fearing
On Mon, Jul 21, 2025 at 10:26:51PM +0200, Vik Fearing wrote: > On 21/07/2025 19:30, Tom Lane wrote: > > * I'm not exactly convinced that the committee would standardize > > it just like this. For one thing, QUALIFY is not even the right > > part of speech: it's a verb, and thus more fit to be a primary > > statement keyword. What you need here is an adverb (I think ... > > been a long time since high school English, but my dictionary > > says WHERE is an adverb). Maybe they'd be persuaded to do what > > the existing implementations did, but I wouldn't be at all surprised > > if they choose a different keyword. Or a gerund, which is what HAVING is. Or a conjugated verb or something like QUALIFIED BY, though really "qualif*" seems just wrong. This is just another name for a WHERE that, like HAVING is paired with some other language feature (like GROUP BY) and applies to that clause. I don't have a better keyword(s) to offer, just sadness. > I am pretty sure that the keyword will be QUALIFY. There are just too many > existing implementations for the standard to go against them all. (Also, > another rdbms just implemented it that way in their upcoming product.) Bummer.
On Mon Jul 21, 2025 at 5:23 PM -03, Vik Fearing wrote: > > On 21/07/2025 14:47, Matheus Alcantara wrote: >> Hi all, >> >> I'm sending a proof-of-concept patch to add support for the QUALIFY >> clause in Postgres. This feature allows filtering rows after window >> functions are computed, using a syntax similar to the WHERE or HAVING >> clauses. > > > I took a very brief look at this, and I think your grammar is wrong. > The QUALIFY clause should go after the WINDOW clause, just like > FROM/WHERE and GROUP BY/HAVING. > > > That is what I am proposing to the standards committee, and I already > have some buy-in for that. > Thank you for the brief review and for the comments! I'm not sure if I fully understand but please see the new attached version. Thanks, -- Matheus Alcantara
Вложения
On 21/07/2025 23:29, Matheus Alcantara wrote: > On Mon Jul 21, 2025 at 5:23 PM -03, Vik Fearing wrote: >> On 21/07/2025 14:47, Matheus Alcantara wrote: >>> Hi all, >>> >>> I'm sending a proof-of-concept patch to add support for the QUALIFY >>> clause in Postgres. This feature allows filtering rows after window >>> functions are computed, using a syntax similar to the WHERE or HAVING >>> clauses. >> >> I took a very brief look at this, and I think your grammar is wrong. >> The QUALIFY clause should go after the WINDOW clause, just like >> FROM/WHERE and GROUP BY/HAVING. >> >> >> That is what I am proposing to the standards committee, and I already >> have some buy-in for that. >> > Thank you for the brief review and for the comments! > > I'm not sure if I fully understand but please see the new attached > version. That is my preferred grammar, thank you. I have not looked at the C code by this can be obtained with a syntax transformation. To wit: SELECT a, b, c FROM tab QUALIFY wf() OVER () = ? can be rewritten as: SELECT a, b, c FROM ( SELECT a, b, c, wf() OVER () = ? AS qc FROM tab ) AS q WHERE qc and then let the optimizer take over. The standard does this kind of thing all over the place; I don't know what the postgres project's position on doing things like this are. -- Vik Fearing
Vik Fearing <vik@postgresfriends.org> writes: > That is my preferred grammar, thank you. I have not looked at the C > code by this can be obtained with a syntax transformation. To wit: > SELECT a, b, c > FROM tab > QUALIFY wf() OVER () = ? > can be rewritten as: > SELECT a, b, c > FROM ( > SELECT a, b, c, wf() OVER () = ? AS qc > FROM tab > ) AS q > WHERE qc That answers another question I was going to raise. Matheus's opening example was SELECT depname, empno, salary, RANK() OVER (PARTITION BY depname ORDER BY salary DESC) AS rnk FROM empsalary QUALIFY rnk = 1; which implies that the QUALIFY clause sees the SELECT output columns, and hence that it can't use any values not emitted by the SELECT list. Your transformation implies that it sees the same namespace as the SELECT list, which seems like a much better and less confusing definition to me. regards, tom lane
Hi
út 22. 7. 2025 v 0:12 odesílatel Vik Fearing <vik@postgresfriends.org> napsal:
On 21/07/2025 23:29, Matheus Alcantara wrote:
> On Mon Jul 21, 2025 at 5:23 PM -03, Vik Fearing wrote:
>> On 21/07/2025 14:47, Matheus Alcantara wrote:
>>> Hi all,
>>>
>>> I'm sending a proof-of-concept patch to add support for the QUALIFY
>>> clause in Postgres. This feature allows filtering rows after window
>>> functions are computed, using a syntax similar to the WHERE or HAVING
>>> clauses.
>>
>> I took a very brief look at this, and I think your grammar is wrong.
>> The QUALIFY clause should go after the WINDOW clause, just like
>> FROM/WHERE and GROUP BY/HAVING.
>>
>>
>> That is what I am proposing to the standards committee, and I already
>> have some buy-in for that.
>>
> Thank you for the brief review and for the comments!
>
> I'm not sure if I fully understand but please see the new attached
> version.
That is my preferred grammar, thank you. I have not looked at the C
code by this can be obtained with a syntax transformation. To wit:
SELECT a, b, c
FROM tab
QUALIFY wf() OVER () = ?
can be rewritten as:
SELECT a, b, c
FROM (
SELECT a, b, c, wf() OVER () = ? AS qc
FROM tab
) AS q
WHERE qc
and then let the optimizer take over. The standard does this kind of
thing all over the place; I don't know what the postgres project's
position on doing things like this are.
just for curiosity - why the HAVING clause was not used?
Any window functions are +/- an "aggregate" function, and then HAVING looks more natural to me.
Regards
Pavel
--
Vik Fearing
On Mon, Jul 21, 2025 at 9:19 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:
just for curiosity - why the HAVING clause was not used?Any window functions are +/- an "aggregate" function, and then HAVING looks more natural to me.
Hm, HAVING requires to apply 'group by' which windows functions do not require (unlike aggregates).
superuser@postgres=# select * from (select 1 as v) q having true limit 1;
ERROR: column "q.v" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select * from (select 1 as v) q having true limit 1;
ERROR: column "q.v" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select * from (select 1 as v) q having true limit 1;
If a query has both window function and grouped aggregate, HAVING would be applying at different grains potentially? If so, seems sus.
merlin
On Mon, Jul 21, 2025 at 09:43:15PM -0600, Merlin Moncure wrote: > On Mon, Jul 21, 2025 at 9:19 PM Pavel Stehule <pavel.stehule@gmail.com> > wrote: > > just for curiosity - why the HAVING clause was not used? > > > > Any window functions are +/- an "aggregate" function, and then HAVING > > looks more natural to me. > > Hm, HAVING requires to apply 'group by' which windows functions do not > require (unlike aggregates). Pavel's point is precisely to allow HAVING w/o a GROUP BY when there are window functions since window functions are "+/-" ("more or less") aggregate functions. That makes sense to me. > superuser@postgres=# select * from (select 1 as v) q having true limit 1; > ERROR: column "q.v" must appear in the GROUP BY clause or be used in an > aggregate function > LINE 1: select * from (select 1 as v) q having true limit 1; > > If a query has both window function and grouped aggregate, HAVING would be > applying at different grains potentially? If so, seems sus. I would have a HAVING clause that comes _before_ GROUP BY apply to window functions and a second one that comes _after_ GROUP BY apply to the grouping. Nico --
On Mon, Jul 21, 2025 at 10:08 PM Nico Williams <nico@cryptonector.com> wrote:
On Mon, Jul 21, 2025 at 09:43:15PM -0600, Merlin Moncure wrote:
> On Mon, Jul 21, 2025 at 9:19 PM Pavel Stehule <pavel.stehule@gmail.com>
> wrote:
> > just for curiosity - why the HAVING clause was not used?
> >
> > Any window functions are +/- an "aggregate" function, and then HAVING
> > looks more natural to me.
>
> Hm, HAVING requires to apply 'group by' which windows functions do not
> require (unlike aggregates).
Pavel's point is precisely to allow HAVING w/o a GROUP BY when there are
window functions since window functions are "+/-" ("more or less")
aggregate functions. That makes sense to me.
> superuser@postgres=# select * from (select 1 as v) q having true limit 1;
> ERROR: column "q.v" must appear in the GROUP BY clause or be used in an
> aggregate function
> LINE 1: select * from (select 1 as v) q having true limit 1;
>
> If a query has both window function and grouped aggregate, HAVING would be
> applying at different grains potentially? If so, seems sus.
I would have a HAVING clause that comes _before_ GROUP BY apply to
window functions and a second one that comes _after_ GROUP BY apply to
the grouping.
I don't know...consider:
#1 SELECT lag(1) OVER(), max(v) FROM (SELECT 1 AS v) q HAVING true ;
...
#2 SELECT lag(1) OVER(), max(v) FROM (SELECT 1 AS v) q HAVING true HAVING lag(1) OVER() IS NULL;
What does the HAVING clause apply to in #1? I think you might be in trouble with the standard here. 2nd clause doesn't feel right in #2. The basic problem is that HAVING does more than just 'syntax sugar subquery / WHERE' and it just can't be hijacked to do something else IMO.
Syntax simplifying
SELECT * FROM (<window function query>) WHERE col = x
Does have some merit, but implementing non-standard syntax has risks, especially in this area of the grammar. If you did do it, I'd vote for QUALIFY since implementation consensus seems to influence the standard to some degree, but I have to unfortunately +1 the 'reserved word' warning. You could probably work around that with more complex syntax but that kind of defeats the point.
merlin
Nico Williams <nico@cryptonector.com> writes: > On Mon, Jul 21, 2025 at 09:43:15PM -0600, Merlin Moncure wrote: >> Hm, HAVING requires to apply 'group by' which windows functions do not >> require (unlike aggregates). > Pavel's point is precisely to allow HAVING w/o a GROUP BY when there are > window functions since window functions are "+/-" ("more or less") > aggregate functions. That makes sense to me. No, it's really quite wrong. Aggregate functions are not equivalent to window functions: if you have both in a query, they execute in separate passes, with the window functions operating on the grouped rows output by the aggregation step (and then filtered by HAVING, if any). If we're going to support this, it does need to be its own clause. regards, tom lane
On Mon Jul 21, 2025 at 7:11 PM -03, Vik Fearing wrote: > That is my preferred grammar, thank you. > Thanks for confirming! > I have not looked at the C code by this can be obtained with a syntax > transformation. To wit: > SELECT a, b, c > FROM tab > QUALIFY wf() OVER () = ? > > > can be rewritten as: > > > SELECT a, b, c > FROM ( > SELECT a, b, c, wf() OVER () = ? AS qc > FROM tab > ) AS q > WHERE qc > > > and then let the optimizer take over. The standard does this kind of > thing all over the place; I don't know what the postgres project's > position on doing things like this are. > The current patch supports the following syntaxes: SELECT a, b, c wf() OVER () as d FROM tab QUALIFY d = 1 and SELECT a, b, c wf() OVER () FROM tab QUALIFY wf() OVER () = 1 When using the "QUALIFY d = 1" form, I currently rewrite the expression as "wf() OVER () = 1" by searching the targetlist for the matching alias, replacing the Var with the corresponding WindowFunc. Then I append this clause to the topqual, which is later assigned to WindowAggPath.plan->qual in create_one_window_path(). Besides this approach works I'm almost sure that this is not correct because searching the window function on targetlist doesen't seems correct to me. Tom also pointed out that this design could be confusing, which reinforces the need to rethink it. This transformation that you've suggested seems a better approach to handle the QUALIFY clause to me as well. Unless anyone objects, I'll prepare the next patch version based on that strategy. Thanks very much for the comments! -- Matheus Alcantara
On Mon, Jul 21, 2025 at 11:02:36PM -0600, Merlin Moncure wrote: > On Mon, Jul 21, 2025 at 10:08 PM Nico Williams <nico@cryptonector.com> > wrote: > > I would have a HAVING clause that comes _before_ GROUP BY apply to > > window functions and a second one that comes _after_ GROUP BY apply to > > the grouping. > > I don't know...consider: > #1 SELECT lag(1) OVER(), max(v) FROM (SELECT 1 AS v) q HAVING true ; > ... > #2 SELECT lag(1) OVER(), max(v) FROM (SELECT 1 AS v) q HAVING true HAVING > lag(1) OVER() IS NULL; > > What does the HAVING clause apply to in #1? I think you might be in > trouble with the standard here. 2nd clause doesn't feel right in #2. The > basic problem is that HAVING does more than just 'syntax sugar subquery / > WHERE' and it just can't be hijacked to do something else IMO. #2 would be a syntax error because the second HAVING did not come after a GROUP BY. #1 would not be a syntax error only because of the use of window functions before the HAVING. > Syntax simplifying > SELECT * FROM (<window function query>) WHERE col = x Yes. I'd rather that than QUALIFY. QUALIFY only makes sense because so many other RDBMSes have it and it's likely to get standardized. Nico --
On Tue, Jul 22, 2025 at 01:14:20AM -0400, Tom Lane wrote: > Nico Williams <nico@cryptonector.com> writes: > > On Mon, Jul 21, 2025 at 09:43:15PM -0600, Merlin Moncure wrote: > >> Hm, HAVING requires to apply 'group by' which windows functions do not > >> require (unlike aggregates). > > > Pavel's point is precisely to allow HAVING w/o a GROUP BY when there are > > window functions since window functions are "+/-" ("more or less") > > aggregate functions. That makes sense to me. > > No, it's really quite wrong. Aggregate functions are not equivalent > to window functions: if you have both in a query, they execute in > separate passes, with the window functions operating on the grouped > rows output by the aggregation step (and then filtered by HAVING, > if any). Pavel doesn't say that window functions are aggregate functions. Pavel said they are +/- (more or less, really, just similar to) aggregate functions. There is a similarity. But I appreciate the point about which passes get which, and that definitely makes the two-HAVING- clauses concept much more unwieldy. > If we're going to support this, it does need to be its own clause. I agree that its own clause is best; I just greatly dislike QUALIFY.
I often accidentally write SELECT .. WHERE .. WHERE ..; which is obviously wrong, but what I mean when I do this is SELECT .. WHERE .. AND ..; and if I wrote GROUP BY .. HAVING queries as often as I do ones that don't GROUP BY then I'd probably also accidentally use extra HAVINGs as ANDs. It doesn't seem too crazy that extra WHEREs in WHERE clauses should some day function as ANDs, and ditto HAVINGs, which is another reason not to reuse HAVING for this: just to leave that a possibility, remote though it might be. My advice is to wait till QUALIFY is standardized, then hold your nose and adopt it, or maybe sooner when it becomes clear that it will be standardized (because so many other RDBMSes have it too).
On 22/07/2025 17:07, Nico Williams wrote: > On Tue, Jul 22, 2025 at 01:14:20AM -0400, Tom Lane wrote: >> Nico Williams <nico@cryptonector.com> writes: >>> On Mon, Jul 21, 2025 at 09:43:15PM -0600, Merlin Moncure wrote: >>>> Hm, HAVING requires to apply 'group by' which windows functions do not >>>> require (unlike aggregates). >>> Pavel's point is precisely to allow HAVING w/o a GROUP BY when there are >>> window functions since window functions are "+/-" ("more or less") >>> aggregate functions. That makes sense to me. >> No, it's really quite wrong. Aggregate functions are not equivalent >> to window functions: if you have both in a query, they execute in >> separate passes, with the window functions operating on the grouped >> rows output by the aggregation step (and then filtered by HAVING, >> if any). > Pavel doesn't say that window functions are aggregate functions. Pavel > said they are +/- (more or less, really, just similar to) aggregate > functions. There is a similarity. But I appreciate the point about > which passes get which, and that definitely makes the two-HAVING- > clauses concept much more unwieldy. Window functions and aggregates have only one thing in common, and that is that they can both operate on a window frame. Otherwise the difference is night and day. Especially when you consider nested window clauses (that postgres does not support yet). > I agree that its own clause is best; I just greatly dislike QUALIFY. Sorry. -- Vik Fearing
On 22/07/2025 17:14, Nico Williams wrote:
It doesn't seem too crazy that extra WHEREs in WHERE clauses should some day function as ANDs, and ditto HAVINGs, which is another reason not to reuse HAVING for this: just to leave that a possibility, remote though it might be.
I have a firm finger on the pulse of the standards committee, and I can guarantee that multiple WHERE clauses will never replace AND until certain people cross the river Styx. Myself included.
My advice is to wait till QUALIFY is standardized, then hold your nose and adopt it, or maybe sooner when it becomes clear that it will be standardized (because so many other RDBMSes have it too).
Good advice.
--
Vik Fearing
On 2025-07-22 Tu 11:14 AM, Vik Fearing wrote: > > >> I agree that its own clause is best; I just greatly dislike QUALIFY. > > > Sorry. > If we were making up our own syntax this would be a sensible thing to debate. If we're talking about implementing something we expect to be in the standard, I think we will have to live with what the standards committee decides, regardless of our preference. We've almost certainly been preempted here by other RDBMSs using QUALIFY, heedless of English grammar. cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
Em ter., 22 de jul. de 2025 às 08:56, Matheus Alcantara <matheusssilv97@gmail.com> escreveu:
The current patch supports the following syntaxes:
SELECT a, b, c
wf() OVER () as d
FROM tab
QUALIFY d = 1
When using the "QUALIFY d = 1" form, I currently rewrite the expression
as "wf() OVER () = 1" by searching the targetlist for the matching
alias, replacing the Var with the corresponding WindowFunc.
Not related to $subject but that way you did,
that replacement alias for wf expression, is cool.
With that would it be possible to have where replacements too ? Maybe.
select a+b as ab from t where ab = 5
regards
Marcos
On Tue Jul 22, 2025 at 3:11 PM -03, Marcos Pegoraro wrote: > Em ter., 22 de jul. de 2025 às 08:56, Matheus Alcantara < > matheusssilv97@gmail.com> escreveu: > >> The current patch supports the following syntaxes: >> SELECT a, b, c >> wf() OVER () as d >> FROM tab >> QUALIFY d = 1 >> When using the "QUALIFY d = 1" form, I currently rewrite the expression >> as "wf() OVER () = 1" by searching the targetlist for the matching >> alias, replacing the Var with the corresponding WindowFunc. >> > > Not related to $subject but that way you did, > that replacement alias for wf expression, is cool. > With that would it be possible to have where replacements too ? Maybe. > > select a+b as ab from t where ab = 5 > Do you mean instead of reference the "ab" using a Var it replaces with a OpExpr of "a+b"? I think that it would be possible, but the current implementation is not fully correct, it only works for OpExpr's. Using AND/OR operators does not work and I think that to make it fully correct is more complex, so I think that rewriting the query to use a subquery as Vik suggested on [1] fixes this issue and also makes it easier to understand. (this happens on transformQualifyClause() if you want to take a look) [1] https://www.postgresql.org/message-id/6c998e4f-f6f2-43c2-8b67-cfff360ef241%40postgresfriends.org -- Matheus Alcantara
On 22/07/2025 20:54, Matheus Alcantara wrote: > (this happens on transformQualifyClause() if you want to take a look) I took a quick look at the patch (without applying and testing it) and it seems to me that parse analysis is the wrong place to do this. We want ruleutils to be able to spew out the QUALIFY clause as written in a view and not as transformed. If we are going to go down the syntax transformation route, that should happen in the rewriter at planning/execution time. -- Vik Fearing
On Wed, 23 Jul 2025 at 09:21, Vik Fearing <vik@postgresfriends.org> wrote: > I took a quick look at the patch (without applying and testing it) and > it seems to me that parse analysis is the wrong place to do this. We > want ruleutils to be able to spew out the QUALIFY clause as written in a > view and not as transformed. If we are going to go down the syntax > transformation route, that should happen in the rewriter at > planning/execution time. I agree with Tom on not jumping the gun on the standard thing, but if that does materialise one day, then whichever method is used, you'd still want the same pushdown optimisations to happen that currently happen with qual pushdown into subqueries. Looking at the latest patch I see that pushdowns don't work: # explain select row_number() over (order by oid) rb from pg_Class qualify row_number () over (order by oid)=1; # explain (analyze, costs off, buffers off, summary off) select row_number() over (order by oid) rb from pg_Class qualify row_number () over (order by oid)=1; WindowAgg (actual time=0.041..0.273 rows=1.00 loops=1) Window: w1 AS (ORDER BY oid ROWS UNBOUNDED PRECEDING) Filter: (row_number() OVER w1 = 1) Rows Removed by Filter: 415 Storage: Memory Maximum Storage: 17kB -> Index Only Scan using pg_class_oid_index on pg_class (actual time=0.032..0.125 rows=416.00 loops=1) Heap Fetches: 0 Index Searches: 1 Whereas, with a subquery we get: # explain (analyze, costs off, buffers off, summary off) select * from (select row_number() over (order by oid) rn from pg_class) r where r.rn=1; Subquery Scan on r (actual time=0.042..0.044 rows=1.00 loops=1) Filter: (r.rn = 1) -> WindowAgg (actual time=0.041..0.043 rows=1.00 loops=1) Window: w1 AS (ORDER BY pg_class.oid ROWS UNBOUNDED PRECEDING) Run Condition: (row_number() OVER w1 <= 1) Storage: Memory Maximum Storage: 17kB -> Index Only Scan using pg_class_oid_index on pg_class (actual time=0.030..0.031 rows=2.00 loops=1) Heap Fetches: 0 Index Searches: 1 Also, this seems busted: # select row_number() over (order by oid) rn from pg_class qualify rn=1; server closed the connection unexpectedly David
On 2025-Jul-22, Andrew Dunstan wrote: > If we were making up our own syntax this would be a sensible thing to > debate. If we're talking about implementing something we expect to be in the > standard, I think we will have to live with what the standards committee > decides, regardless of our preference. We've almost certainly been preempted > here by other RDBMSs using QUALIFY, heedless of English grammar. The Romans, the Vikings, the Normans, all have influenced the English language. Why not SQL? -- Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/ “Cuando no hay humildad las personas se degradan” (A. Christie)
On Mon Jul 21, 2025 at 7:11 PM -03, Vik Fearing wrote: > That is my preferred grammar, thank you. I have not looked at the C > code by this can be obtained with a syntax transformation. To wit: > > > SELECT a, b, c > FROM tab > QUALIFY wf() OVER () = ? > > > can be rewritten as: > > > SELECT a, b, c > FROM ( > SELECT a, b, c, wf() OVER () = ? AS qc > FROM tab > ) AS q > WHERE qc > > > and then let the optimizer take over. The standard does this kind of > thing all over the place; I don't know what the postgres project's > position on doing things like this are. > With this transformation users will see a Subquery plan node even if it's not present on the original query, is that expected or it can be confusing to users? -- Matheus Alcantara
On 25/07/2025 14:55, Matheus Alcantara wrote: > On Mon Jul 21, 2025 at 7:11 PM -03, Vik Fearing wrote: >> SELECT a, b, c >> FROM tab >> QUALIFY wf() OVER () = ? >> >> >> can be rewritten as: >> >> >> SELECT a, b, c >> FROM ( >> SELECT a, b, c, wf() OVER () = ? AS qc >> FROM tab >> ) AS q >> WHERE qc >> >> >> and then let the optimizer take over. The standard does this kind of >> thing all over the place; I don't know what the postgres project's >> position on doing things like this are. > With this transformation users will see a Subquery plan node even if > it's not present on the original query, is that expected or it can be > confusing to users? This is a definition technique, it does not need to be implemented as a subquery. -- Vik Fearing
On 22/07/25 19:32, David Rowley wrote: > Looking at the latest patch I see that pushdowns don't work: > > # explain select row_number() over (order by oid) rb from pg_Class > qualify row_number () over (order by oid)=1; > > # explain (analyze, costs off, buffers off, summary off) select > row_number() over (order by oid) rb from pg_Class qualify row_number > () over (order by oid)=1; > > WindowAgg (actual time=0.041..0.273 rows=1.00 loops=1) > Window: w1 AS (ORDER BY oid ROWS UNBOUNDED PRECEDING) > Filter: (row_number() OVER w1 = 1) > Rows Removed by Filter: 415 > Storage: Memory Maximum Storage: 17kB > -> Index Only Scan using pg_class_oid_index on pg_class (actual > time=0.032..0.125 rows=416.00 loops=1) > Heap Fetches: 0 > Index Searches: 1 > > Whereas, with a subquery we get: > > # explain (analyze, costs off, buffers off, summary off) > select * from (select row_number() over (order by oid) rn from > pg_class) r where r.rn=1; > > Subquery Scan on r (actual time=0.042..0.044 rows=1.00 loops=1) > Filter: (r.rn = 1) > -> WindowAgg (actual time=0.041..0.043 rows=1.00 loops=1) > Window: w1 AS (ORDER BY pg_class.oid ROWS UNBOUNDED PRECEDING) > Run Condition: (row_number() OVER w1 <= 1) > Storage: Memory Maximum Storage: 17kB > -> Index Only Scan using pg_class_oid_index on pg_class > (actual time=0.030..0.031 rows=2.00 loops=1) > Heap Fetches: 0 > Index Searches: 1 > By "pushdowns" you mean missing the Run Conditions on the QUALIFY example? IIUC the Run Condition is only created if it's a subquery. I've checked this on set_rel_size() -> set_subquery_pathlist() -> check_and_push_window_quals(). > Also, this seems busted: > > # select row_number() over (order by oid) rn from pg_class qualify rn=1; > server closed the connection unexpectedly > Thanks for testing! I'm working on this and some other issues. -- Matheus Alcantara
On Tue, 29 Jul 2025 at 12:11, Matheus Alcantara <matheusssilv97@gmail.com> wrote: > By "pushdowns" you mean missing the Run Conditions on the QUALIFY > example? IIUC the Run Condition is only created if it's a subquery. I've > checked this on set_rel_size() -> set_subquery_pathlist() -> > check_and_push_window_quals(). Yes, but not only Run Conditions, it's subquery pushdown quals in general. There are various rules to what is allowed and what must be disallowed. See check_output_expressions(). You should be pushing the qual to the lowest level that it's valid to evaluate it at. We do this already for HAVING quals where those will effectively be "transferred" to the WHERE clause when it's valid to do so. I'd expect the same for QUALIFY. I'm unsure which parts of subquery pushdown could be made more reusable to help you with this. Ideally we'd not have to duplicate lots of logic in various places. If you do manage to get around the whole SQL standard issue around QUALIFY, then a large portion of a patch like this being acceptable will largely depend on how much code gets reused vs how much you have to rewrite from scratch. It's not that fun to have to duplicate logic in multiple places when new optimisations are added. See d222585a9 for an example of an optimisation that would likely have to be duplicated if QUALIFY existed. David
On Tue, Jul 29, 2025 at 9:47 AM David Rowley <dgrowleyml@gmail.com> wrote: > You should be pushing the qual to the lowest level that it's valid to > evaluate it at. We do this already for HAVING quals where those will > effectively be "transferred" to the WHERE clause when it's valid to do > so. I'd expect the same for QUALIFY. Yeah, we should have the same kind of optimization for a QUALIFY clause as we do for HAVING - pushing it down to WHERE when possible. One condition for doing this is that the QUALIFY clause does not reference any columns that are not present in the window PARTITION BY clauses. There may be other conditions under which we can have other optimizations, such as pushing down a QUALIFY clause as run conditions. I think we should at a minimum inherit all existing optimizations for window functions used in subqueries; otherwise, it will be difficult to convince others to accept this patch. Thanks Richard