Re: Row pattern recognition

Поиск
Список
Период
Сортировка
От Tatsuo Ishii
Тема Re: Row pattern recognition
Дата
Msg-id 20230728.160953.1112305070052061571.t-ishii@sranhm.sra.co.jp
обсуждение исходный текст
Ответ на Re: Row pattern recognition  (Vik Fearing <vik@postgresfriends.org>)
Ответы Re: Row pattern recognition  (Vik Fearing <vik@postgresfriends.org>)
Список pgsql-hackers
>> We already recalculate a frame each time a row is processed even
>> without RPR. See ExecWindowAgg.
> 
> Yes, after each row.  Not for each function.

Ok, I understand now. Closer look at the code, I realized that each
window function calls update_frameheadpos, which computes the frame
head position. But actually it checks winstate->framehead_valid and if
it's already true (probably by other window function), then it does
nothing.

>> Also RPR always requires a frame option ROWS BETWEEN CURRENT ROW,
>> which means the frame head is changed each time current row position
>> changes.
> 
> Off topic for now: I wonder why this restriction is in place and
> whether we should respect or ignore it.  That is a discussion for
> another time, though.

My guess is, it is because other than ROWS BETWEEN CURRENT ROW has
little or no meaning. Consider following example:

SELECT i, first_value(i) OVER w
  FROM (VALUES (1), (2), (3), (4)) AS v (i)
  WINDOW w AS (
   ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
   AFTER MATCH SKIP PAST LAST ROW
   PATTERN (A)
   DEFINE
    A AS i = 1 OR i = 3
);

In this example ROWS BETWEEN CURRENT ROW gives frames with i = 1 and i
= 3.

 i | first_value 
---+-------------
 1 |           1
 2 |            
 3 |           3
 4 |            
(4 rows)

But what would happen with ROWS BETWEEN UNBOUNDED PRECEDING AND
UNBOUNDED FOLLOWING?  Probably the frame i = 3 will be missed as
at i = 2, PATTERN is not satisfied and compution of the reduced frame
stops.

 i | first_value 
---+-------------
 1 |           1
 2 |            
 3 |           
 4 |            
(4 rows)

This is not very useful for users.

>>> I strongly disagree with this.  Window function do not need to know
>>> how the frame is defined, and indeed they should not.
>> We already break the rule by defining *support functions. See
>> windowfuncs.c.
> The support functions don't know anything about the frame, they just
> know when a window function is monotonically increasing and execution
> can either stop or be "passed through".

I see following code in window_row_number_support:

        /*
         * The frame options can always become "ROWS BETWEEN UNBOUNDED
         * PRECEDING AND CURRENT ROW".  row_number() always just increments by
         * 1 with each row in the partition.  Using ROWS instead of RANGE
         * saves effort checking peer rows during execution.
         */
        req->frameOptions = (FRAMEOPTION_NONDEFAULT |
                             FRAMEOPTION_ROWS |
                             FRAMEOPTION_START_UNBOUNDED_PRECEDING |
                             FRAMEOPTION_END_CURRENT_ROW);

I think it not only knows about frame but it even changes the frame
options. This seems far from "don't know anything about the frame", no?

> I have two comments about this:
> 
> It isn't just for convenience, it is for correctness.  The window
> functions do not need to know which rows they are *not* operating on.
> 
> There is no such thing as a "full" or "reduced" frame.  The standard
> uses those terms to explain the difference between before and after
> RPR is applied, but window functions do not get to choose which frame
> they apply over.  They only ever apply over the reduced window frame.

I agree that "full window frame" and "reduced window frame" do not
exist at the same time, and in the end (after computation of reduced
frame), only "reduced" frame is visible to window
functions/aggregates. But I still do think that "full window frame"
and "reduced window frame" are important concept to explain/understand
how PRP works.

Best reagards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Amit Langote
Дата:
Сообщение: Re: remaining sql/json patches
Следующее
От: Bharath Rupireddy
Дата:
Сообщение: Re: Support to define custom wait events for extensions