Обсуждение: Re: Row pattern recognition

Поиск

Список

Период

Сортировка

Re: Row pattern recognition

От

Tatsuo Ishii

Дата:

13 марта 2025 г., 15:51:29

Attached are the v29 patch sets to implement the subset of row pattern
recognition feature defined in the SQL standard.

In this patch set:

- Adjust 0003 and 0004 to deal with the recent changes made by 8b1b342.
 
- Add tests to 0007 for CREATE VIEW/pg_get_viewdef.

Best reagards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

Re: Row pattern recognition

От

Tatsuo Ishii

Дата:

23 апреля 2025 г., 02:55:33

Attached are the v30 patches, just adding a patch to change the
default io_method parameter to sync, expecting this affects something
to the recent CFbot failure.
https://commitfest.postgresql.org/patch/4460/
https://cirrus-ci.com/task/6078653164945408
which is similar to:
https://www.postgresql.org/message-id/20250422.111139.1502127917165838403.ishii%40postgresql.org

(Once the issue resolved, the patch should be removed, of course)

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

Re: Row pattern recognition

От

Tatsuo Ishii

Дата:

23 апреля 2025 г., 04:07:29

> Attached are the v30 patches, just adding a patch to change the
> default io_method parameter to sync, expecting this affects something
> to the recent CFbot failure.
> https://commitfest.postgresql.org/patch/4460/
> https://cirrus-ci.com/task/6078653164945408
> which is similar to:
> https://www.postgresql.org/message-id/20250422.111139.1502127917165838403.ishii%40postgresql.org

CFBot has just run tests against v30 patches and now it turns to green
again!  I guess io_method = sync definitely did the trick.  Note that
previously only the Windows Server 2019, VS 2019 - Neason & ninja test
had failed.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Re: Row pattern recognition

От

Tatsuo Ishii

Дата:

06 мая 2025 г., 03:53:51

> Attached are the v30 patches, just adding a patch to change the
> default io_method parameter to sync, expecting this affects something
> to the recent CFbot failure.
> https://commitfest.postgresql.org/patch/4460/
> https://cirrus-ci.com/task/6078653164945408
> which is similar to:
> https://www.postgresql.org/message-id/20250422.111139.1502127917165838403.ishii%40postgresql.org
> 
> (Once the issue resolved, the patch should be removed, of course)

It seems this has turned to green since May 2, 2025.
https://commitfest.postgresql.org/patch/5679/.

The last time it turned to red was April 16, 2025. Maybe something
committed between the period solved the cause of red, but I don't know
exactly which commit.

Anyway v31 patches now remove the change to default io_method
parameter to see if it passes Windows Server 2019, VS 2019 - Meson &
ninja test.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

Re: Row pattern recognition

От

Tatsuo Ishii

Дата:

06 мая 2025 г., 10:28:42

>> Attached are the v30 patches, just adding a patch to change the
>> default io_method parameter to sync, expecting this affects something
>> to the recent CFbot failure.
>> https://commitfest.postgresql.org/patch/4460/
>> https://cirrus-ci.com/task/6078653164945408
>> which is similar to:
>> https://www.postgresql.org/message-id/20250422.111139.1502127917165838403.ishii%40postgresql.org
>> 
>> (Once the issue resolved, the patch should be removed, of course)
> 
> It seems this has turned to green since May 2, 2025.
> https://commitfest.postgresql.org/patch/5679/.
> 
> The last time it turned to red was April 16, 2025. Maybe something
> committed between the period solved the cause of red, but I don't know
> exactly which commit.
> 
> Anyway v31 patches now remove the change to default io_method
> parameter to see if it passes Windows Server 2019, VS 2019 - Meson &
> ninja test.

Now I see it passes the test.
https://cirrus-ci.com/build/5671612202090496

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Re: Row pattern recognition

От

Tatsuo Ishii

Дата:

16 августа 2025 г., 11:45:52

Attached are the v32 patches for Row pattern recognition.  Recent
changes to doc/src/sgml/func.sgml required v31 to be rebased. Other
than that, nothing has been changed.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

Re: Row pattern recognition

От

Tatsuo Ishii

Дата:

24 сентября 2025 г., 13:35:23

Attached are the v33 patches for Row pattern recognition.  The
difference from v32 is that the raw parse tree printing patch is not
included in v33. This is because now that we have
debug_print_raw_parse.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

Re: Row pattern recognition

От

Tatsuo Ishii

Дата:

17 ноября 2025 г., 09:57:15

Attached are the v34 patches for Row pattern recognition.
Notihing has been changed. Just rebased.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

Re: Row pattern recognition

От

Tatsuo Ishii

Дата:

18 ноября 2025 г., 05:33:20

Attached are the v35 patches for Row pattern recognition.  In v34-0001
gram.y patch, %type for RPR were misplaced. v35-0001 fixes this. Other
patches are not changed.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

>> I just finished reviewing 0007 and 0008. This patch set really demonstrates the full lifecycle of adding a SQL
feature,from changing the syntax in gram.y all the way down to the executor, including tests and docs. I learned a lot
fromit. Thanks!
 
> 
> You are welcome!
> 
>> 23 - 0007
>> 
>> As you mentioned earlier, pattern regular expression support *, + and ?, but I don’t see ? is tested.
> 
> Good catch. I will add the test case. In the mean time I found a bug
> with gram.y part which handles '?' quantifier.  gram.y can be
> successfully compiled but there's no way to put '?' quantifier in a
> SQL statement.  We cannot write directly "ColId '?'" like other
> quantifier '*' and '+' in the grammer because '?' is not a "self"
> token.  So we let '?' matches Op first then check if it's '?'  or
> not. 
> 
>             | ColId Op
>                 {
>                     if (strcmp("?", $2))
>                         ereport(ERROR,
>                                 errcode(ERRCODE_SYNTAX_ERROR),
>                                 errmsg("unsupported quantifier"),
>                                 parser_errposition(@2));
> 
>                     $$ = (Node *) makeSimpleA_Expr(AEXPR_OP, "?", (Node *)makeString($1), NULL, @1);
>                 }
> 
> Another bug was with executor (nodeWindowAgg.c). The code to check
> greedy quantifiers missed the case '?'.
> 
>> 24 - 0007
>> 
>> I don’t see negative tests for unsupported quantifiers, like PATTERN (A+?).
> 
> I will add such test cases.
> 
>> 25 - 0007
>> ```
>> +-- basic test with none greedy pattern
>> ```
>> 
>> Typo: “none greedy” -> non-greedy
> 
> Will fix.
> 
>> No comment for 0008.
> 
> Thanks again for the review. I will post v35 patch set soon.  Attached
> is the diff from v34.

Attached are the v35 patches for Row pattern recognition.  Chao Li
reviewed v34 thoroughly. Thank you! v35 reflects the review
comments. Major differences from v34 include:

- Make "DEFINE" an unreserved keyword. Previously it was a reserved keyword.
- Refactor transformDefineClause() to make two foreach loops into single foreach loop.
- Make '?' quantifier in PATTERN work as advertised. Test for '?' quantifier is added too.
- Unsupported quantifier test added.
- Fix get_rule_define().
- Fix memory leak related to regcomp.
- Move regcomp compiled result cache from static data to WindowAggState.
- Fix several typos.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

Re: Row pattern recognition

От

Tatsuo Ishii

Дата:

23 декабря 2025 г., 16:26:54

Attached are the v36 patches, just for rebasing.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

Re: Row pattern recognition

От

Henson Choi

Дата:

31 декабря 2025 г., 15:10:53

SQL/RPR Patch Review Report
============================

Patch: Row Pattern Recognition (SQL:2016)
Commitfest: https://commitfest.postgresql.org/patch/4460

Review Methodology:
This review focused on quality assessment, not line-by-line code audit.
Key code paths and quality issues were examined with surrounding context
when concerns arose. Documentation files were reviewed with AI-assisted
grammar and typo checking. Code coverage was measured using gcov and
custom analysis tools.

Limitations:
- Not a security audit or formal verification
- Parser and executor internals reviewed at module level, not exhaustively
- Performance characteristics not benchmarked

TABLE OF CONTENTS
-----------------

1. Executive Summary
2. Issues Found
2.1 Critical / Major
2.2 Minor Issues (Review Needed)
2.3 Minor Issues (Style)
2.4 Suggestions for Discussion
3. Test Coverage
3.1 Covered Areas
3.2 Untested Items
3.3 Unimplemented Features (No Test Needed)
4. Code Coverage Analysis
4.1 Overall Coverage
4.2 Coverage by File
4.3 Uncovered Code Risk Assessment
5. Commit Analysis
6. Recommendations
7. Conclusion

1. EXECUTIVE SUMMARY
--------------------

Overall assessment: GOOD

The SQL/RPR patch demonstrates solid implementation quality within its
defined scope. Code follows PostgreSQL coding standards (with minor style
issues), test coverage is comprehensive at 96.4%, and documentation is
thorough with only minor typos.

Rating by Area:
- Code Quality: Good (PostgreSQL style compliant, 26 static style fixes recommended)
- Test Coverage: Good (96.4% line coverage, 28 test cases)
- Documentation: Good (Complete, 1 minor typo)
- Build/Regress: Pass (make check-world, 753 tests passed)

2. ISSUES FOUND
---------------

2.1 Critical / Major

#1 [Code] Greedy pattern combinatorial explosion
Pattern: PATTERN (A+ B+ C+) with DEFINE A AS TRUE, B AS TRUE, C AS TRUE
Impact: Memory exhaustion or infinite wait
Recommendation: Add work_mem-based memory limit (error on exceed)

Evidence - No memory limit in current code:
- nodeWindowAgg.c:5718-5722 string_set_init(): palloc() unconditional
- nodeWindowAgg.c:5741-5750 string_set_add(): set_size *= 2; repalloc() unlimited
- nodeWindowAgg.c:5095-5174 generate_patterns(): Triple loop, no limit

Only work_mem usage in RPR (nodeWindowAgg.c:1297):
winstate->buffer = tuplestore_begin_heap(false, false, work_mem);
-> For tuple buffer, unrelated to pattern combination memory (StringSet)

2.2 Minor Issues (Review Needed)

#1 [Code] nodeWindowAgg.c:5849,5909,5912
pos > NUM_ALPHABETS check - intent unclear, causes error at 28 PATTERN elements

Reproduction:
- PATTERN (A B C ... Z A) (27 elements) -> OK
- PATTERN (A B C ... Z A B) (28 elements) -> ERROR "initial is not valid char: b"

2.3 Minor Issues (Style)

#1 [Code] nodeWindowAgg.c (25 blocks)
#ifdef RPR_DEBUG -> recommend elog(DEBUG2, ...) or remove

#2 [Code] src/backend/executor/nodeWindowAgg.c
static keyword on separate line (26 functions)

#3 [Code] src/backend/utils/adt/ruleutils.c
Whitespace typo: "regexp !=NULL" -> "regexp != NULL"

#4 [Code] nodeWindowAgg.c:4619
Error message case: "Unrecognized" -> "unrecognized" (PostgreSQL style)

#5 [Doc] doc/src/sgml/ref/select.sgml:1128
Typo: "maximu" -> "maximum"

2.4 Suggestions for Discussion

#1 Incremental matching with streaming NFA redesign
Benefits:
- O(n) time complexity per row (current: exponential in worst case)
- Bounded memory via state merging and context absorption
- Natural extension for OR patterns, {n,m} quantifiers, nested groups
- Enables MEASURES clause with incremental aggregates
- Proven approach in CEP engines (Flink, Esper)

3. TEST COVERAGE
----------------

3.1 Covered Areas

- PATTERN clause: +, *, ? quantifiers (line 41, 71, 86)
- DEFINE clause: Variable conditions, auto-TRUE for missing (line 120-133)
- PREV/NEXT functions: Single argument (line 44, 173)
- AFTER MATCH SKIP: TO NEXT ROW (line 182), PAST LAST ROW (line 198)
- Aggregate integration: sum, avg, count, max, min (line 258-277)
- Window function integration: first_value, last_value, nth_value (line 34-35)
- JOIN/CTE: JOIN (line 313), WITH (line 324)
- View: VIEW creation, pg_get_viewdef (line 390-406)
- Error cases: 7 error scenarios (line 409-532)
- Large partition: 5000 row smoke test (line 360-387)
- ROWS BETWEEN offset: CURRENT ROW AND 2 FOLLOWING (line 244)

3.2 Untested Items

Feature Priority Recommendation
-------------------------------------------------------------------------------
26 variable limit Medium Test 26 variables success, 27th variable error
NULL value handling Low Test PREV(col) where col or previous row is NULL

Sample test for 26 variable limit:

-- Should fail with 27th variable (parser error, no table needed)
SELECT * FROM (SELECT 1 AS x) t
WINDOW w AS (
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
PATTERN (a b c d e f g h i j k l m n o p q r s t u v w x y z aa)
DEFINE a AS TRUE, b AS TRUE, c AS TRUE, d AS TRUE, e AS TRUE, f AS TRUE,
g AS TRUE, h AS TRUE, i AS TRUE, j AS TRUE, k AS TRUE, l AS TRUE,
m AS TRUE, n AS TRUE, o AS TRUE, p AS TRUE, q AS TRUE, r AS TRUE,
s AS TRUE, t AS TRUE, u AS TRUE, v AS TRUE, w AS TRUE, x AS TRUE,
y AS TRUE, z AS TRUE, aa AS TRUE
);
-- ERROR: number of row pattern definition variable names exceeds 26

Sample test for NULL handling:

CREATE TEMP TABLE stock_null (company TEXT, tdate DATE, price INTEGER);
INSERT INTO stock_null VALUES ('c1', '2023-07-01', 100);
INSERT INTO stock_null VALUES ('c1', '2023-07-02', NULL); -- NULL in middle
INSERT INTO stock_null VALUES ('c1', '2023-07-03', 200);
INSERT INTO stock_null VALUES ('c1', '2023-07-04', 150);

SELECT company, tdate, price, count(*) OVER w AS match_count
FROM stock_null
WINDOW w AS (
PARTITION BY company
ORDER BY tdate
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
PATTERN (START UP DOWN)
DEFINE START AS TRUE, UP AS price > PREV(price), DOWN AS price < PREV(price)
);
-- Result: all rows show match_count = 0 (NULL breaks pattern matching)

3.3 Unimplemented Features (No Test Needed)

Per documentation (select.sgml:1133-1136):
- MEASURES clause: Not implemented
- SUBSET clause: Not implemented
- AFTER MATCH variants: Only SKIP TO NEXT ROW, PAST LAST ROW supported

Not in documentation, verified by testing:
- {n,m} quantifier: Parser error ("syntax error at or near {")
- (A|B) OR pattern: Not supported (parsed as single variable name "a|b")
- (A B)+ compound repetition: Parser accepts, but execution returns 0 matches

4. CODE COVERAGE ANALYSIS
-------------------------

4.1 Overall Coverage

96.4% (732 / 759 lines)

4.2 Coverage by File (major RPR-modified files)

nodeWindowAgg.c: 96.6% (560/580) - Pattern matching execution engine
parse_clause.c: 96.7% (88/91) - PATTERN/DEFINE analysis
ruleutils.c: 93.1% (54/58) - pg_get_viewdef output

4.3 Uncovered Code Risk Assessment

Total: 27 lines uncovered

Medium Risk (2 items) - Test addition recommended (see section 3.2):
- parse_clause.c:4093: transformDefineClause - 27th variable error
- nodeWindowAgg.c:5623: get_slots - null_slot for PREV at partition first row

Low Risk (25 items) - Defensive code / Unreachable:
- nodeWindowAgg.c:3074: attno_map_walker - Parser validates arg count
- nodeWindowAgg.c:3137-3138: rpr_navigation_walker - switch default
- nodeWindowAgg.c:3566: window_gettupleslot - Before mark position
- nodeWindowAgg.c:4289: WinGetFuncArgInFrame - isout flag
- nodeWindowAgg.c:4393: WinGetSlotInFrame - Boundary check
- nodeWindowAgg.c:4618-4619: row_is_in_reduced_frame - switch default
- nodeWindowAgg.c:4697: register_reduced_frame_map - pos < 0 check
- nodeWindowAgg.c:5007: search_str_set - NULL return continue
- nodeWindowAgg.c:5405: do_pattern_match - Regex compile error
- nodeWindowAgg.c:5435,5437-5438: do_pattern_match - Regex exec error
- nodeWindowAgg.c:5700: pattern_initial - Variable not found
- nodeWindowAgg.c:5776: string_set_get - Index range check
- nodeWindowAgg.c:5850: variable_pos_register - a-z range check
- nodeWindowAgg.c:5910: variable_pos_fetch - a-z range check
- nodeWindowAgg.c:5913: variable_pos_fetch - Index range check
- parse_clause.c:3989: transformDefineClause - A_Expr type check
- parse_clause.c:4145: transformPatternClause - A_Expr type check
- ruleutils.c:6904-6908: get_rule_windowspec - SKIP TO NEXT ROW output

Conclusion: Most uncovered code consists of defensive error handling or
code unreachable due to parser pre-validation. No security or functional risk.

5. COMMIT ANALYSIS
------------------

8 sequential commits:

Commit Area Files +/- Key Content
-------------------------------------------------------------------------------
1 Raw Parser 4 +174/-16 gram.y grammar (PATTERN/DEFINE)
2 Parse/Analysis 4 +277/-1 parse_clause.c analysis logic
3 Rewriter 1 +109/-0 pg_get_viewdef extension
4 Planner 5 +73/-3 WindowAgg plan node extension
5 Executor 4 +1,942/-11 CORE: Pattern matching engine (+1,850)
6 Docs 3 +192/-7 advanced.sgml, func-window.sgml
7 Tests 3 +1,585/-1 rpr.sql (532), rpr.out (1,052)
8 typedefs 1 +6/-0 pgindent support

Code Change Statistics:
- Total files: 25
- Lines added: 4,358
- Lines deleted: 39
- Net increase: +4,319 lines

6. RECOMMENDATIONS
------------------

6.1 Combinatorial Explosion Prevention (Major, Required)

Add work_mem-based memory limit for StringSet allocation.
Location: string_set_add() in nodeWindowAgg.c:5741-5750
Consistent with existing PostgreSQL approach (Hash Join, Sort, etc.)

6.2 Code Review Required (Minor, Decision Needed)

Location: nodeWindowAgg.c:5849,5909,5912
Issue: pos > NUM_ALPHABETS check intent unclear
Current: PATTERN with 28 elements causes error
Question: Is 27 element limit intentional?

6.3 Code Style Fixes (Minor)

- #ifdef RPR_DEBUG: Use elog(DEBUG2, ...) or remove (25 blocks)
- static keyword on separate line: 26 functions to fix
- Whitespace: "regexp !=NULL" -> "regexp != NULL"
- Error message case: "Unrecognized" -> "unrecognized"

6.4 Documentation Fixes (Minor)

- select.sgml: "maximu" -> "maximum"

6.5 Test Additions (Optional)

Black-box Tests (Functional):

Feature Test Case Priority
-------------------------------------------------------------------------------
Variable limit 26 variables success, 27 error Medium
NULL boundary PREV at partition first row Medium

White-box Tests (Coverage) - covered by above black-box tests:

File:Line Target Branch
-------------------------------------------------------------------------------
parse_clause.c:4093 Limit error branch (Variable limit test)
nodeWindowAgg.c:5623 null_slot usage (NULL boundary test)

7. CONCLUSION
-------------

Test Quality: GOOD

Core functionality is thoroughly tested with comprehensive error case coverage.

The patch is well-implemented within its defined scope. Identified issues include
one major concern (combinatorial explosion) and minor style/documentation items.

Recommended actions before commit:
1. Add work_mem-based memory limit for pattern combinations (required)
2. Clarify pos > NUM_ALPHABETS check intent (review needed)
3. Fix code style issues (optional but recommended)
4. Fix documentation typo (trivial)
5. Add tests for variable limit and NULL handling (optional)

Points for discussion (optional):
6. Incremental matching with streaming NFA redesign

Attachment:
- coverage.tgz (gcov HTML report, RPR-modified code only)

---
End of Report

2025년 9월 24일 (수) PM 7:36, Tatsuo Ishii <ishii@postgresql.org>님이 작성:

Attached are the v33 patches for Row pattern recognition. The
difference from v32 is that the raw parse tree printing patch is not
included in v33. This is because now that we have
debug_print_raw_parse.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

coverage.tgz

Re: Row pattern recognition

От

Henson Choi

Дата:

01 января, 09:45:14

Hi hackers,

I ran Valgrind memcheck against the RPR (Row Pattern Recognition) implementation
to verify there are no memory issues introduced by the new Row Pattern Recognition feature.

== Test Environment ==

- PostgreSQL: 19devel (master branch)
- Valgrind: 3.22.0
- Platform: Linux x86_64 (RHEL 9)
- Tests executed: rpr.sql

== Executive Summary ==

+----------------------------------------------------------+
| RPR Implementation: NO MEMORY ISSUES DETECTED |
+----------------------------------------------------------+

All detected leaks originate from existing PostgreSQL core components.
No RPR-related functions appear in any leak stack traces.

== Backend Process 541453: Itemized Leak Analysis ==

Total: 37 bytes definitely lost in 1 block, 1 unique leak context

+================================================================================+
| # | Bytes | Blocks | Allocator | Root Cause | RPR? | Verdict |
+================================================================================+
| 1 | 37 | 1 | strdup | Postmaster init | NO | HARMLESS |
+================================================================================+
TOTAL: 37 bytes in 1 block
RPR-RELATED LEAKS: 0

== RPR Verification ==

All leak stack traces were examined. Key functions NOT present:

- ExecInitWindowAgg (RPR path)
- row_pattern_initial, row_pattern_final
- Any function from src/backend/executor/nodeWindowAgg.c (RPR code paths)
- Any RPR-related parser functions (PATTERN, DEFINE)

The RPR tests exercised window functions with PATTERN, DEFINE clauses
(e.g., PATTERN (START UP+ DOWN+)), yet none of these code paths appear in leaks.

== Process Summary ==

+--------+----------------+-----------------+----------+--------------------+
| PID | Type | Definitely Lost | Errors | Notes |
+--------+----------------+-----------------+----------+--------------------+
| 541303 | postmaster | 37 bytes | 1 | strdup at startup |
| 541453 | backend | 37 bytes | 12 | See above analysis |
| 541314 | checkpointer | 37 bytes | 1 | strdup at startup |
| 541315 | bgwriter | 37 bytes | 1 | strdup at startup |
| 541316 | walwriter | 37 bytes | 1 | strdup at startup |
| 541317 | autovacuum | 37 bytes | 1 | strdup at startup |
| 541318 | logical rep | 37 bytes | 1 | strdup at startup |
| 541319 | syslogger | 37 bytes | 1 | strdup at startup |
| Others | aux workers | 37 bytes each | 1 each | strdup at startup |
+--------+----------------+-----------------+----------+--------------------+

== Possibly Lost (LLVM JIT) ==

Additionally, ~16KB in ~165 blocks reported as "possibly lost" from LLVM JIT:
- llvm::MDTuple, llvm::User allocations
- Origin: llvm_compile_expr -> jit_compile_expr -> ExecReadyExpr
- These are LLVM's internal caches, not PostgreSQL leaks

== Other Memory Errors ==

+--------------------------------------------------------------------------+
| NO memory access errors detected: |
| - Invalid read/write: 0 |
| - Use of uninitialized value: 0 |
| - Conditional jump on uninitialized value: 0 |
| - Syscall param errors: 0 |
| - Overlap in memcpy/strcpy: 0 |
+--------------------------------------------------------------------------+

Suppressed errors (via valgrind.supp):
- 541453 (backend): 46 errors from 3 contexts suppressed
- Other processes: ~0 errors each suppressed

Suppression rules in valgrind.supp (12 categories):
1. Padding - write() syscalls with uninitialized struct padding
2. CRC - CRC32 calculation on padded buffers
3. Overlap - memcpy overlap in bootstrap/relmap
4. glibc - Dynamic loader internals
5. Regex/Pattern - RE_compile_and_cache, patternsel
6. Planner - query_planner memcpy overlap
7. Cache - RelCache, CatCache, TypeCache possible leaks
8. XLogReader - WAL reader allocations
9. Parallel - ParallelWorkerMain allocations
10. AutoVacuum - AutoVacWorkerMain allocations
11. GUC/Backend - set_config_option, InitPostgres
12. PL/pgSQL - plpgsql_compile, SPI_prepare, CachedPlan

All suppressed errors are known, harmless PostgreSQL behaviors.
See attached valgrind.supp for details.

== Conclusion ==

The RPR implementation introduces NO memory leaks. All detected issues
are pre-existing PostgreSQL behaviors related to:

1. Postmaster startup allocations
2. LLVM JIT internal caching

These are by-design patterns that trade minimal memory for performance.

== Files Attached ==

- valgrind.supp: Suppression file used for this test
- 541303.log: Postmaster log
- 541453.log: Backend log with full leak details (56KB)
- 5413xx.log: Auxiliary process logs

--
Regards

2025년 12월 23일 (화) PM 10:28, Tatsuo Ishii <ishii@postgresql.org>님이 작성:

Attached are the v36 patches, just for rebasing.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Вложения

valgrind.tgz

Re: Row pattern recognition

От

Tatsuo Ishii

Дата:

02 января, 04:26:49

Hi Henson Choi,

> Hi hackers,
> 
> I ran Valgrind memcheck against the RPR (Row Pattern Recognition)
> implementation
> to verify there are no memory issues introduced by the new Row Pattern
> Recognition feature.

Thanks for the test!

> == Conclusion ==
> 
> The RPR implementation introduces NO memory leaks.

Glad to hear that.
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

Re: Row pattern recognition

От

Tatsuo Ishii

Дата:

02 января, 05:16:15

Hi Henson,

Thank you for the review! I will take care the issues (it will take
sometime).

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

> SQL/RPR Patch Review Report
> ============================
> 
> Patch: Row Pattern Recognition (SQL:2016)
> Commitfest: https://commitfest.postgresql.org/patch/4460
> 
> Review Methodology:
>   This review focused on quality assessment, not line-by-line code audit.
>   Key code paths and quality issues were examined with surrounding context
>   when concerns arose. Documentation files were reviewed with AI-assisted
>   grammar and typo checking. Code coverage was measured using gcov and
>   custom analysis tools.
> 
> Limitations:
>   - Not a security audit or formal verification
>   - Parser and executor internals reviewed at module level, not exhaustively
>   - Performance characteristics not benchmarked
> 
> 
> TABLE OF CONTENTS
> -----------------
> 
> 1. Executive Summary
> 2. Issues Found
>    2.1 Critical / Major
>    2.2 Minor Issues (Review Needed)
>    2.3 Minor Issues (Style)
>    2.4 Suggestions for Discussion
> 3. Test Coverage
>    3.1 Covered Areas
>    3.2 Untested Items
>    3.3 Unimplemented Features (No Test Needed)
> 4. Code Coverage Analysis
>    4.1 Overall Coverage
>    4.2 Coverage by File
>    4.3 Uncovered Code Risk Assessment
> 5. Commit Analysis
> 6. Recommendations
> 7. Conclusion
> 
> 
> 1. EXECUTIVE SUMMARY
> --------------------
> 
> Overall assessment: GOOD
> 
> The SQL/RPR patch demonstrates solid implementation quality within its
> defined scope. Code follows PostgreSQL coding standards (with minor style
> issues), test coverage is comprehensive at 96.4%, and documentation is
> thorough with only minor typos.
> 
> Rating by Area:
> - Code Quality:     Good (PostgreSQL style compliant, 26 static style fixes
> recommended)
> - Test Coverage:    Good (96.4% line coverage, 28 test cases)
> - Documentation:    Good (Complete, 1 minor typo)
> - Build/Regress:    Pass (make check-world, 753 tests passed)
> 
> 
> 2. ISSUES FOUND
> ---------------
> 
> 2.1 Critical / Major
> 
> #1 [Code] Greedy pattern combinatorial explosion
>    Pattern: PATTERN (A+ B+ C+) with DEFINE A AS TRUE, B AS TRUE, C AS TRUE
>    Impact: Memory exhaustion or infinite wait
>    Recommendation: Add work_mem-based memory limit (error on exceed)
> 
>    Evidence - No memory limit in current code:
>    - nodeWindowAgg.c:5718-5722 string_set_init(): palloc() unconditional
>    - nodeWindowAgg.c:5741-5750 string_set_add(): set_size *= 2; repalloc()
> unlimited
>    - nodeWindowAgg.c:5095-5174 generate_patterns(): Triple loop, no limit
> 
>    Only work_mem usage in RPR (nodeWindowAgg.c:1297):
>      winstate->buffer = tuplestore_begin_heap(false, false, work_mem);
>    -> For tuple buffer, unrelated to pattern combination memory (StringSet)
> 
> 2.2 Minor Issues (Review Needed)
> 
> #1 [Code] nodeWindowAgg.c:5849,5909,5912
>    pos > NUM_ALPHABETS check - intent unclear, causes error at 28 PATTERN
> elements
> 
>    Reproduction:
>    - PATTERN (A B C ... Z A) (27 elements) -> OK
>    - PATTERN (A B C ... Z A B) (28 elements) -> ERROR "initial is not valid
> char: b"
> 
> 2.3 Minor Issues (Style)
> 
> #1 [Code] nodeWindowAgg.c (25 blocks)
>    #ifdef RPR_DEBUG -> recommend elog(DEBUG2, ...) or remove
> 
> #2 [Code] src/backend/executor/nodeWindowAgg.c
>    static keyword on separate line (26 functions)
> 
> #3 [Code] src/backend/utils/adt/ruleutils.c
>    Whitespace typo: "regexp !=NULL" -> "regexp != NULL"
> 
> #4 [Code] nodeWindowAgg.c:4619
>    Error message case: "Unrecognized" -> "unrecognized" (PostgreSQL style)
> 
> #5 [Doc] doc/src/sgml/ref/select.sgml:1128
>    Typo: "maximu" -> "maximum"
> 
> 2.4 Suggestions for Discussion
> 
> #1 Incremental matching with streaming NFA redesign
>    Benefits:
>    - O(n) time complexity per row (current: exponential in worst case)
>    - Bounded memory via state merging and context absorption
>    - Natural extension for OR patterns, {n,m} quantifiers, nested groups
>    - Enables MEASURES clause with incremental aggregates
>    - Proven approach in CEP engines (Flink, Esper)
> 
> 
> 3. TEST COVERAGE
> ----------------
> 
> 3.1 Covered Areas
> 
> - PATTERN clause: +, *, ? quantifiers (line 41, 71, 86)
> - DEFINE clause: Variable conditions, auto-TRUE for missing (line 120-133)
> - PREV/NEXT functions: Single argument (line 44, 173)
> - AFTER MATCH SKIP: TO NEXT ROW (line 182), PAST LAST ROW (line 198)
> - Aggregate integration: sum, avg, count, max, min (line 258-277)
> - Window function integration: first_value, last_value, nth_value (line
> 34-35)
> - JOIN/CTE: JOIN (line 313), WITH (line 324)
> - View: VIEW creation, pg_get_viewdef (line 390-406)
> - Error cases: 7 error scenarios (line 409-532)
> - Large partition: 5000 row smoke test (line 360-387)
> - ROWS BETWEEN offset: CURRENT ROW AND 2 FOLLOWING (line 244)
> 
> 3.2 Untested Items
> 
> Feature                  Priority   Recommendation
> -------------------------------------------------------------------------------
> 26 variable limit        Medium     Test 26 variables success, 27th
> variable error
> NULL value handling      Low        Test PREV(col) where col or previous
> row is NULL
> 
> Sample test for 26 variable limit:
> 
>     -- Should fail with 27th variable (parser error, no table needed)
>     SELECT * FROM (SELECT 1 AS x) t
>     WINDOW w AS (
>       ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
>       PATTERN (a b c d e f g h i j k l m n o p q r s t u v w x y z aa)
>       DEFINE a AS TRUE, b AS TRUE, c AS TRUE, d AS TRUE, e AS TRUE, f AS
> TRUE,
>              g AS TRUE, h AS TRUE, i AS TRUE, j AS TRUE, k AS TRUE, l AS
> TRUE,
>              m AS TRUE, n AS TRUE, o AS TRUE, p AS TRUE, q AS TRUE, r AS
> TRUE,
>              s AS TRUE, t AS TRUE, u AS TRUE, v AS TRUE, w AS TRUE, x AS
> TRUE,
>              y AS TRUE, z AS TRUE, aa AS TRUE
>     );
>     -- ERROR: number of row pattern definition variable names exceeds 26
> 
> Sample test for NULL handling:
> 
>     CREATE TEMP TABLE stock_null (company TEXT, tdate DATE, price INTEGER);
>     INSERT INTO stock_null VALUES ('c1', '2023-07-01', 100);
>     INSERT INTO stock_null VALUES ('c1', '2023-07-02', NULL);  -- NULL in
> middle
>     INSERT INTO stock_null VALUES ('c1', '2023-07-03', 200);
>     INSERT INTO stock_null VALUES ('c1', '2023-07-04', 150);
> 
>     SELECT company, tdate, price, count(*) OVER w AS match_count
>     FROM stock_null
>     WINDOW w AS (
>       PARTITION BY company
>       ORDER BY tdate
>       ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
>       PATTERN (START UP DOWN)
>       DEFINE START AS TRUE, UP AS price > PREV(price), DOWN AS price <
> PREV(price)
>     );
>     -- Result: all rows show match_count = 0 (NULL breaks pattern matching)
> 
> 3.3 Unimplemented Features (No Test Needed)
> 
> Per documentation (select.sgml:1133-1136):
> - MEASURES clause: Not implemented
> - SUBSET clause: Not implemented
> - AFTER MATCH variants: Only SKIP TO NEXT ROW, PAST LAST ROW supported
> 
> Not in documentation, verified by testing:
> - {n,m} quantifier: Parser error ("syntax error at or near {")
> - (A|B) OR pattern: Not supported (parsed as single variable name "a|b")
> - (A B)+ compound repetition: Parser accepts, but execution returns 0
> matches
> 
> 
> 4. CODE COVERAGE ANALYSIS
> -------------------------
> 
> 4.1 Overall Coverage
> 
> 96.4% (732 / 759 lines)
> 
> 4.2 Coverage by File (major RPR-modified files)
> 
> nodeWindowAgg.c:    96.6% (560/580) - Pattern matching execution engine
> parse_clause.c:     96.7% (88/91) - PATTERN/DEFINE analysis
> ruleutils.c:        93.1% (54/58) - pg_get_viewdef output
> 
> 4.3 Uncovered Code Risk Assessment
> 
> Total: 27 lines uncovered
> 
> Medium Risk (2 items) - Test addition recommended (see section 3.2):
> - parse_clause.c:4093: transformDefineClause - 27th variable error
> - nodeWindowAgg.c:5623: get_slots - null_slot for PREV at partition first
> row
> 
> Low Risk (25 items) - Defensive code / Unreachable:
> - nodeWindowAgg.c:3074: attno_map_walker - Parser validates arg count
> - nodeWindowAgg.c:3137-3138: rpr_navigation_walker - switch default
> - nodeWindowAgg.c:3566: window_gettupleslot - Before mark position
> - nodeWindowAgg.c:4289: WinGetFuncArgInFrame - isout flag
> - nodeWindowAgg.c:4393: WinGetSlotInFrame - Boundary check
> - nodeWindowAgg.c:4618-4619: row_is_in_reduced_frame - switch default
> - nodeWindowAgg.c:4697: register_reduced_frame_map - pos < 0 check
> - nodeWindowAgg.c:5007: search_str_set - NULL return continue
> - nodeWindowAgg.c:5405: do_pattern_match - Regex compile error
> - nodeWindowAgg.c:5435,5437-5438: do_pattern_match - Regex exec error
> - nodeWindowAgg.c:5700: pattern_initial - Variable not found
> - nodeWindowAgg.c:5776: string_set_get - Index range check
> - nodeWindowAgg.c:5850: variable_pos_register - a-z range check
> - nodeWindowAgg.c:5910: variable_pos_fetch - a-z range check
> - nodeWindowAgg.c:5913: variable_pos_fetch - Index range check
> - parse_clause.c:3989: transformDefineClause - A_Expr type check
> - parse_clause.c:4145: transformPatternClause - A_Expr type check
> - ruleutils.c:6904-6908: get_rule_windowspec - SKIP TO NEXT ROW output
> 
> Conclusion: Most uncovered code consists of defensive error handling or
> code unreachable due to parser pre-validation. No security or functional
> risk.
> 
> 
> 5. COMMIT ANALYSIS
> ------------------
> 
> 8 sequential commits:
> 
> Commit  Area            Files   +/-         Key Content
> -------------------------------------------------------------------------------
> 1       Raw Parser      4       +174/-16    gram.y grammar (PATTERN/DEFINE)
> 2       Parse/Analysis  4       +277/-1     parse_clause.c analysis logic
> 3       Rewriter        1       +109/-0     pg_get_viewdef extension
> 4       Planner         5       +73/-3      WindowAgg plan node extension
> 5       Executor        4       +1,942/-11  CORE: Pattern matching engine
> (+1,850)
> 6       Docs            3       +192/-7     advanced.sgml, func-window.sgml
> 7       Tests           3       +1,585/-1   rpr.sql (532), rpr.out (1,052)
> 8       typedefs        1       +6/-0       pgindent support
> 
> Code Change Statistics:
> - Total files: 25
> - Lines added: 4,358
> - Lines deleted: 39
> - Net increase: +4,319 lines
> 
> 
> 6. RECOMMENDATIONS
> ------------------
> 
> 6.1 Combinatorial Explosion Prevention (Major, Required)
> 
> Add work_mem-based memory limit for StringSet allocation.
> Location: string_set_add() in nodeWindowAgg.c:5741-5750
> Consistent with existing PostgreSQL approach (Hash Join, Sort, etc.)
> 
> 6.2 Code Review Required (Minor, Decision Needed)
> 
> Location: nodeWindowAgg.c:5849,5909,5912
> Issue: pos > NUM_ALPHABETS check intent unclear
> Current: PATTERN with 28 elements causes error
> Question: Is 27 element limit intentional?
> 
> 6.3 Code Style Fixes (Minor)
> 
> - #ifdef RPR_DEBUG: Use elog(DEBUG2, ...) or remove (25 blocks)
> - static keyword on separate line: 26 functions to fix
> - Whitespace: "regexp !=NULL" -> "regexp != NULL"
> - Error message case: "Unrecognized" -> "unrecognized"
> 
> 6.4 Documentation Fixes (Minor)
> 
> - select.sgml: "maximu" -> "maximum"
> 
> 6.5 Test Additions (Optional)
> 
> Black-box Tests (Functional):
> 
> Feature              Test Case                                    Priority
> -------------------------------------------------------------------------------
> Variable limit       26 variables success, 27 error               Medium
> NULL boundary        PREV at partition first row                  Medium
> 
> White-box Tests (Coverage) - covered by above black-box tests:
> 
> File:Line                          Target Branch
> -------------------------------------------------------------------------------
> parse_clause.c:4093                Limit error branch (Variable limit test)
> nodeWindowAgg.c:5623               null_slot usage (NULL boundary test)
> 
> 
> 7. CONCLUSION
> -------------
> 
> Test Quality: GOOD
> 
> Core functionality is thoroughly tested with comprehensive error case
> coverage.
> 
> The patch is well-implemented within its defined scope. Identified issues
> include
> one major concern (combinatorial explosion) and minor style/documentation
> items.
> 
> Recommended actions before commit:
> 1. Add work_mem-based memory limit for pattern combinations (required)
> 2. Clarify pos > NUM_ALPHABETS check intent (review needed)
> 3. Fix code style issues (optional but recommended)
> 4. Fix documentation typo (trivial)
> 5. Add tests for variable limit and NULL handling (optional)
> 
> Points for discussion (optional):
> 6. Incremental matching with streaming NFA redesign
> 
> 
> Attachment:
> - coverage.tgz (gcov HTML report, RPR-modified code only)
> 
> 
> ---
> End of Report
> 
> 2025년 9월 24일 (수) PM 7:36, Tatsuo Ishii <ishii@postgresql.org>님이 작성:
> 
>> Attached are the v33 patches for Row pattern recognition.  The
>> difference from v32 is that the raw parse tree printing patch is not
>> included in v33. This is because now that we have
>> debug_print_raw_parse.
>>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS K.K.
>> English: http://www.sraoss.co.jp/index_en/
>> Japanese:http://www.sraoss.co.jp
>>

Re: Row pattern recognition

От

Henson Choi

Дата:

02 января, 05:30:31

Hi Ishii-san,

Glad the review was useful!

2026년 1월 2일 (금) AM 11:16, Tatsuo Ishii <ishii@postgresql.org>님이 작성:

Hi Henson,

Thank you for the review! I will take care the issues (it will take
sometime).

If you have other patches that need pre-analysis (coverage, style, valgrind, etc.), feel free to ask anytime. I enjoy this kind of work.

Best regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp

> SQL/RPR Patch Review Report
> ============================
>
> Patch: Row Pattern Recognition (SQL:2016)
> Commitfest: https://commitfest.postgresql.org/patch/4460
>
> Review Methodology:
> This review focused on quality assessment, not line-by-line code audit.
> Key code paths and quality issues were examined with surrounding context
> when concerns arose. Documentation files were reviewed with AI-assisted
> grammar and typo checking. Code coverage was measured using gcov and
> custom analysis tools.
>
> Limitations:
> - Not a security audit or formal verification
> - Parser and executor internals reviewed at module level, not exhaustively
> - Performance characteristics not benchmarked
>
>
> TABLE OF CONTENTS
> -----------------
>
> 1. Executive Summary
> 2. Issues Found
> 2.1 Critical / Major
> 2.2 Minor Issues (Review Needed)
> 2.3 Minor Issues (Style)
> 2.4 Suggestions for Discussion
> 3. Test Coverage
> 3.1 Covered Areas
> 3.2 Untested Items
> 3.3 Unimplemented Features (No Test Needed)
> 4. Code Coverage Analysis
> 4.1 Overall Coverage
> 4.2 Coverage by File
> 4.3 Uncovered Code Risk Assessment
> 5. Commit Analysis
> 6. Recommendations
> 7. Conclusion
>
>
> 1. EXECUTIVE SUMMARY
> --------------------
>
> Overall assessment: GOOD
>
> The SQL/RPR patch demonstrates solid implementation quality within its
> defined scope. Code follows PostgreSQL coding standards (with minor style
> issues), test coverage is comprehensive at 96.4%, and documentation is
> thorough with only minor typos.
>
> Rating by Area:
> - Code Quality: Good (PostgreSQL style compliant, 26 static style fixes
> recommended)
> - Test Coverage: Good (96.4% line coverage, 28 test cases)
> - Documentation: Good (Complete, 1 minor typo)
> - Build/Regress: Pass (make check-world, 753 tests passed)
>
>
> 2. ISSUES FOUND
> ---------------
>
> 2.1 Critical / Major
>
> #1 [Code] Greedy pattern combinatorial explosion
> Pattern: PATTERN (A+ B+ C+) with DEFINE A AS TRUE, B AS TRUE, C AS TRUE
> Impact: Memory exhaustion or infinite wait
> Recommendation: Add work_mem-based memory limit (error on exceed)
>
> Evidence - No memory limit in current code:
> - nodeWindowAgg.c:5718-5722 string_set_init(): palloc() unconditional
> - nodeWindowAgg.c:5741-5750 string_set_add(): set_size *= 2; repalloc()
> unlimited
> - nodeWindowAgg.c:5095-5174 generate_patterns(): Triple loop, no limit
>
> Only work_mem usage in RPR (nodeWindowAgg.c:1297):
> winstate->buffer = tuplestore_begin_heap(false, false, work_mem);
> -> For tuple buffer, unrelated to pattern combination memory (StringSet)
>
> 2.2 Minor Issues (Review Needed)
>
> #1 [Code] nodeWindowAgg.c:5849,5909,5912
> pos > NUM_ALPHABETS check - intent unclear, causes error at 28 PATTERN
> elements
>
> Reproduction:
> - PATTERN (A B C ... Z A) (27 elements) -> OK
> - PATTERN (A B C ... Z A B) (28 elements) -> ERROR "initial is not valid
> char: b"
>
> 2.3 Minor Issues (Style)
>
> #1 [Code] nodeWindowAgg.c (25 blocks)
> #ifdef RPR_DEBUG -> recommend elog(DEBUG2, ...) or remove
>
> #2 [Code] src/backend/executor/nodeWindowAgg.c
> static keyword on separate line (26 functions)
>
> #3 [Code] src/backend/utils/adt/ruleutils.c
> Whitespace typo: "regexp !=NULL" -> "regexp != NULL"
>
> #4 [Code] nodeWindowAgg.c:4619
> Error message case: "Unrecognized" -> "unrecognized" (PostgreSQL style)
>
> #5 [Doc] doc/src/sgml/ref/select.sgml:1128
> Typo: "maximu" -> "maximum"
>
> 2.4 Suggestions for Discussion
>
> #1 Incremental matching with streaming NFA redesign
> Benefits:
> - O(n) time complexity per row (current: exponential in worst case)
> - Bounded memory via state merging and context absorption
> - Natural extension for OR patterns, {n,m} quantifiers, nested groups
> - Enables MEASURES clause with incremental aggregates
> - Proven approach in CEP engines (Flink, Esper)
>
>
> 3. TEST COVERAGE
> ----------------
>
> 3.1 Covered Areas
>
> - PATTERN clause: +, *, ? quantifiers (line 41, 71, 86)
> - DEFINE clause: Variable conditions, auto-TRUE for missing (line 120-133)
> - PREV/NEXT functions: Single argument (line 44, 173)
> - AFTER MATCH SKIP: TO NEXT ROW (line 182), PAST LAST ROW (line 198)
> - Aggregate integration: sum, avg, count, max, min (line 258-277)
> - Window function integration: first_value, last_value, nth_value (line
> 34-35)
> - JOIN/CTE: JOIN (line 313), WITH (line 324)
> - View: VIEW creation, pg_get_viewdef (line 390-406)
> - Error cases: 7 error scenarios (line 409-532)
> - Large partition: 5000 row smoke test (line 360-387)
> - ROWS BETWEEN offset: CURRENT ROW AND 2 FOLLOWING (line 244)
>
> 3.2 Untested Items
>
> Feature Priority Recommendation
> -------------------------------------------------------------------------------
> 26 variable limit Medium Test 26 variables success, 27th
> variable error
> NULL value handling Low Test PREV(col) where col or previous
> row is NULL
>
> Sample test for 26 variable limit:
>
> -- Should fail with 27th variable (parser error, no table needed)
> SELECT * FROM (SELECT 1 AS x) t
> WINDOW w AS (
> ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
> PATTERN (a b c d e f g h i j k l m n o p q r s t u v w x y z aa)
> DEFINE a AS TRUE, b AS TRUE, c AS TRUE, d AS TRUE, e AS TRUE, f AS
> TRUE,
> g AS TRUE, h AS TRUE, i AS TRUE, j AS TRUE, k AS TRUE, l AS
> TRUE,
> m AS TRUE, n AS TRUE, o AS TRUE, p AS TRUE, q AS TRUE, r AS
> TRUE,
> s AS TRUE, t AS TRUE, u AS TRUE, v AS TRUE, w AS TRUE, x AS
> TRUE,
> y AS TRUE, z AS TRUE, aa AS TRUE
> );
> -- ERROR: number of row pattern definition variable names exceeds 26
>
> Sample test for NULL handling:
>
> CREATE TEMP TABLE stock_null (company TEXT, tdate DATE, price INTEGER);
> INSERT INTO stock_null VALUES ('c1', '2023-07-01', 100);
> INSERT INTO stock_null VALUES ('c1', '2023-07-02', NULL); -- NULL in
> middle
> INSERT INTO stock_null VALUES ('c1', '2023-07-03', 200);
> INSERT INTO stock_null VALUES ('c1', '2023-07-04', 150);
>
> SELECT company, tdate, price, count(*) OVER w AS match_count
> FROM stock_null
> WINDOW w AS (
> PARTITION BY company
> ORDER BY tdate
> ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
> PATTERN (START UP DOWN)
> DEFINE START AS TRUE, UP AS price > PREV(price), DOWN AS price <
> PREV(price)
> );
> -- Result: all rows show match_count = 0 (NULL breaks pattern matching)
>
> 3.3 Unimplemented Features (No Test Needed)
>
> Per documentation (select.sgml:1133-1136):
> - MEASURES clause: Not implemented
> - SUBSET clause: Not implemented
> - AFTER MATCH variants: Only SKIP TO NEXT ROW, PAST LAST ROW supported
>
> Not in documentation, verified by testing:
> - {n,m} quantifier: Parser error ("syntax error at or near {")
> - (A|B) OR pattern: Not supported (parsed as single variable name "a|b")
> - (A B)+ compound repetition: Parser accepts, but execution returns 0
> matches
>
>
> 4. CODE COVERAGE ANALYSIS
> -------------------------
>
> 4.1 Overall Coverage
>
> 96.4% (732 / 759 lines)
>
> 4.2 Coverage by File (major RPR-modified files)
>
> nodeWindowAgg.c: 96.6% (560/580) - Pattern matching execution engine
> parse_clause.c: 96.7% (88/91) - PATTERN/DEFINE analysis
> ruleutils.c: 93.1% (54/58) - pg_get_viewdef output
>
> 4.3 Uncovered Code Risk Assessment
>
> Total: 27 lines uncovered
>
> Medium Risk (2 items) - Test addition recommended (see section 3.2):
> - parse_clause.c:4093: transformDefineClause - 27th variable error
> - nodeWindowAgg.c:5623: get_slots - null_slot for PREV at partition first
> row
>
> Low Risk (25 items) - Defensive code / Unreachable:
> - nodeWindowAgg.c:3074: attno_map_walker - Parser validates arg count
> - nodeWindowAgg.c:3137-3138: rpr_navigation_walker - switch default
> - nodeWindowAgg.c:3566: window_gettupleslot - Before mark position
> - nodeWindowAgg.c:4289: WinGetFuncArgInFrame - isout flag
> - nodeWindowAgg.c:4393: WinGetSlotInFrame - Boundary check
> - nodeWindowAgg.c:4618-4619: row_is_in_reduced_frame - switch default
> - nodeWindowAgg.c:4697: register_reduced_frame_map - pos < 0 check
> - nodeWindowAgg.c:5007: search_str_set - NULL return continue
> - nodeWindowAgg.c:5405: do_pattern_match - Regex compile error
> - nodeWindowAgg.c:5435,5437-5438: do_pattern_match - Regex exec error
> - nodeWindowAgg.c:5700: pattern_initial - Variable not found
> - nodeWindowAgg.c:5776: string_set_get - Index range check
> - nodeWindowAgg.c:5850: variable_pos_register - a-z range check
> - nodeWindowAgg.c:5910: variable_pos_fetch - a-z range check
> - nodeWindowAgg.c:5913: variable_pos_fetch - Index range check
> - parse_clause.c:3989: transformDefineClause - A_Expr type check
> - parse_clause.c:4145: transformPatternClause - A_Expr type check
> - ruleutils.c:6904-6908: get_rule_windowspec - SKIP TO NEXT ROW output
>
> Conclusion: Most uncovered code consists of defensive error handling or
> code unreachable due to parser pre-validation. No security or functional
> risk.
>
>
> 5. COMMIT ANALYSIS
> ------------------
>
> 8 sequential commits:
>
> Commit Area Files +/- Key Content
> -------------------------------------------------------------------------------
> 1 Raw Parser 4 +174/-16 gram.y grammar (PATTERN/DEFINE)
> 2 Parse/Analysis 4 +277/-1 parse_clause.c analysis logic
> 3 Rewriter 1 +109/-0 pg_get_viewdef extension
> 4 Planner 5 +73/-3 WindowAgg plan node extension
> 5 Executor 4 +1,942/-11 CORE: Pattern matching engine
> (+1,850)
> 6 Docs 3 +192/-7 advanced.sgml, func-window.sgml
> 7 Tests 3 +1,585/-1 rpr.sql (532), rpr.out (1,052)
> 8 typedefs 1 +6/-0 pgindent support
>
> Code Change Statistics:
> - Total files: 25
> - Lines added: 4,358
> - Lines deleted: 39
> - Net increase: +4,319 lines
>
>
> 6. RECOMMENDATIONS
> ------------------
>
> 6.1 Combinatorial Explosion Prevention (Major, Required)
>
> Add work_mem-based memory limit for StringSet allocation.
> Location: string_set_add() in nodeWindowAgg.c:5741-5750
> Consistent with existing PostgreSQL approach (Hash Join, Sort, etc.)
>
> 6.2 Code Review Required (Minor, Decision Needed)
>
> Location: nodeWindowAgg.c:5849,5909,5912
> Issue: pos > NUM_ALPHABETS check intent unclear
> Current: PATTERN with 28 elements causes error
> Question: Is 27 element limit intentional?
>
> 6.3 Code Style Fixes (Minor)
>
> - #ifdef RPR_DEBUG: Use elog(DEBUG2, ...) or remove (25 blocks)
> - static keyword on separate line: 26 functions to fix
> - Whitespace: "regexp !=NULL" -> "regexp != NULL"
> - Error message case: "Unrecognized" -> "unrecognized"
>
> 6.4 Documentation Fixes (Minor)
>
> - select.sgml: "maximu" -> "maximum"
>
> 6.5 Test Additions (Optional)
>
> Black-box Tests (Functional):
>
> Feature Test Case Priority
> -------------------------------------------------------------------------------
> Variable limit 26 variables success, 27 error Medium
> NULL boundary PREV at partition first row Medium
>
> White-box Tests (Coverage) - covered by above black-box tests:
>
> File:Line Target Branch
> -------------------------------------------------------------------------------
> parse_clause.c:4093 Limit error branch (Variable limit test)
> nodeWindowAgg.c:5623 null_slot usage (NULL boundary test)
>
>
> 7. CONCLUSION
> -------------
>
> Test Quality: GOOD
>
> Core functionality is thoroughly tested with comprehensive error case
> coverage.
>
> The patch is well-implemented within its defined scope. Identified issues
> include
> one major concern (combinatorial explosion) and minor style/documentation
> items.
>
> Recommended actions before commit:
> 1. Add work_mem-based memory limit for pattern combinations (required)
> 2. Clarify pos > NUM_ALPHABETS check intent (review needed)
> 3. Fix code style issues (optional but recommended)
> 4. Fix documentation typo (trivial)
> 5. Add tests for variable limit and NULL handling (optional)
>
> Points for discussion (optional):
> 6. Incremental matching with streaming NFA redesign
>
>
> Attachment:
> - coverage.tgz (gcov HTML report, RPR-modified code only)
>
>
> ---
> End of Report
>
> 2025년 9월 24일 (수) PM 7:36, Tatsuo Ishii <ishii@postgresql.org>님이 작성:
>
>> Attached are the v33 patches for Row pattern recognition. The
>> difference from v32 is that the raw parse tree printing patch is not
>> included in v33. This is because now that we have
>> debug_print_raw_parse.
>>
>> Best regards,
>> --
>> Tatsuo Ishii
>> SRA OSS K.K.
>> English: http://www.sraoss.co.jp/index_en/
>> Japanese:http://www.sraoss.co.jp
>>

Best regards,

Henson

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Re: Row pattern recognition

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения

Вложения