Обсуждение: Latest patches break one of our unit-test, related to RLS
In the same vein as v18betas breaking our unit-tests... (we still don't understand, haven't looked much though). OK, above it's about a major upgrade. So that's one thing. But now, we've ascertain that a particular test is: OK with 16.9 and 17.5 (we cannot test on beta2 anymore) KO with 16.10 and 17.6 (and beta3 too, released at the same time) Which for a minor patch is a surprise to us, not in a good way. So... Did anything change around RLS in this patch series? Anything changed around ROLE INHERIT rules? The symptom is that a LOGIN user lacks ROLEs is used to have. So perhaps some GRANTs are now silently doing nothing, for some unknown reasons. What in these patches could be affecting those areas? Thanks, --DD
On 9/4/25 08:03, Dominique Devienne wrote: > In the same vein as v18betas breaking our unit-tests... > (we still don't understand, haven't looked much though). > > OK, above it's about a major upgrade. So that's one thing. > > But now, we've ascertain that a particular test is: > > OK with 16.9 and 17.5 (we cannot test on beta2 anymore) > KO with 16.10 and 17.6 (and beta3 too, released at the same time) How did you move from one version to the other? > > Which for a minor patch is a surprise to us, not in a good way. > > So... Did anything change around RLS in this patch series? > Anything changed around ROLE INHERIT rules? > > The symptom is that a LOGIN user lacks ROLEs is used to have. > So perhaps some GRANTs are now silently doing nothing, > for some unknown reasons. What are the GRANTs that are done to give the user the privileges? > > What in these patches could be affecting those areas? > > Thanks, --DD > > -- Adrian Klaver adrian.klaver@aklaver.com
On Thu, 2025-09-04 at 17:03 +0200, Dominique Devienne wrote: > OK with 16.9 and 17.5 (we cannot test on beta2 anymore) > KO with 16.10 and 17.6 (and beta3 too, released at the same time) > > Which for a minor patch is a surprise to us, not in a good way. > > So... Did anything change around RLS in this patch series? > Anything changed around ROLE INHERIT rules? > > The symptom is that a LOGIN user lacks ROLEs is used to have. > So perhaps some GRANTs are now silently doing nothing, > for some unknown reasons. > > What in these patches could be affecting those areas? Without a reproducer, that is a wild goose chase. Yours, Laurenz Albe
On Thu, Sep 4, 2025 at 5:18 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > On Thu, 2025-09-04 at 17:03 +0200, Dominique Devienne wrote: > > OK with 16.9 and 17.5 (we cannot test on beta2 anymore) > > KO with 16.10 and 17.6 (and beta3 too, released at the same time) > > > > Which for a minor patch is a surprise to us, not in a good way. > > > > So... Did anything change around RLS in this patch series? > > Anything changed around ROLE INHERIT rules? > > > > The symptom is that a LOGIN user lacks ROLEs is used to have. > > So perhaps some GRANTs are now silently doing nothing, > > for some unknown reasons. > > > > What in these patches could be affecting those areas? > > Without a reproducer, that is a wild goose chase. Oh, it's easy to reproduce here... But distilling it down is no small feat. But this is a patch. Patches are not supposed to break things.
On Thu, Sep 4, 2025 at 5:12 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote: > On 9/4/25 08:03, Dominique Devienne wrote: > > But now, we've ascertain that a particular test is: > > OK with 16.9 and 17.5 (we cannot test on beta2 anymore) > > KO with 16.10 and 17.6 (and beta3 too, released at the same time) > How did you move from one version to the other? We first upgraded in place. Broke that test. We then re-installed fresh servers, minus-1 the latest patches. And verified the tests ran OK again,on these. We have several clusters on that machine.
On Thu, Sep 4, 2025 at 5:31 PM Dominique Devienne <ddevienne@gmail.com> wrote: > On Thu, Sep 4, 2025 at 5:12 PM Adrian Klaver <adrian.klaver@aklaver.com> wrote: > > On 9/4/25 08:03, Dominique Devienne wrote: > > > But now, we've ascertain that a particular test is: > > > OK with 16.9 and 17.5 (we cannot test on beta2 anymore) > > > KO with 16.10 and 17.6 (and beta3 too, released at the same time) > > > How did you move from one version to the other? > > We first upgraded in place. Broke that test. We're still naive I guess. And didn't expect a minor upgrade to break us... Given this mishap, we'll do the reverse from now on. Install fresh clusters for the new versions (minor or not), and see what happens. So we don't "advance" our main/primary (testing) clusters until "issues are resolved".
On Fri, Sep 5, 2025 at 1:10 AM Ron Johnson <ronljohnsonjr@gmail.com> wrote: > Dumb question: did you read the release notes? Well, now I have. Carefully. Nothing stands out. There's RLS mentioned, but for indexing only. So no smoking gun. We'll have to investigate... :( > In our case, 17.6 "pg_dump --schema-only" broke in the sense that the md5sum of yesterday's schema dump is different fromtoday's schema dump. Puzzled, I ran "pg_dump --schema-only" twice in a row, but the md5sum was different. > "diff" showed why. Searching the release notes showed me what changed. Thanks for the heads up. We don't depend on pg_dump at all. We have our own COPY BINARY based custom backup/restore to semi-structured SQLite DBs (part relational, part per-row COPY blobs).
On Thu, 2025-09-04 at 17:28 +0200, Dominique Devienne wrote: > But this is a patch. Patches are not supposed to break things. Right. But the PostgreSQL developers aren't perfect either. If you are interested in getting this fixed, make the problem reproducible for us. Yours, Laurenz Albe
On Thu, 2025-09-04 at 17:28 +0200, Dominique Devienne wrote: > But this is a patch. Patches are not supposed to break things. Right. But the PostgreSQL developers aren't perfect either. If you are interested in getting this fixed, make the problem reproducible for us. Yours, Laurenz Albe
On Thu, Sep 4, 2025 at 5:03 PM Dominique Devienne <ddevienne@gmail.com> wrote: > OK with 16.9 and 17.5 (we cannot test on beta2 anymore) > KO with 16.10 and 17.6 (and beta3 too, released at the same time) I've tracked down the regression to this particular query, FWIW: select rolname, rolsuper, rolinherit, rolcreaterole, rolcreatedb, rolcanlogin, rolreplication, rolbypassrls, oid, shobj_description(oid, 'pg_authid') from pg_roles where rolname SIMILAR TO $1 AND pg_has_role(oid, 'SET') order by rolname In 17.5, returns 3 rows. In 17.6, returns 0 rows. I've used a libpq trace on both, diff'd them, and it's the 1st significant difference. Given my troubles with roles, I immediately imagined a change in pg_has_role(). But turns out, it's SIMILAR TO that changed. See trace extract below for both versions. On 17.6 On the two queries below, the first is the real one, (modulo some mild renaming, to Acme and FOO) and the second is one where I replaced the [\d\w] with a _ The correct answer for the 1st should be (7 rows). This particular char I test on, can be / and : but I want to avoid those entries, thus [\d\w] So, it this a regression? A bug fix, and my pattern is somehow wrong? If it's not a bug/regression, what do you suggest we use instead? If it is a bug, any chance it might be in the upcoming v18 release (and associated earlier version patches???) I think I've found the smoking gun. Haven't verified whether our troubles with v18 pre-releases is related. Thanks, --DD PS: From those 7 rows, pg_has_role() is supposed to narrow it down to 3. acme=> select rolname from pg_roles where rolname similar to 'Acme-FOO:8n8igcOH[\d\w]_____________:%' order by 1; rolname --------- (0 rows) acme=> select rolname from pg_roles where rolname similar to 'Acme-FOO:8n8igcOH______________:%' order by 1; rolname ------------------------------------------- ... (14 rows) And the same as above, on 17.5: acme=> select rolname from pg_roles where rolname similar to 'Acme-FOO:8vjqDaeT[\d\w]_____________:%' order by 1; rolname ---------------------------------------- ... (7 rows) acme=> select rolname from pg_roles where rolname similar to 'Acme-FOO:8vjqDaeT______________:%' order by 1; rolname ------------------------------------------- ... (14 rows) --------------- llibpq traces ------------------- 17.5 2025-09-12 13:50:32.267733 B 5 ReadyForQuery I 2025-09-12 13:50:32.267772 F 270 Parse "" " select rolname, rolsuper, rolinherit, rolcreaterole, rolcreatedb, rolcanlogin, rolreplication, rolbypassrls, oid, shobj_description(oid, 'pg_authid') from pg_roles where rolname SIMILAR TO $1 AND pg_has_role(oid, 'SET') order by rolname " 1 25 2025-09-12 13:50:32.267775 F 58 Bind "" "" 1 1 1 38 'Acme-FOO:8vjqDaeT[\d\w]_____________:%' 1 1 2025-09-12 13:50:32.267777 F 6 Describe P "" 2025-09-12 13:50:32.267779 F 9 Execute "" 0 2025-09-12 13:50:32.267780 F 4 Sync 2025-09-12 13:50:32.271148 B 4 ParseComplete 2025-09-12 13:50:32.271161 B 4 BindComplete 2025-09-12 13:50:32.271165 B 302 RowDescription 10 "rolname" 12000 1 19 64 -1 1 "rolsuper" 12000 2 16 1 -1 1 "rolinherit" 12000 3 16 1 -1 1 "rolcreaterole" 12000 4 16 1 -1 1 "rolcreatedb" 12000 5 16 1 -1 1 "rolcanlogin" 12000 6 16 1 -1 1 "rolreplication" 12000 7 16 1 -1 1 "rolbypassrls" 12000 11 16 1 -1 1 "oid" 12000 13 26 4 -1 1 "shobj_description" 0 0 25 65535 -1 1 ... 2025-09-12 13:50:32.271200 B 13 CommandComplete "SELECT 3" in 17.6 2025-09-12 13:50:52.512043 B 5 ReadyForQuery I 2025-09-12 13:50:52.512082 F 270 Parse "" " select rolname, rolsuper, rolinherit, rolcreaterole, rolcreatedb, rolcanlogin, rolreplication, rolbypassrls, oid, shobj_description(oid, 'pg_authid') from pg_roles where rolname SIMILAR TO $1 AND pg_has_role(oid, 'SET') order by rolname " 1 25 2025-09-12 13:50:52.512085 F 58 Bind "" "" 1 1 1 38 'Acme-FOO:8n8igcOH[\d\w]_____________:%' 1 1 2025-09-12 13:50:52.512088 F 6 Describe P "" 2025-09-12 13:50:52.512089 F 9 Execute "" 0 2025-09-12 13:50:52.512091 F 4 Sync 2025-09-12 13:50:52.540088 B 4 ParseComplete 2025-09-12 13:50:52.540104 B 4 BindComplete 2025-09-12 13:50:52.540109 B 302 RowDescription 10 "rolname" 12000 1 19 64 -1 1 "rolsuper" 12000 2 16 1 -1 1 "rolinherit" 12000 3 16 1 -1 1 "rolcreaterole" 12000 4 16 1 -1 1 "rolcreatedb" 12000 5 16 1 -1 1 "rolcanlogin" 12000 6 16 1 -1 1 "rolreplication" 12000 7 16 1 -1 1 "rolbypassrls" 12000 11 16 1 -1 1 "oid" 12000 13 26 4 -1 1 "shobj_description" 0 0 25 65535 -1 1 2025-09-12 13:50:52.540119 B 13 CommandComplete "SELECT 0"
On Fri, 2025-09-12 at 14:22 +0200, Dominique Devienne wrote: > On Thu, Sep 4, 2025 at 5:03 PM Dominique Devienne <ddevienne@gmail.com> wrote: > > OK with 16.9 and 17.5 (we cannot test on beta2 anymore) > > KO with 16.10 and 17.6 (and beta3 too, released at the same time) > > I've tracked down the regression to this particular query, FWIW: > > select rolname, rolsuper, rolinherit, rolcreaterole, > rolcreatedb, rolcanlogin, rolreplication, rolbypassrls, > oid, shobj_description(oid, 'pg_authid') > from pg_roles > where rolname SIMILAR TO $1 AND pg_has_role(oid, 'SET') > order by rolname > > In 17.5, returns 3 rows. > In 17.6, returns 0 rows. That must be commit e3ffc3e91d. That commit fixed a bug in the conversion from SIMILAR TO expressions to POSIX regular expressions. You don't show us that data that match the pattern in 17.5, but not in 17.6. Unless you show us a counterexample, I'd say that the behavior in 17.6 is correct. Minor releases shouldn't change the behavior EXCEPT when the behavior is buggy. Yours, Laurenz Albe
On Fri, Sep 12, 2025 at 2:45 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > On Fri, 2025-09-12 at 14:22 +0200, Dominique Devienne wrote: > > On Thu, Sep 4, 2025 at 5:03 PM Dominique Devienne <ddevienne@gmail.com> wrote: > > > OK with 16.9 and 17.5 (we cannot test on beta2 anymore) > > > KO with 16.10 and 17.6 (and beta3 too, released at the same time) > > > > I've tracked down the regression to this particular query, FWIW: > > > > select rolname, rolsuper, rolinherit, rolcreaterole, > > rolcreatedb, rolcanlogin, rolreplication, rolbypassrls, > > oid, shobj_description(oid, 'pg_authid') > > from pg_roles > > where rolname SIMILAR TO $1 AND pg_has_role(oid, 'SET') > > order by rolname > > > > In 17.5, returns 3 rows. > > In 17.6, returns 0 rows. > > That must be commit e3ffc3e91d. > > That commit fixed a bug in the conversion from SIMILAR TO > expressions to POSIX regular expressions. > > You don't show us that data that match the pattern in 17.5, but > not in 17.6. Unless you show us a counterexample, I'd say that > the behavior in 17.6 is correct. > > Minor releases shouldn't change the behavior EXCEPT when the > behavior is buggy. Can't get any simpler than the repro below, can it? So is this buggy or not? Clearly, there's a change in behavior. I tend to call a change in behavior as a regression myself :) But if someone can explain to me how what used to work was incorrect, compared to the documented behavior, I'm willing to change it of course. Any takers? --DD postgres=# show server_version; server_version ---------------- 16.9 (1 row) postgres=# with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) select v from t where v similar to 'foo[\d\w]_%'; v --------- foo0bar (1 row) postgres=# \c - - - 5416; psql (17.6, server 16.10) SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off, ALPN: none) You are now connected to database "postgres" as user "postgres" on host "sr-pau-db" (address "10.65.53.13") at port "5416". postgres=# show server_version; server_version ---------------- 16.10 (1 row) postgres=# with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) select v from t where v similar to 'foo[\d\w]_%'; v --- (0 rows) postgres=# \c - - - 5475 psql (17.6, server 17.5) You are now connected to database "postgres" as user "postgres" on host "sr-pau-db" (address "10.65.53.13") at port "5475". postgres=# show server_version; server_version ---------------- 17.5 (1 row) postgres=# with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) select v from t where v similar to 'foo[\d\w]_%'; v --------- foo0bar (1 row) postgres=# \c - - - 5417 SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off, ALPN: postgresql) You are now connected to database "postgres" as user "postgres" on host "sr-pau-db" (address "10.65.53.13") at port "5417". postgres=# show server_version; server_version ---------------- 17.6 (1 row) postgres=# with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) select v from t where v similar to 'foo[\d\w]_%'; v --- (0 rows) postgres=# \c - - - 5481 psql (17.6, server 18rc1) WARNING: psql major version 17, server major version 18. Some psql features might not work. You are now connected to database "postgres" as user "postgres" on host "sr-pau-db" (address "10.65.53.13") at port "5481". postgres=# show server_version; server_version ---------------- 18rc1 (1 row) postgres=# with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) select v from t where v similar to 'foo[\d\w]_%'; v --- (0 rows) postgres=#
On Fri, Sep 12, 2025 at 3:11 PM Dominique Devienne <ddevienne@gmail.com> wrote: > On Fri, Sep 12, 2025 at 2:45 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > You don't show us that data that match the pattern in 17.5, but > > not in 17.6. Unless you show us a counterexample, I'd say that > > the behavior in 17.6 is correct. I've reread https://www.postgresql.org/docs/current/functions-matching.html#FUNCTIONS-SIMILARTO-REGEXP and especially: > According to the SQL standard, omitting ESCAPE means there is no escape character (rather than defaulting to a backslash),and a zero-length ESCAPE value is disallowed. PostgreSQL's behavior in this regard is therefore slightly nonstandard. and also > Another nonstandard extension is that following the escape character with a letter or digit provides access to the escapesequences defined for POSIX regular expressions; see Table 9.20, Table 9.21, and Table 9.22 below. Table 9.21. Regular Expression Class-Shorthand Escapes \d matches any digit, like [[:digit:]] \w matches any word character, like [[:word:]] So I don't see how my `... where v similar to 'foo[\d\w]_%'` is incorrect. So again, is this a bug / regression or not? Thanks, --DD
On Fri, Sep 12, 2025 at 3:24 PM Dominique Devienne <ddevienne@gmail.com> wrote: > On Fri, Sep 12, 2025 at 3:11 PM Dominique Devienne <ddevienne@gmail.com> wrote: > So I don't see how my `... where v similar to 'foo[\d\w]_%'` is incorrect. > So again, is this a bug / regression or not? Thanks, --DD If I use (x|y) instead of [xy] it seems to behave correctly. Whether x is the full-length POSIX class, or the shorthand notation. This DOES look like a bug, no? I've done regexes for a long time, and these two forms should be equivalent IMHO. --DD postgres=# show server_version; server_version ---------------- 18rc1 (1 row) postgres=# with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) select v from t where v similar to 'foo[\d\w]_%'; v --- (0 rows) postgres=# with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) select v from t where v similar to 'foo[[[:digit:]][[:word:]]]_%'; v --- (0 rows) postgres=# with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) select v from t where v similar to 'foo([[:digit:]]|[[:word:]])_%'; v --------- foo0bar (1 row) postgres=# with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) select v from t where v similar to 'foo(\d|\w)_%'; v --------- foo0bar (1 row)
On Fri, Sep 12, 2025 at 3:29 PM Dominique Devienne <ddevienne@gmail.com> wrote: > On Fri, Sep 12, 2025 at 3:24 PM Dominique Devienne <ddevienne@gmail.com> wrote: > > On Fri, Sep 12, 2025 at 3:11 PM Dominique Devienne <ddevienne@gmail.com> wrote: > > > So I don't see how my `... where v similar to 'foo[\d\w]_%'` is incorrect. > > So again, is this a bug / regression or not? Thanks, --DD > > If I use (x|y) instead of [xy] it seems to behave correctly. > Whether x is the full-length POSIX class, or the shorthand notation. > This DOES look like a bug, no? I've done regexes for a long time, > and these two forms should be equivalent IMHO. --DD > > postgres=# show server_version; > server_version > ---------------- > 18rc1 > (1 row) > > > postgres=# with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) > select v from t where v similar to 'foo[\d\w]_%'; > v > --- > (0 rows) > > > postgres=# with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) > select v from t where v similar to 'foo[[[:digit:]][[:word:]]]_%'; > v > --- > (0 rows) > > > postgres=# with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) > select v from t where v similar to 'foo([[:digit:]]|[[:word:]])_%'; > v > --------- > foo0bar > (1 row) > > > postgres=# with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) > select v from t where v similar to 'foo(\d|\w)_%'; > v > --------- > foo0bar > (1 row) For completeness: postgres=# with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) select v from t where v similar to 'foo[0-9a-zA-Z]_%'; v --------- foo0bar (1 row)
On Fri, Sep 12, 2025 at 9:35 PM Dominique Devienne <ddevienne@gmail.com> wrote: > > On Fri, Sep 12, 2025 at 3:29 PM Dominique Devienne <ddevienne@gmail.com> wrote: > > On Fri, Sep 12, 2025 at 3:24 PM Dominique Devienne <ddevienne@gmail.com> wrote: > > > On Fri, Sep 12, 2025 at 3:11 PM Dominique Devienne <ddevienne@gmail.com> wrote: > > > > > So I don't see how my `... where v similar to 'foo[\d\w]_%'` is incorrect. > > > So again, is this a bug / regression or not? Thanks, --DD > > > > If I use (x|y) instead of [xy] it seems to behave correctly. > > Whether x is the full-length POSIX class, or the shorthand notation. > > This DOES look like a bug, no? I've done regexes for a long time, > > and these two forms should be equivalent IMHO. --DD > > > > postgres=# show server_version; > > server_version > > ---------------- > > 18rc1 > > (1 row) > > > > > > postgres=# with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) > > select v from t where v similar to 'foo[\d\w]_%'; > > v > > --- > > (0 rows) > > > > > > postgres=# with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) > > select v from t where v similar to 'foo[[[:digit:]][[:word:]]]_%'; > > v > > --- > > (0 rows) > > The above two examples are the same, per (Table 9.21. Regular Expression Class-Shorthand Escapes) https://www.postgresql.org/docs/current/functions-matching.html#FUNCTIONS-SIMILARTO-REGEXP my guess why 'foo0bar' will fail for 'foo[[[:digit:]][[:word:]]]_%'; 1. process character 0, it does meet [[:digits]] character class. 2. process character b, it does not meet [[:digits]], then fails, it won't check again whether character b is satisfied with [[:word:]] or not.
On Fri, Sep 12, 2025 at 3:54 PM jian he <jian.universality@gmail.com> wrote: > > > select v from t where v similar to 'foo[\d\w]_%'; > > > select v from t where v similar to 'foo[[[:digit:]][[:word:]]]_%'; > The above two examples are the same, per > (Table 9.21. Regular Expression Class-Shorthand Escapes) Of course. > my guess why 'foo0bar' will fail for 'foo[[[:digit:]][[:word:]]]_%'; > 1. process character 0, it does meet [[:digits]] character class. > 2. process character b, it does not meet [[:digits]], then fails, > it won't check again whether character b is satisfied with [[:word:]] or not. Then you don't know what [...] means in regexes I'm afraid. --DD
Dominique Devienne <ddevienne@gmail.com> writes: >> This DOES look like a bug, no? I've done regexes for a long time, >> and these two forms should be equivalent IMHO. --DD Yeah, I agree it's busted. You can use EXPLAIN VERBOSE to see the translated-to-POSIX pattern, and it's wrong: regression=# explain verbose with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) select v from t where v similar to 'foo[\d\w]_%'; QUERY PLAN -------------------------------------------------------------- Values Scan on "*VALUES*" (cost=0.00..0.05 rows=1 width=32) Output: "*VALUES*".column1 Filter: ("*VALUES*".column1 ~ '^(?:foo[\d\w]_%)$'::text) (3 rows) The _ and % are not getting converted to their POSIX equivalents ("." and ".*"). Your other example still does that correctly: regression=# explain verbose with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) select v from t where v similar to 'foo[0-9a-zA-Z]_%'; QUERY PLAN ------------------------------------------------------------------ Values Scan on "*VALUES*" (cost=0.00..0.05 rows=1 width=32) Output: "*VALUES*".column1 Filter: ("*VALUES*".column1 ~ '^(?:foo[0-9a-zA-Z]..*)$'::text) (3 rows) So e3ffc3e91 was at least one brick shy of a load. regards, tom lane
On Fri, Sep 12, 2025 at 4:07 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Dominique Devienne <ddevienne@gmail.com> writes: > >> This DOES look like a bug, no? I've done regexes for a long time, > >> and these two forms should be equivalent IMHO. --DD > > Yeah, I agree it's busted. You can use EXPLAIN VERBOSE to see the > translated-to-POSIX pattern, and it's wrong. Thanks for confirming Tom. And teaching me about that EXPLAIN VERBOSE trick. I've worked-around that regression in our code, going to (x|y) instead. What's weird is that those are still followed by _ and %, just like [xy], so it's as-if seeing [\d\w], it stops converting the pattern... Weird. This misadventure kinda tells me I should maybe give up on SIMILAR TO and just use ~, to bypass that conversion-to-POSIX... --DD
On 2025-Sep-12, Tom Lane wrote: > Dominique Devienne <ddevienne@gmail.com> writes: > >> This DOES look like a bug, no? I've done regexes for a long time, > >> and these two forms should be equivalent IMHO. --DD > > Yeah, I agree it's busted. You can use EXPLAIN VERBOSE to see the > translated-to-POSIX pattern, and it's wrong: And now we also have https://postgr.es/m/41a37137-f8bb-8fc5-2948-31b528f166dc@bfw-online.de CC'ed the committer here. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "La victoria es para quien se atreve a estar solo"
On Fri, 2025-09-12 at 10:07 -0400, Tom Lane wrote: > Dominique Devienne <ddevienne@gmail.com> writes: > > > This DOES look like a bug, no? I've done regexes for a long time, > > > and these two forms should be equivalent IMHO. --DD > > Yeah, I agree it's busted. You can use EXPLAIN VERBOSE to see the > translated-to-POSIX pattern, and it's wrong: > > regression=# explain verbose with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar')) > select v from t where v similar to 'foo[\d\w]_%'; > QUERY PLAN > -------------------------------------------------------------- > Values Scan on "*VALUES*" (cost=0.00..0.05 rows=1 width=32) > Output: "*VALUES*".column1 > Filter: ("*VALUES*".column1 ~ '^(?:foo[\d\w]_%)$'::text) > (3 rows) > > The _ and % are not getting converted to their POSIX equivalents > ("." and ".*"). Indeed, and I have to take the blame for introducing a bug in a minor release :^( The attached patch should fix the problem. Yours, Laurenz Albe
Вложения
Laurenz Albe <laurenz.albe@cybertec.at> writes: > On Fri, 2025-09-12 at 10:07 -0400, Tom Lane wrote: >> The _ and % are not getting converted to their POSIX equivalents >> ("." and ".*"). > Indeed, and I have to take the blame for introducing a bug in a minor > release :^( > The attached patch should fix the problem. I had not particularly studied the new charclass-parsing logic. Looking at it now, this bit further down (lines 871ff) looks fishy: if (pchar == ']' && charclass_start > 2) charclass_depth--; else if (pchar == '[') charclass_depth++; /* * If there is a caret right after the opening bracket, it negates * the character class, but a following closing bracket should * still be treated as a normal character. That holds only for * the first caret, so only the values 1 and 2 mean that closing * brackets should be taken literally. */ if (pchar == '^') charclass_start++; else charclass_start = 3; /* definitely past the start */ Should not we be setting charclass_start to 1 after incrementing charclass_depth? That is, I'd be more comfortable if this logic looked like if (pchar == ']' && charclass_start > 2) charclass_depth--; else if (pchar == '[') { /* start of a nested character class */ charclass_depth++; charclass_start = 1; } else if (pchar == '^') charclass_start++; else charclass_start = 3; /* definitely past the start */ I haven't experimented, but it looks like this might misprocess ^ or ] at the start of a nested character class. regards, tom lane
On Fri, 2025-09-12 at 20:12 -0400, Tom Lane wrote: > I had not particularly studied the new charclass-parsing logic. > Looking at it now, this bit further down (lines 871ff) looks > fishy: > > if (pchar == ']' && charclass_start > 2) > charclass_depth--; > else if (pchar == '[') > charclass_depth++; > > /* > * If there is a caret right after the opening bracket, it negates > * the character class, but a following closing bracket should > * still be treated as a normal character. That holds only for > * the first caret, so only the values 1 and 2 mean that closing > * brackets should be taken literally. > */ > if (pchar == '^') > charclass_start++; > else > charclass_start = 3; /* definitely past the start */ > > Should not we be setting charclass_start to 1 after incrementing > charclass_depth? What I call "charclass depth" is misleading, I am afraid. Really, it should be "bracket depth". Only the outermost pair of brackets starts an actual character class. Example: []abc[:digit:]] A caret or closing bracket right after the inner opening bracket wouldn't be a special character, and I think it would never be legal. Unfortunately, this is all pretty complicated. Perhaps s/charclass_depth/bracket_depth/ would be a good idea. Yours, Laurenz Albe
Laurenz Albe <laurenz.albe@cybertec.at> writes: > On Fri, 2025-09-12 at 20:12 -0400, Tom Lane wrote: >> Should not we be setting charclass_start to 1 after incrementing >> charclass_depth? > What I call "charclass depth" is misleading, I am afraid. > Really, it should be "bracket depth". Only the outermost pair of brackets > starts an actual character class. Example: > []abc[:digit:]] > A caret or closing bracket right after the inner opening bracket wouldn't > be a special character, and I think it would never be legal. Ah, got it. But this logic definitely deserves more comments. What do you think of something like if (pchar == ']' && charclass_start > 2) { /* found the real end of a bracket pair */ charclass_depth--; /* past start of outer brackets, so keep charclass_start > 2 */ } else if (pchar == '[') { /* start of a nested bracket pair */ charclass_depth++; /* leading ^ or ] in this context is not special */ charclass_start = 3; } else if (pchar == '^') { /* okay to increment charclass_start even if already > 1 */ charclass_start++; } else { /* otherwise (including case of leading ']') */ charclass_start = 3; /* definitely past the start */ } > Perhaps s/charclass_depth/bracket_depth/ would be a good idea. Wouldn't object to that. Maybe we can think of a better name for "charclass_start" too --- that sounds like a boolean condition. The best I'm coming up with right now is "charclass_count", but that's not that helpful. regards, tom lane
On Fri, 2025-09-12 at 21:53 -0400, Tom Lane wrote: > > Ah, got it. But this logic definitely deserves more comments. > What do you think of something like > > if (pchar == ']' && charclass_start > 2) > { > /* found the real end of a bracket pair */ > charclass_depth--; > /* past start of outer brackets, so keep charclass_start > 2 */ > } > else if (pchar == '[') > { > /* start of a nested bracket pair */ > charclass_depth++; > /* leading ^ or ] in this context is not special */ > charclass_start = 3; > } > else if (pchar == '^') > { > /* okay to increment charclass_start even if already > 1 */ > charclass_start++; > } > else > { > /* otherwise (including case of leading ']') */ > charclass_start = 3; /* definitely past the start */ > } > > > Perhaps s/charclass_depth/bracket_depth/ would be a good idea. > > Wouldn't object to that. Maybe we can think of a better name for > "charclass_start" too --- that sounds like a boolean condition. > The best I'm coming up with right now is "charclass_count", but > that's not that helpful. I came up with the attached patch set. 0001 is the actual bug fix: in addition to my previous patch, I fixed two more cases in which a closing bracket might not be recognized as ending the character class (one is from your patch above). I think that these could not lead to bad query results, but I am not certain. 0002 is the cosmetic improvement: I renamed "charclass_depth" to "bracket_depth" and "bracket_depth" to "bracket_pos", rewrote the "else if" cascade as you suggested above and put some love into additional comments. I used two separate patches for clarity and ease of review, but both should get backpatched. Yours, Laurenz Albe
Вложения
Laurenz Albe <laurenz.albe@cybertec.at> writes: > I came up with the attached patch set. I did some more work on the comments, adjusted a couple of places that could be simplified, and pushed it. > I used two separate patches for clarity and ease of review, but both > should get backpatched. I didn't really love the "fix it and then explain it afterward" approach. It's hard to review a patch if you don't understand the logic. I considered swapping the order of the two patches, but eventually just merged them into one. regards, tom lane
On Sat, 2025-09-13 at 17:00 -0400, Tom Lane wrote: > Laurenz Albe <laurenz.albe@cybertec.at> writes: > > I came up with the attached patch set. > > I did some more work on the comments, adjusted a couple of places that > could be simplified, and pushed it. Thank you! > > I used two separate patches for clarity and ease of review, but both > > should get backpatched. > > I didn't really love the "fix it and then explain it afterward" > approach. It's hard to review a patch if you don't understand the > logic. I considered swapping the order of the two patches, but > eventually just merged them into one. Yes, having the refactoring patch first might have been better. Yours, Laurenz Albe
On Sat, Sep 13, 2025 at 05:00:20PM -0400, Tom Lane wrote: > Laurenz Albe <laurenz.albe@cybertec.at> writes: > > I came up with the attached patch set. > > I did some more work on the comments, adjusted a couple of places that > could be simplified, and pushed it. My apologies for the silence here. The timing of the events is interesting. I have been pinged about this issue on Friday evening/night time here: https://www.postgresql.org/message-id/202509121631.dtn5kw5er2m5@alvherre.pgsql For a fix applied as of cdf7feb96562 roughly 24 hours after this ping. The timing was a bit bad for me, because I was already gone for what was a long weekend in Japan, just back today. So it was a bit hard for me to look at anything > I didn't really love the "fix it and then explain it afterward" > approach. It's hard to review a patch if you don't understand the > logic. I considered swapping the order of the two patches, but > eventually just merged them into one. Merging both things makes sense as well here. Ugh, yes. That was wrong. With the fix, '%' is translated: - Filter: (f1 ~ '^(?:[\a].*)$'::text) Before the fix, not translated. + Filter: (f1 ~ '^(?:[\a]%)$'::text) Thanks for the report, the analysis, and the commit. -- Michael