Thread: BUG #15046: non-greedy ignored

BUG #15046: non-greedy ignored

From: PG Bug reporting form
The following bug has been logged on the website:

Bug reference:      15046
Logged by:          Bob Gailer
Email address:      bgailer@gmail.com
PostgreSQL version: 10.1
Operating system:   windows 10
Description:

I start psql; enter:

postgres=# select regexp_replace('a(d)s(e)f', '\(.*?\)', '', 'g');
 regexp_replace
----------------
 asf
(1 row)

Works as expected. Then I add |q to the pattern, and the .*? becomes
greedy!

postgres=# select regexp_replace('a(d)s(e)f', '\(.*?\)|q', '', 'g');
 regexp_replace
----------------
 af
(1 row)


Re: BUG #15046: non-greedy ignored

From: "David G. Johnston"
On Friday, February 2, 2018, PG Bug reporting form <noreply@postgresql.org> wrote:
> Works as expected. Then I add |q to the pattern, and the .*? becomes
> greedy!
>
> postgres=# select regexp_replace('a(d)s(e)f', '\(.*?\)|q', '', 'g');
>  regexp_replace
> ----------------
>  af
> (1 row)


This seems to be explained by the final greediness rule:

  • An RE consisting of two or more branches connected by the | operator is always greedy.

David J.

Re: BUG #15046: non-greedy ignored

From: Tom Lane
"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Friday, February 2, 2018, PG Bug reporting form <noreply@postgresql.org>
> wrote:
>> Works as expected. Then I add |q to the pattern, and the .*? becomes
>> greedy!

> This seems to be explained by the final greediness rule:
> https://www.postgresql.org/docs/10/static/functions-matching.html#POSIX-MATCHING-RULES
>    An RE consisting of two or more branches connected by the | operator is
>    always greedy.

Yeah.  That subsection also contains some useful advice about how to
control greediness decisions --- in this case, wrapping the whole
thing with (...){1,1}? might do what you want.

The short answer, perhaps, is that non-greedy patterns are not
standardized by POSIX and you shouldn't expect that all regex
engines do them the same way.  Ours is definitely different
from Perl's, for example.

            regards, tom lane
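
For readers landing on this thread, the advice above can be sketched in SQL. The first statement applies the {1,1}? wrapper Tom mentions, written here with a non-capturing group (?:...) so that no extra capture is introduced. Going by the documented greediness rules, this should make the RE as a whole non-greedy, but it has not been verified against the reporter's 10.1 installation. The second statement is an alternative that nobody in the thread proposed: it replaces .*? with a negated bracket expression, so the result no longer depends on greediness at all. It assumes the parentheses in the input are not nested.

```sql
-- Workaround suggested in the thread: wrap the whole alternation in a
-- non-greedy {1,1}? quantifier so the RE as a whole prefers the shortest match.
select regexp_replace('a(d)s(e)f', '(?:\(.*?\)|q){1,1}?', '', 'g');

-- Alternative (an assumption, not from the thread): match "anything but a
-- closing parenthesis" instead of relying on a non-greedy quantifier.
select regexp_replace('a(d)s(e)f', '\([^)]*\)|q', '', 'g');
```

Both statements can be pasted into psql as-is. With the default standard_conforming_strings = on, the backslashes in the pattern are passed through to the regex engine unchanged.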


Re: BUG #15046: non-greedy ignored

From: Bob Gailer

Thanks! Rtfp, eh?


On Feb 2, 2018 8:48 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:
> [...]