Re: BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present)
From | David G. Johnston |
---|---|
Subject | Re: BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present) |
Date | |
Msg-id | CAKFQuwbn0nYSQL99rn=WSsfKYrSra5cd3GiQ3iH_rnHHGic1_g@mail.gmail.com |
In response to | Re: BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: BUG #13538: REGEX non-greedy is working incorrectly (and also greedy matches fail if non-greedy is present) |
List | pgsql-bugs |
On Tue, Aug 4, 2015 at 8:39 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I wrote:
> > As David says, these examples appear to be following what's stated in
> > http://www.postgresql.org/docs/9.4/static/functions-matching.html#POSIX-MATCHING-RULES
> > The Spencer regex engine we use has a notion of greediness or
> > non-greediness of the entire regex, and further that that takes precedence
> > for determining the overall match length over greediness of individual
> > subexpressions. That behavior might be inconvenient for this particular
> > use-case, but that doesn't make it a bug.
>
> BTW, perhaps it would be worth adding an example to that section that
> shows how to control this behavior. The trick is obvious once you've seen
> it, but not so much otherwise: you add something to the start of the regex
> that establishes the overall greediness you want, but can never actually
> match any characters. "\0*" or "\0*?" will work fine in Postgres
> use-cases since there can never be a NUL character in the data.
>
> regression=# select regexp_matches('abc01234xyz', '(.*)(\d+)(.*)');
>  regexp_matches
> -----------------
>  {abc0123,4,xyz}
> (1 row)
>
> regression=# select regexp_matches('abc01234xyz', '(.*?)(\d+)(.*)');
>  regexp_matches
> ----------------
>  {abc,0,""}
> (1 row)
>
> regression=# select regexp_matches('abc01234xyz', '\0*(.*?)(\d+)(.*)');
>  regexp_matches
> -----------------
>  {abc,01234,xyz}
> (1 row)

+1

David J.
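
For readers following along, the rule quoted above can also be seen in isolation with a shorter pattern. The following is a minimal sketch, assuming a PostgreSQL session with the default (advanced) regex flavor. The `XY1234Z` input and the `(?:...){1,1}` wrapper do not come from this message: they follow the examples in the PostgreSQL pattern-matching documentation, where a `{m,n}` quantifier with m equal to n is defined as greedy and so makes the regex as a whole greedy without matching anything extra.

```sql
-- The first quantified atom (Y*) is greedy, so the whole regex is greedy
-- and the overall match is as long as possible.
SELECT substring('XY1234Z', 'Y*([0-9]{1,3})');
-- Result: 123

-- The first quantified atom (Y*?) is non-greedy, so the whole regex is
-- non-greedy and the overall match is as short as possible.
SELECT substring('XY1234Z', 'Y*?([0-9]{1,3})');
-- Result: 1

-- Alternative to a leading "\0*" (assumption: taken from the documentation,
-- not from this thread): wrap the regex in a non-capturing group with a
-- {1,1} quantifier. The overall match becomes greedy, while (.*?) stays
-- non-greedy within it.
SELECT regexp_matches('abc01234xyz', '(?:(.*?)(\d+)(.*)){1,1}');
-- Result: {abc,01234,xyz}
```

Either approach works by fixing the greediness of the regex as a whole before the individual subexpressions are considered, so the choice between them is mostly a matter of readability.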