Re: Pathological regexp match
From | Michael Glaesemann |
---|---|
Subject | Re: Pathological regexp match |
Date | |
Msg-id | B53EF768-0C04-4D75-BE6A-1BC2934E0500@myyearbook.com |
In reply to | Re: Pathological regexp match (Alvaro Herrera <alvherre@commandprompt.com>) |
Responses |
Re: Pathological regexp match
Re: Pathological regexp match |
List | pgsql-hackers |
On Jan 28, 2010, at 21:59, Alvaro Herrera wrote:

> Hi Michael,
>
> Michael Glaesemann wrote:
>> We came across a regexp that takes very much longer than expected.
>>
>> PostgreSQL 8.4.1 on x86_64-unknown-linux-gnu, compiled by GCC gcc
>> (GCC) 4.1.2 20080704 (Red Hat 4.1.2-44), 64-bit
>>
>> SELECT 'ooo...' ~ $r$Z(Q)[^Q]*A.*?(\1)$r$; -- omitted for email
>> brevity
>
> The ? after .* is pointless.

Interesting. I would expect that *? would be the non-greedy version of *, meaning match up to the first \1 (in this case the first Q following A), rather than as much as possible.

For example, in Perl:

    $ perl -e " if ('oooZQoooAoooQooQooQooo' =~ /Z(Q)[^Q]*A.*(\1)/) { print \$&; } else { print 'NO'; }" && echo
    ZQoooAoooQooQooQ

    $ perl -e " if ('oooZQoooAoooQooQooQooo' =~ /Z(Q)[^Q]*A.*?(\1)/) { print \$&; } else { print 'NO'; }" && echo
    ZQoooAoooQ

If I'm reading the docs right, Postgres does support non-greedy * as *?:

<http://www.postgresql.org/docs/8.4/interactive/functions-matching.html#POSIX-QUANTIFIERS-TABLE>

However, as you point out, Postgres doesn't appear to take this into account:

    postgres=# select regexp_replace('oooZQoooAoooQooQooQooo', $r$(Z(Q)[^Q]*A.*(\2))$r$, $s$X$s$);
     regexp_replace
    ----------------
     oooXooo
    (1 row)

    postgres=# select regexp_replace('oooZQoooAoooQooQooQooo', $r$(Z(Q)[^Q]*A.*?(\2))$r$, $s$X$s$);
     regexp_replace
    ----------------
     oooXooo
    (1 row)

Michael Glaesemann
michael.glaesemann@myyearbook.com
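
For readers without Perl at hand, the greedy versus non-greedy comparison from the Perl one-liners can be reproduced with a minimal sketch in Python, whose `re` module follows Perl-style semantics for `*` and `*?`. This is an illustration only and not part of the original message: the helper name `matched_text` is hypothetical, and the sketch says nothing about how the PostgreSQL regex engine decides greediness.

```python
import re

# Same subject string as in the Perl and Postgres examples.
SUBJECT = "oooZQoooAoooQooQooQooo"


def matched_text(pattern, subject=SUBJECT):
    """Return the text matched by pattern in subject, or 'NO' if no match.

    Hypothetical helper that mirrors the Perl one-liners, which print $&
    on success and 'NO' otherwise.
    """
    m = re.search(pattern, subject)
    return m.group(0) if m else "NO"


# Greedy .* runs to the last Q that still satisfies the backreference \1.
greedy = matched_text(r"Z(Q)[^Q]*A.*(\1)")

# Non-greedy .*? stops at the first Q following A.
lazy = matched_text(r"Z(Q)[^Q]*A.*?(\1)")

print(greedy)  # ZQoooAoooQooQooQ
print(lazy)    # ZQoooAoooQ
```

Both results agree with the Perl output quoted in the message, which is the behavior the non-greedy `*?` was expected to show in Postgres as well.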