Re: Our regex vs. POSIX on "longest match"
From | Robert Haas |
---|---|
Subject | Re: Our regex vs. POSIX on "longest match" |
Date | |
Msg-id | CA+TgmoZ7n3Nh3DwDULDdX7Yt=MMfqpmmf+1JAznPt1N+gB=5Eg@mail.gmail.com |
In reply to | Re: Our regex vs. POSIX on "longest match" (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Mon, Mar 5, 2012 at 11:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> I think the right way to imagine this is as though the regular
>> expression were being matched to the source text in left-to-right
>> fashion.
>
> No, it isn't. You are headed down the garden path that leads to a
> Perl-style definition-by-implementation, and in particular you are going
> to end up with an implementation that fails to satisfy the POSIX
> standard. POSIX requires an *overall longest* match (at least for cases
> where all quantifiers are greedy), and that sometimes means that the
> quantifiers can't be processed strictly left-to-right greedy. An
> example of this is
>
> regression=# select substring('aaaaaabab' from '(a*(ab)*)');
>  substring
> -----------
>  aaaaaabab
> (1 row)
>
> If the a* is allowed to match as much as it wants, the (ab)* will not be
> able to match at all, and then you fail to find the longest possible
> overall match.

Oh. Right.

> I suspect that it is possible to construct similar cases where, for an
> all-non-greedy pattern, finding the overall shortest match sometimes
> requires that individual quantifiers eat more than the local minimum.
> I've not absorbed enough caffeine yet this morning to produce an example
> though.

Probably true.

I guess, then, that the issue here is that there isn't really any
principled way to decide whether the RE overall should be greedy or
non-greedy. And similarly with every sub-RE. The problem with the
"non-greedy" quantifiers is that they apply only to the quantified bit
specifically, which leaves us guessing as to the user's intent with
regards to everything else.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
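
For anyone following along, here is a rough sketch of the difference Tom describes, using his pattern and input. It is written in Python rather than SQL only because Python's `re` module is a backtracking, Perl-style engine, so it shows the left-to-right greedy result directly. The helper names `leftmost_greedy` and `overall_longest` are made up for this illustration, and the brute-force prefix search is just a stand-in for the POSIX rule, not how PostgreSQL's regex engine actually works.

```python
import re

PATTERN = r"(a*(ab)*)"
TEXT = "aaaaaabab"


def leftmost_greedy(pattern, text):
    """Match at position 0 the way a backtracking engine does:
    each quantifier takes as much as it can, left to right."""
    m = re.match(pattern, text)
    return m.group(0) if m else None


def overall_longest(pattern, text):
    """Brute-force the longest prefix of text that the whole pattern
    can match (illustrative stand-in for the POSIX longest-match rule,
    restricted to matches that start at position 0)."""
    for end in range(len(text), -1, -1):
        if re.fullmatch(pattern, text[:end]):
            return text[:end]
    return None


# a* swallows all six leading a's, so (ab)* matches zero times.
print(leftmost_greedy(PATTERN, TEXT))   # aaaaaa

# Letting a* give back one 'a' allows (ab)* to match twice.
print(overall_longest(PATTERN, TEXT))   # aaaaaabab
```

The second result agrees with the `substring` output quoted above: the overall longest match is only reachable when `a*` takes five of the six leading `a` characters instead of all of them.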