Re: Future of our regular expression code
From | Jay Levitt |
---|---|
Subject | Re: Future of our regular expression code |
Date | |
Msg-id | 4F41E39B.8010502@gmail.com |
In response to | Re: Future of our regular expression code (Stephen Frost <sfrost@snowman.net>) |
Responses |
Re: Future of our regular expression code
|
List | pgsql-hackers |
Stephen Frost wrote:
> Alright, I'll bite.. Which existing regexp implementation that's well
> written, well maintained, and which is well protected against malicious
> regexes should we be considering then?

FWIW, there's a benchmark here that compares a number of regexp engines, including PCRE, TRE and Russ Cox's RE2:

http://lh3lh3.users.sourceforge.net/reb.shtml

The fastest backtracking-style engine seems to be oniguruma, which is native to Ruby 1.9 and thus not only supports Unicode but I'd bet performs pretty well on it, on account of it's developed in Japan. But it goes pathological on regexen containing '|'; the only safe choice among PCRE-style engines is RE2, but of course that doesn't support backreferences.

Russ's page on re2 (http://code.google.com/p/re2/) says:

"If you absolutely need backreferences and generalized assertions, then RE2 is not for you, but you might be interested in irregexp, Google Chrome's regular expression engine."

That's here:

http://blog.chromium.org/2009/02/irregexp-google-chromes-new-regexp.html

Sadly, it's in Javascript. Seems like if you need a safe, performant regexp implementation, your choice is (a) finish PLv8 and support it on all platforms, or (b) add backreferences to RE2 and precompile it to C with Comeau (if that's still around), or...

Jay