Re: Another regexp performance improvement: skip useless paren-captures

From: Tom Lane
Subject: Re: Another regexp performance improvement: skip useless paren-captures
Msg-id: 3581981.1628544376@sss.pgh.pa.us
In reply to: Re: Another regexp performance improvement: skip useless paren-captures  (Mark Dilger <mark.dilger@enterprisedb.com>)
Responses: Re: Another regexp performance improvement: skip useless paren-captures  (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-hackers

Mark Dilger <mark.dilger@enterprisedb.com> writes:
> I can still trigger the old bug for which we thought we'd pushed a fix.
> The test case below crashes on master
> (e12694523e7e4482a052236f12d3d8b58be9a22c), and also on the fixed version
> "Make regexp engine's backref-related compilation state more bulletproof."
> (cb76fbd7ec87e44b3c53165d68dc2747f7e26a9a).

> Can you test if it crashes for you, too?  I'm not sure I see why this one fails when millions of others pass.

> The backtrace is still complaining about regc_nfa.c:1265:

> +select regexp_split_to_array('', '(?:((?:q+))){0}(\1){0,0}?*[^]');
> +server closed the connection unexpectedly

Hmmm ... yeah, I see it too.  This points up something I'd wondered
about before, which is whether the code that "cancels everything"
after detecting {0} is really OK.  It throws away the outer subre
*and children* without worrying about what might be inside, and
here we see that that's not good enough --- there's still a v->subs
pointer to the first capturing paren set, which we just deleted,
so that the \1 later on messes up.  I'm not sure why the back
branches are managing not to crash, but that might just be a memory
management artifact.
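
To illustrate the hazard described above, here is a minimal, self-contained
sketch. It is not the PostgreSQL regex code and not the actual fix: struct
node, struct cstate, and every function name below are hypothetical
stand-ins for the subre tree and the v->subs table. It shows one possible
way to keep a capture table from pointing at a node that a {0} quantifier
has caused to be thrown away, namely clearing the table entry while freeing
the subtree.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_SUBS 10

/* Hypothetical stand-in for a regex parse-tree node (a "subre"). */
struct node {
    int          capno;   /* capture group number, 0 if non-capturing */
    struct node *child;   /* a single child is enough for this sketch */
};

/* Hypothetical stand-in for compile state with a capture table (like v->subs). */
struct cstate {
    struct node *subs[MAX_SUBS];  /* subs[n] = node of capture group n, or NULL */
};

/* Allocate a node and, if it captures, register it in the capture table. */
static struct node *new_node(struct cstate *cs, int capno, struct node *child)
{
    struct node *n = malloc(sizeof *n);

    if (n == NULL)
        abort();
    n->capno = capno;
    n->child = child;
    if (capno > 0 && capno < MAX_SUBS)
        cs->subs[capno] = n;
    return n;
}

/*
 * Free a subtree. Before freeing each node, clear any capture-table entry
 * that points at it, so no dangling pointer survives the deletion.
 */
static void free_subtree(struct cstate *cs, struct node *n)
{
    if (n == NULL)
        return;
    free_subtree(cs, n->child);
    if (n->capno > 0 && n->capno < MAX_SUBS && cs->subs[n->capno] == n)
        cs->subs[n->capno] = NULL;
    free(n);
}

/* Returns 1 if a backref \n still has a live capture group to refer to. */
static int backref_target_exists(const struct cstate *cs, int n)
{
    return n > 0 && n < MAX_SUBS && cs->subs[n] != NULL;
}

/* Mimics the shape (?:(q+)){0}\1: build the group, discard it, then look up \1. */
static int demo_backref_after_zero_quantifier(void)
{
    struct cstate cs = {{0}};
    struct node  *inner = new_node(&cs, 1, NULL);   /* capturing paren 1 */
    struct node  *outer = new_node(&cs, 0, inner);  /* non-capturing wrapper */
    int           before = backref_target_exists(&cs, 1);
    int           after;

    free_subtree(&cs, outer);   /* what "cancel everything" after {0} amounts to */
    after = backref_target_exists(&cs, 1);

    /* 1 if the group was visible before deletion and safely absent afterwards */
    return before == 1 && after == 0;
}
```

If free_subtree() omitted the table cleanup, cs.subs[1] would still hold the
address of freed memory, and a later \1 lookup would dereference it; whether
that crashes or silently appears to work depends on what the allocator does
with the freed block.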

            regards, tom lane


