Re: REGEXP_MATCHES() strange behavior with '^' and '$' pattern
From | Jeevan Chalke |
---|---|
Subject | Re: REGEXP_MATCHES() strange behavior with '^' and '$' pattern |
Date | |
Msg-id | CAM2+6=WtoQkTv_vyTNrDw-PC=XheFz8KK1Ng6udS+nK8JAfMCg@mail.gmail.com |
In reply to | REGEXP_MATCHES() strange behavior with '^' and '$' pattern (Jeevan Chalke <jeevan.chalke@enterprisedb.com>) |
Responses | Re: REGEXP_MATCHES() strange behavior with '^' and '$' pattern |
List | pgsql-hackers |
Oops, forgot the patch.
Attached now.
--
Jeevan B Chalke
On Wed, Jul 31, 2013 at 6:03 PM, Jeevan Chalke <jeevan.chalke@enterprisedb.com> wrote:
Hi,
While playing with regular expressions I found some strange behavior in the
regexp_matches() function.
Consider the following SQL query and its output:
postgres=# select regexp_matches('1' || chr(10) || '2' || chr(10) || '3' || chr(10) || '4', '^', 'mg');
regexp_matches
----------------
{""}
{""}
{""}
{""}
{""}
{""}
{""}
(7 rows)
It is supposed to return 4 rows, one per line start, and not 7. The pattern
'$' shows similar behavior.
It seems that the start and end anchors are not being matched correctly,
or rather that they are being matched twice.
To find the root cause, I put elog(INFO, ...) calls into the
setup_regexp_matches() function, where we copy the matches into the struct,
and found the following values:
postgres=# select regexp_matches('1' || chr(10) || '2' || chr(10) || '3' || chr(10) || '4', '^', 'mg');
INFO: start_search: 0 rm_so: 0 rm_eo: 0
INFO: updated start_search: 1
INFO: start_search: 1 rm_so: 2 rm_eo: 2
INFO: updated start_search: 2
INFO: start_search: 2 rm_so: 2 rm_eo: 2
INFO: updated start_search: 3
INFO: start_search: 3 rm_so: 4 rm_eo: 4
INFO: updated start_search: 4
INFO: start_search: 4 rm_so: 4 rm_eo: 4
INFO: updated start_search: 5
INFO: start_search: 5 rm_so: 6 rm_eo: 6
INFO: updated start_search: 6
INFO: start_search: 6 rm_so: 6 rm_eo: 6
INFO: updated start_search: 7
Certainly, after the second pass the updated start_search should be 3, since
the last match was at offset 2 and was of zero length (rm_so = rm_eo).
I have modified that logic to look similar to that of the
replace_text_regexp() function, since regexp_replace works well.
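
To make the difference between the two advance rules concrete, here is a minimal sketch that replays the loop outside the server. It is not the patch and not PostgreSQL code: Python's re module stands in for the regex engine, the function names are hypothetical, and the "old" rule is only inferred from the INFO trace above.

```python
import re


def match_starts(text, pattern, advance):
    """Repeatedly search from start_search and record where each match starts."""
    regex = re.compile(pattern, re.MULTILINE)
    starts = []
    start_search = 0
    while start_search <= len(text):
        m = regex.search(text, start_search)
        if m is None:
            break
        starts.append(m.start())
        start_search = advance(start_search, m.start(), m.end())
    return starts


def advance_inferred_from_trace(start_search, rm_so, rm_eo):
    # Assumption: rule reconstructed from the trace. Jump to the match end
    # if that moves forward, otherwise step ahead by one character.
    return rm_eo if start_search < rm_eo else start_search + 1


def advance_past_empty_match(start_search, rm_so, rm_eo):
    # Restart at the match end; a zero-length match needs one extra
    # character, or the same empty match is found again.
    return rm_eo + 1 if rm_so == rm_eo else rm_eo


text = "1\n2\n3\n4"
old = match_starts(text, "^", advance_inferred_from_trace)
new = match_starts(text, "^", advance_past_empty_match)
print(old)  # [0, 2, 2, 4, 4, 6, 6]
print(new)  # [0, 2, 4, 6]
```

With the inferred rule, a search starting at 1 finds the empty match at 2 and then restarts at 2, so the same match is reported twice. Keying the extra step on rm_so = rm_eo instead of on start_search avoids that.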
I have attached a patch with a test case. Please have a look and let me know
if I have assumed something wrong.
Thanks
--
Jeevan B Chalke