REGEXP_MATCHES() strange behavior with '^' and '$' pattern
From | Jeevan Chalke |
---|---|
Subject | REGEXP_MATCHES() strange behavior with '^' and '$' pattern |
Date | |
Msg-id | CAM2+6=U6_WxwoDn=UOL7PdadRyAZYV6QLox-FJJwrkTEZS5RJg@mail.gmail.com |
Responses | Re: REGEXP_MATCHES() strange behavior with '^' and '$' pattern |
List | pgsql-hackers |
Hi,

While playing with regular expressions, I found some strange behavior in the
regexp_matches() function.

Consider the following SQL query and its output:

    postgres=# select regexp_matches('1' || chr(10) || '2' || chr(10) || '3' || chr(10) || '4', '^', 'mg');
     regexp_matches
    ----------------
     {""}
     {""}
     {""}
     {""}
     {""}
     {""}
     {""}
    (7 rows)

It is supposed to return 4 rows, not 7. Similar behavior is found with the
pattern '$'.

It seems that these start and end anchors are not matching correctly. Or
rather, they are matching twice.

To find the root cause, I put elog(INFO,..) calls into the
setup_regexp_matches() function, where we copy matches into the struct, and
found the following values:

    postgres=# select regexp_matches('1' || chr(10) || '2' || chr(10) || '3' || chr(10) || '4', '^', 'mg');
    INFO: start_search: 0 rm_so: 0 rm_eo: 0
    INFO: updated start_search: 1
    INFO: start_search: 1 rm_so: 2 rm_eo: 2
    INFO: updated start_search: 2
    INFO: start_search: 2 rm_so: 2 rm_eo: 2
    INFO: updated start_search: 3
    INFO: start_search: 3 rm_so: 4 rm_eo: 4
    INFO: updated start_search: 4
    INFO: start_search: 4 rm_so: 4 rm_eo: 4
    INFO: updated start_search: 5
    INFO: start_search: 5 rm_so: 6 rm_eo: 6
    INFO: updated start_search: 6
    INFO: start_search: 6 rm_so: 6 rm_eo: 6
    INFO: updated start_search: 7

Certainly, after the second pass, the updated start_search should be 3, as the
last match was at position 2 and of zero length (so = eo). Instead it is set
to 2, so the next pass finds the same empty match again.

I have modified that logic to look similar to that of the
replace_text_regexp() function, since regexp_replace works well.

Attached is a patch with a test case. Please have a look and let me know if I
assumed something wrong.

Thanks

-- 
Jeevan B Chalke
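
The start_search trace in the message can be reproduced outside the server with a small simulation. The sketch below is in Python and uses Python's own `re` engine, not PostgreSQL's; the names `scan`, `advance_logged` and `advance_skip_empty` are hypothetical. `advance_logged` is only an assumption about an advance rule that is consistent with the INFO lines above, not a quotation of setup_regexp_matches(), and `advance_skip_empty` is one possible reading of "step past a zero-length match", not the attached patch.

```python
import re

# Same input as in the message: '1', '2', '3', '4' separated by newlines.
TEXT = "1\n2\n3\n4"
# '^' with multi-line semantics, standing in for the 'm' flag in the query.
LINE_START = re.compile(r"^", re.MULTILINE)


def advance_logged(start_search, rm_so, rm_eo):
    """Assumed rule matching the logged values: jump to the end of the
    match, or move one character if the search position is already there."""
    if start_search < rm_eo:
        return rm_eo
    return start_search + 1


def advance_skip_empty(start_search, rm_so, rm_eo):
    """Alternative rule: continue from the end of the match, and step one
    character further when the match was empty (rm_so == rm_eo)."""
    if rm_so == rm_eo:
        return rm_eo + 1
    return rm_eo


def scan(text, pattern, advance):
    """Collect (rm_so, rm_eo) pairs for repeated searches over text."""
    matches = []
    start_search = 0
    while start_search <= len(text):
        m = pattern.search(text, start_search)
        if m is None:
            break
        matches.append((m.start(), m.end()))
        start_search = advance(start_search, m.start(), m.end())
    return matches


logged = scan(TEXT, LINE_START, advance_logged)
skipping = scan(TEXT, LINE_START, advance_skip_empty)
print(len(logged), logged)      # 7 matches: offsets 2, 4 and 6 appear twice
print(len(skipping), skipping)  # 4 matches: one per line start
```

With the first rule, every empty match found ahead of the search position is found a second time on the next pass, which gives the 7 rows; stepping past an empty match gives the expected 4.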