Re: [E] Re: Regexp_replace bug / does not terminate on long strings
From | Markhof, Ingolf |
---|---|
Subject | Re: [E] Re: Regexp_replace bug / does not terminate on long strings |
Date | |
Msg-id | CALZg0g5VBbzjY3G=TTFM4E8Yd0-vEAQftkQ=8PGkdbwE-TbUHA@mail.gmail.com |
In reply to | Re: [E] Re: Regexp_replace bug / does not terminate on long strings (Francisco Olarte <folarte@peoplecall.com>) |
List | pgsql-general |
You are right, I also found the same behaviour when using, e.g., the UNIX sed command.
Ingolf
On Mon, Aug 23, 2021 at 4:24 PM Francisco Olarte <folarte@peoplecall.com> wrote:
Ingolf:
On Mon, Aug 23, 2021 at 2:39 PM Markhof, Ingolf
<ingolf.markhof@de.verizon.com> wrote:
> Yes, when I use (\1)? instead of (\1)+, the expression is evaluated quickly, but it doesn't return what I want. Once a word is written, it is not subject to matching again, i.e.:
> select regexp_replace( --> remove double entries
> 'one,one,one,two,two,three,three',
> '([^,]+)(,\1)?($|,)',
> '\1\3',
> 'g'
> ) as res;
>
...
> Honestly, this behaviour seems incorrect to me. Once the system replaces the first two 'one,one,' by a single 'one,', I'd expect it to match this replaced 'one,' against the next 'one,' following, replacing these two by another single 'one,', again...
I think your expectation is misguided. All the regexp engines I've
used do it this way: when asked to match "g"lobally, they do
non-overlapping matches; they do not substitute and recurse with the
modified string.
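The non-overlapping behaviour described above can be reproduced outside PostgreSQL. The following is a minimal sketch using Python's re module as a stand-in engine (an assumption: it is not PostgreSQL's regex implementation, though it handles this particular pattern the same way). The helper names replace_once_globally and replace_until_stable are hypothetical and only serve the illustration. The second helper shows that the "rescan the result" semantics has to be requested explicitly, by repeating the global pass until the text stops changing.

```python
import re

# Same pattern and replacement as in the quoted query, written for Python's re.
PATTERN = re.compile(r'([^,]+)(,\1)?($|,)')


def replace_once_globally(text: str) -> str:
    """One global pass: matches are taken left to right and never overlap."""
    return PATTERN.sub(r'\1\3', text)


def replace_until_stable(text: str, max_passes: int = 100) -> str:
    """Hypothetical helper: repeat the global pass until nothing changes.

    max_passes is a safety cap so that a non-converging pattern cannot
    loop forever.
    """
    for _ in range(max_passes):
        new_text = replace_once_globally(text)
        if new_text == text:
            return new_text
        text = new_text
    raise RuntimeError("no fixed point reached within max_passes")


sample = 'one,one,one,two,two,three,three'
print(replace_once_globally(sample))   # one,one,two,three
print(replace_until_stable(sample))    # one,two,three
```

In the single pass, the first match consumes 'one,one,' and the scan resumes after it, so the third 'one,' is matched on its own and survives.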
Also, your way opens the door to runaway or infinite loops
(rr('a','a','aa','g') or rr('a','a','a','g'), not to speak of
rr('x','','','g')). Even a misguided rr(str, '_+','_','g'), sometimes
used to normalize space runs and similar things, can go into a
loop.
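To make the runaway cases concrete, here is a small sketch of what a "substitute and rescan" engine would do with those calls. It again uses Python's re module as a stand-in; the function resubstitute and its max_passes cap are assumptions introduced for illustration, not anything PostgreSQL provides. The loop reapplies a global substitution for as long as the pattern still matches, and the cap is the only thing that stops it.

```python
import re


def resubstitute(text, pattern, replacement, max_passes=5):
    """Hypothetical rescanning substitution.

    Reapplies a global substitution while the pattern still matches the
    result. Returns the final text and the number of passes performed;
    passes == max_passes means the loop was cut off, not that it finished.
    """
    passes = 0
    while re.search(pattern, text) and passes < max_passes:
        text = re.sub(pattern, replacement, text)
        passes += 1
    return text, passes


# 'a' -> 'aa': the text doubles on every pass and never stops matching.
grown, passes = resubstitute('a', 'a', 'aa')
print(len(grown), passes)                 # 32 5

# 'a' -> 'a': nothing changes, but the pattern always matches again.
print(resubstitute('a', 'a', 'a'))        # ('a', 5)

# Empty pattern: matches everywhere, forever.
print(resubstitute('x', '', ''))          # ('x', 5)

# Normalizing runs of '_': done after one pass, yet '_' still matches '_+'.
print(resubstitute('a__b', '_+', '_'))    # ('a_b', 5)
```

Every call above hits the cap. A loop that stops when the text no longer changes would handle the last three cases, but not the first, where the string keeps growing.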
Francisco Olarte.
Verizon Deutschland GmbH - Sebrathweg 20, 44149 Dortmund, Germany - Amtsgericht Dortmund, HRB 14952 - Geschäftsführer: Detlef Eppig - Vorsitzender des Aufsichtsrats: Francesco de Maio