Re: 9.5.3: substring: regex greedy operator not picking up chars as expected
From | Tom Lane |
---|---|
Subject | Re: 9.5.3: substring: regex greedy operator not picking up chars as expected |
Date | |
Msg-id | 4424.1471268506@sss.pgh.pa.us |
In response to | 9.5.3: substring: regex greedy operator not picking up chars as expected ("Foster, Russell" <Russell.Foster@crl.com>) |
List | pgsql-bugs |
"Foster, Russell" <Russell.Foster@crl.com> writes:
> For the following query:
> select substring('>772' from '.*?[0-9]+')
> I would expect the output to be '>772', but it is '>7'.

As David pointed out, that's what you get because the RE as a whole is
considered to be non-greedy, ie you get the shortest overall match.
However, you can adjust that by decorating the RE:

    # select substring('>772' from '(.*?[0-9]+){1,1}');
     substring
    -----------
     >772
    (1 row)

Now it's longest-overall, but the .*? part is still shortest-match,
so it doesn't consume any digits.

However, I suspect that still is not quite what you want, because it
consumes too much in cases like:

    # select substring('>772foo444' from '(.*?[0-9]+){1,1}');
     substring
    ------------
     >772foo444
    (1 row)

There's probably really no way out of that except to be less lazy about
writing the pattern:

    # select substring('>772foo444' from '([^0-9]*?[0-9]+){1,1}');
     substring
    -----------
     >772
    (1 row)

and in that formulation, of course, greediness doesn't really matter
because there is only one way to match.

    # select substring('>772foo444' from '[^0-9]*[0-9]+');
     substring
    -----------
     >772
    (1 row)

See
https://www.postgresql.org/docs/9.5/static/functions-matching.html#POSIX-MATCHING-RULES

regards, tom lane
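
The queries in this message can also be run side by side in a single statement, which makes it easier to see how each pattern behaves. This is only a sketch: the alias t and the column names str and pat are made up for illustration, and the results noted in the comments are the ones reported in this thread for 9.5.3, not independently re-verified output.

```sql
-- Compare the patterns discussed in this thread in one result set.
-- t, str and pat are illustrative names, not anything from the thread.
SELECT str,
       pat,
       substring(str from pat) AS matched
FROM (VALUES
        ('>772',       '.*?[0-9]+'),              -- reported in thread: >7
        ('>772',       '(.*?[0-9]+){1,1}'),       -- reported in thread: >772
        ('>772foo444', '(.*?[0-9]+){1,1}'),       -- reported in thread: >772foo444
        ('>772foo444', '([^0-9]*?[0-9]+){1,1}'),  -- reported in thread: >772
        ('>772foo444', '[^0-9]*[0-9]+')           -- reported in thread: >772
     ) AS t(str, pat);
```

The last row is the form that is easiest to reason about, since it uses no non-greedy quantifier and therefore does not depend on the whole-RE greediness rules.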