Re: BUG #19341: REPLACE() fails to match final character when using nondeterministic ICU collation
| From | Tom Lane |
|---|---|
| Subject | Re: BUG #19341: REPLACE() fails to match final character when using nondeterministic ICU collation |
| Date | |
| Msg-id | 640025.1764774727@sss.pgh.pa.us |
| In response to | Re: BUG #19341: REPLACE() fails to match final character when using nondeterministic ICU collation (Laurenz Albe <laurenz.albe@cybertec.at>) |
| Responses | Re: BUG #19341: REPLACE() fails to match final character when using nondeterministic ICU collation |
| List | pgsql-bugs |

Laurenz Albe <laurenz.albe@cybertec.at> writes:
> On Tue, 2025-12-02 at 15:53 -0500, Tom Lane wrote:
>> Looking at the code overall, I wonder if the outer loop doesn't have
>> the same issue. The comments claim that we should be able to handle
>> zero-length matches, but if the overall haystack is of length zero,
>> we will fail to check for such a match.
> If you can find zero-length matches at all, you could find a
> zero-length match in a non-empty haystack. Perhaps the function is
> never called with an empty haystack...

After further thought, it seems to me that this comment is an
unjustified extrapolation from what Peter actually said, which was
that the match substring could be physically shorter than the needle.
Which is certainly true, for instance case-folding or accent-stripping
might shorten the string. But it doesn't follow that a nonempty
needle could ever match an empty substring; and that does not seem
like it could be sane behavior to me. We're considering string
comparison here, not regexes.

We do require callers to eliminate the empty-needle case, and given
that I think we could assume that match substrings must be at least
1 byte long. That assumption is what justifies the current API for
these functions, and perhaps we can also simplify this loop by
using it.

regards, tom lane
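
Editor's illustration: the point above, that a match can be physically shorter than the needle yet never empty, can be sketched with a toy model. This is not the PostgreSQL code. The names `fold` and `find_match` are hypothetical, and stripping case and accents with Python's `unicodedata` is only an assumed stand-in for a nondeterministic ICU collation.

```python
import unicodedata


def fold(s: str) -> str:
    """Toy comparison key (an assumption, not ICU): ignore case and accents."""
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def find_match(haystack: str, needle: str):
    """Return (start, end) of the first substring comparing equal to needle, or None."""
    if not needle:
        # Mirrors the stated requirement that callers eliminate the empty needle.
        raise ValueError("empty needle must be handled by the caller")
    target = fold(needle)
    for start in range(len(haystack)):
        # Candidate substrings are always at least one character long,
        # so an empty haystack simply yields no match.
        for end in range(start + 1, len(haystack) + 1):
            if fold(haystack[start:end]) == target:
                return start, end
    return None


if __name__ == "__main__":
    needle = "\u00c9"  # precomposed capital E with acute accent, 2 bytes in UTF-8
    span = find_match("cafe", needle)
    matched = "cafe"[span[0]:span[1]]
    print(span, len(needle.encode("utf-8")), len(matched.encode("utf-8")))
```

Here the needle occupies two bytes in UTF-8 while the matched substring, the final `e` of the haystack, occupies one. The inner loop starts at `start + 1`, which encodes the assumption that a nonempty needle never matches an empty substring.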