Re: Support LIKE with nondeterministic collations

From	Heikki Linnakangas
Subject	Re: Support LIKE with nondeterministic collations
Date	11 November 2024, 16:25:29
Msg-id	bce1da9a-2975-462f-8946-c58c878ba82c@iki.fi
In response to	Re: Support LIKE with nondeterministic collations ("Daniel Verite" <daniel@manitou-mail.org>)
List	pgsql-hackers

On 04/11/2024 10:26, Peter Eisentraut wrote:
> On 29.10.24 18:15, Jacob Champion wrote:
>> libfuzzer is unhappy about the following code in MatchText:
>>
>>> +            while (p1len > 0)
>>> +            {
>>> +                if (*p1 == '\\')
>>> +                {
>>> +                    found_escape = true;
>>> +                    NextByte(p1, p1len);
>>> +                }
>>> +                else if (*p1 == '_' || *p1 == '%')
>>> +                    break;
>>> +                NextByte(p1, p1len);
>>> +            }
>>
>> If the pattern ends with a backslash, we'll call NextByte() twice,
>> p1len will wrap around to INT_MAX, and we'll walk off the end of the
>> buffer. (I fixed it locally by duplicating the ERROR case that's
>> directly above this.)
> 
> Thanks.  Here is an updated patch with that fixed.
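
For reference, the overrun Jacob describes can be avoided by checking the remaining length right after skipping the backslash. The following is only a standalone sketch of that idea, not the code from the updated patch: the function name scan_literal_prefix and the -1 return convention are hypothetical, and plain pointer arithmetic stands in for NextByte().

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Scan a LIKE pattern up to the first unescaped '_' or '%'.
 * Returns the number of bytes scanned, or -1 if the pattern ends with a
 * lone backslash (where the real code would raise an ERROR).
 * Hypothetical helper for illustration only.
 */
static int
scan_literal_prefix(const char *p, int plen, bool *found_escape)
{
    const char *p1 = p;
    int         p1len = plen;

    assert(plen >= 0);
    *found_escape = false;

    while (p1len > 0)
    {
        if (*p1 == '\\')
        {
            *found_escape = true;
            p1++;
            p1len--;
            /* Guard: nothing follows the escape, so do not advance again. */
            if (p1len == 0)
                return -1;
        }
        else if (*p1 == '_' || *p1 == '%')
            break;
        p1++;
        p1len--;
    }
    return (int) (p1 - p);
}
```

Without the guard, the second advance would take p1len below zero, so the loop condition could not be relied on to stop the scan at the end of the buffer.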

Sadly, the algorithm is O(n^2) with non-deterministic collations. Is there 
any way this could be optimized? We make no claims on how expensive any 
functions or operators are, so I suppose a slow implementation is 
nevertheless better than throwing an error.

Let's at least add some CHECK_FOR_INTERRUPTS(). For example, this takes 
a very long time and is uninterruptible:

  SELECT repeat('x', 100000) LIKE '%xxxy%' COLLATE ignore_accents;
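
To illustrate the point, here is a rough standalone sketch of a quadratic substring search with a periodic interrupt check. It is not the patch's MatchText code. It assumes that, because strings of different lengths can compare equal under a non-deterministic collation, every substring of the text has to be tried. CHECK_FOR_INTERRUPTS_STUB() and coll_equal() are hypothetical stand-ins for CHECK_FOR_INTERRUPTS() and a collation-aware comparison; the stub only counts calls and the comparison is plain bytewise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for CHECK_FOR_INTERRUPTS(): just counts calls. */
static long interrupt_checks = 0;
#define CHECK_FOR_INTERRUPTS_STUB() (interrupt_checks++)

/* Hypothetical stand-in for a collation-aware equality test (bytewise here). */
static bool
coll_equal(const char *a, size_t alen, const char *b, size_t blen)
{
    return alen == blen && memcmp(a, b, alen) == 0;
}

/*
 * Try every substring text[start, end) against the needle.  The number of
 * comparisons grows quadratically with tlen, so the outer loop checks for
 * interrupts once per start position.
 */
static bool
contains_slow(const char *text, size_t tlen, const char *needle, size_t nlen)
{
    assert(text != NULL && needle != NULL);

    for (size_t start = 0; start <= tlen; start++)
    {
        CHECK_FOR_INTERRUPTS_STUB();

        for (size_t end = start; end <= tlen; end++)
        {
            if (coll_equal(text + start, end - start, needle, nlen))
                return true;
        }
    }
    return false;
}
```

One check per start position keeps the overhead small while still letting a cancel request through after at most one pass of the inner loop.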

-- 
Heikki Linnakangas
Neon (https://neon.tech)
