Re: BUG #17584: SQL crashes PostgreSQL when using ICU collation

From	Peter Geoghegan
Subject	Re: BUG #17584: SQL crashes PostgreSQL when using ICU collation
Date	14 August 2022, 16:17:12
Msg-id	CAH2-WznXGos_OJ03wJANo_2S_9oVLiJE9E5p7ii6Y_jSpGnkfA@mail.gmail.com
In reply to	Re: BUG #17584: SQL crashes PostgreSQL when using ICU collation (Tom Lane <tgl@sss.pgh.pa.us>)
Responses	Re: BUG #17584: SQL crashes PostgreSQL when using ICU collation
List	pgsql-bugs

On Sun, Aug 14, 2022 at 7:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> When I build with --with-icu and run coverage testing on the core
> regression tests under LANG=en_US.utf8, I see that most of
> varstr_abbrev_convert() is reached, but *not* the two buggy
> buffer-enlargement stanzas.  So that explains why we've not seen this
> in testing.  I wonder whether there is a reasonably cheap way to test
> that.  The submitted test case is clearly out of the question to add
> as a regression test...

I don't think that it would be terribly difficult. I notice that the
similar "TEXTBUFLEN isn't big enough" code path in varstr_cmp()
already has test coverage. As does the corresponding point in the
equivalent SortSupport-based comparator, varstrfastcmp_locale().

Oh, hang on -- I see why it's tricky to test, now that I took a fresh look
at the code. I see that ucol_nextSortKeyPart()'s interface (used only
when the DB encoding is UTF-8) allows us to only request however many
bytes of the conditioned binary key that we need, which for us is
always just the first sizeof(Datum) bytes. That's probably another
factor that made this bug hard to reach; I suspect that
ucol_nextSortKeyPart() has a tendency to avoid needing very much
scratch space to produce a usable abbreviated key.
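
To illustrate why only the first sizeof(Datum) bytes matter, here is a
minimal sketch in plain C of packing the leading bytes of a binary sort
key into a fixed 8-byte abbreviated key. It is not the varlena.c code:
abbrev_key_from_sortkey() is a hypothetical helper, the sort key is
assumed to have been produced already (by ICU or anything else), and a
64-bit key is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not PostgreSQL code: build an 8-byte abbreviated
 * key from the first bytes of a binary sort key.  Shorter keys are
 * zero-padded; bytes past the eighth are ignored.
 */
uint64_t
abbrev_key_from_sortkey(const unsigned char *sortkey, size_t len)
{
    unsigned char prefix[sizeof(uint64_t)] = {0};
    size_t      n = len < sizeof(prefix) ? len : sizeof(prefix);
    uint64_t    result = 0;
    size_t      i;

    assert(sortkey != NULL || len == 0);

    if (n > 0)
        memcpy(prefix, sortkey, n);

    /*
     * Assemble the bytes most-significant first, so that comparing the
     * results as unsigned integers gives the same order as memcmp() on
     * the zero-padded prefixes.
     */
    for (i = 0; i < sizeof(prefix); i++)
        result = (result << 8) | prefix[i];

    return result;
}
```

Two keys that differ only after the eighth byte produce equal abbreviated
keys, so a caller would still need a full comparison to break such ties.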

I also suspect that the test case happens to tickle some obscure UCA
implementation detail in just the right/wrong way, where (for whatever
reason) it is necessary for the implementation to use a fairly large
buffer, despite the fact that it knows that its varlena.c caller will
only require enough conditioned binary key bytes to build an 8-byte
abbreviated key. It might be very rare and hard to hit (and/or depend
on the ICU version), which would explain why it took this long to hear
any complaints. So...I think that it might be quite difficult to test
this.

BTW, is the plan to get rid of the questionable coding pattern in both
varstr_abbrev_convert() and in varstrfastcmp_locale()? I am in favor
of just using repalloc() across the board.
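
For illustration only, a rough sketch of the "resize in place" idea in
standard C, with realloc() standing in for repalloc(). The actual
stanzas are not quoted in this thread, so ScratchBuf and
scratch_reserve() are hypothetical names, and the initial size and
doubling policy are assumptions. Note that realloc() has no notion of
memory contexts and reports failure by returning NULL, whereas
repalloc() keeps the chunk in its original context and raises an error.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scratch buffer, loosely modeled on a sort-support field. */
typedef struct ScratchBuf
{
    char       *data;
    size_t      len;
} ScratchBuf;

/*
 * Make sure buf can hold at least "needed" bytes.  The buffer is resized
 * in place, so the owner never holds a freed pointer and existing
 * contents are preserved.  Returns 0 on success, -1 on allocation
 * failure (in which case buf is left unchanged).
 */
int
scratch_reserve(ScratchBuf *buf, size_t needed)
{
    size_t      newlen;
    char       *newdata;

    assert(buf != NULL);

    if (needed <= buf->len)
        return 0;               /* already big enough */

    /* Grow geometrically so repeated enlargements stay cheap. */
    newlen = buf->len > 0 ? buf->len : 64;
    while (newlen < needed)
    {
        if (newlen > SIZE_MAX / 2)
        {
            newlen = needed;    /* avoid overflow when doubling */
            break;
        }
        newlen *= 2;
    }

    /* realloc(NULL, n) behaves like malloc(n), covering the first call. */
    newdata = realloc(buf->data, newlen);
    if (newdata == NULL)
        return -1;

    buf->data = newdata;
    buf->len = newlen;
    return 0;
}
```

A caller would start from a zero-initialized ScratchBuf, call
scratch_reserve() before each write, and free(buf.data) when done.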

--
Peter Geoghegan
