Re: BUG #17584: SQL crashes PostgreSQL when using ICU collation
From | Peter Geoghegan |
---|---|
Subject | Re: BUG #17584: SQL crashes PostgreSQL when using ICU collation |
Date | |
Msg-id | CAH2-WznXGos_OJ03wJANo_2S_9oVLiJE9E5p7ii6Y_jSpGnkfA@mail.gmail.com |
In reply to | Re: BUG #17584: SQL crashes PostgreSQL when using ICU collation (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: BUG #17584: SQL crashes PostgreSQL when using ICU collation |
List | pgsql-bugs |
On Sun, Aug 14, 2022 at 7:25 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> When I build with --with-icu and run coverage testing on the core
> regression tests under LANG=en_US.utf8, I see that most of
> varstr_abbrev_convert() is reached, but *not* the two buggy
> buffer-enlargement stanzas. So that explains why we've not seen this
> in testing. I wonder whether there is a reasonably cheap way to test
> that. The submitted test case is clearly out of the question to add
> as a regression test...

I don't think that it would be terribly difficult. I notice that the similar "TEXTBUFLEN isn't big enough" code path in varstr_cmp() already has test coverage. As does the corresponding point in the equivalent SortSupport-based comparator, varstrfastcmp_locale().

Oh, hang on -- I see why it's tricky to test, now that I took a fresh look at the code. I see that ucol_nextSortKeyPart()'s interface (used only when the DB encoding is UTF-8) allows us to request only as many bytes of the conditioned binary key as we need, which for us is always just the first sizeof(Datum) bytes. That's probably another factor that made this bug hard to reach; I suspect that ucol_nextSortKeyPart() has a tendency to avoid needing very much scratch space to produce a usable abbreviated key.

I also suspect that the test case happens to tickle some obscure UCA implementation detail in just the right/wrong way, where (for whatever reason) it is necessary for the implementation to use a fairly large buffer, despite the fact that it knows that its varlena.c caller will only require enough conditioned binary key bytes to build an 8-byte abbreviated key. It might be very rare and hard to hit (and/or depend on the ICU version), which would explain why it took this long to hear any complaints.

So...I think that it might be quite difficult to test this.

BTW, is the plan to get rid of the questionable coding pattern in both varstr_abbrev_convert() and in varstrfastcmp_locale()? I am in favor of just using repalloc() across the board.

--
Peter Geoghegan
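For readers unfamiliar with the idea behind "just using repalloc()", here is a minimal sketch of growing a scratch buffer by resizing the existing allocation instead of freeing it and allocating a new one. It is not the varlena.c code: the ScratchBuf struct and scratch_reserve() are hypothetical names, the doubling policy is illustrative, and the standard realloc() stands in for PostgreSQL's repalloc(), which additionally keeps the chunk in the memory context it was originally allocated in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical scratch buffer; not PostgreSQL's sort-support state. */
typedef struct ScratchBuf
{
    char   *data;   /* may be NULL while len == 0 */
    size_t  len;    /* allocated size in bytes */
} ScratchBuf;

/*
 * Make sure the buffer can hold at least `needed` bytes.
 *
 * The existing allocation is resized in place (realloc here, standing in
 * for repalloc), so there is no window where buf->data points at freed
 * memory and existing contents are preserved.
 *
 * Returns 0 on success, -1 if the allocation fails; on failure the
 * buffer is left unchanged.
 */
static int
scratch_reserve(ScratchBuf *buf, size_t needed)
{
    size_t  newlen;
    char   *newdata;

    assert(buf != NULL);

    if (needed <= buf->len)
        return 0;               /* already big enough */

    /* Double the size, but never allocate less than was asked for. */
    newlen = buf->len * 2;
    if (newlen < needed)
        newlen = needed;

    newdata = realloc(buf->data, newlen);
    if (newdata == NULL)
        return -1;

    buf->data = newdata;
    buf->len = newlen;
    return 0;
}
```

A caller would invoke scratch_reserve(&buf, n) before writing n bytes and free(buf.data) when finished. The trade-off of resizing is that the old contents are copied even when they are no longer needed, which only matters if enlargement happens on a hot path.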