Re: [HACKERS] ICU integration
От | Peter Geoghegan |
---|---|
Тема | Re: [HACKERS] ICU integration |
Дата | |
Msg-id | CAH2-Wzkt_bf5yBeoNMB95NG40tg_UNNvCpYLzhmAoNijG3LMPA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [HACKERS] ICU integration (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>) |
Ответы |
Re: [HACKERS] ICU integration
|
Список | pgsql-hackers |
On Wed, Feb 15, 2017 at 9:17 PM, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote: > I have changed it to use ucol_nextSortKeyPart() when appropriate. I think it would be fine if the second last argument was "sizeof(Datum)", not "Min(sizeof(Datum), sss->buflen2)" here: > + if (GetDatabaseEncoding() == PG_UTF8) > + { > + UCharIterator iter; > + uint32_t state[2]; > + UErrorCode status; > + > + uiter_setUTF8(&iter, sss->buf1, len); > + state[0] = state[1] = 0; /* won't need that again */ > + status = U_ZERO_ERROR; > + bsize = ucol_nextSortKeyPart(sss->locale->info.icu.ucol, > + &iter, > + state, > + (uint8_t *) sss->buf2, > + Min(sizeof(Datum), sss->buflen2), > + &status); > + if (U_FAILURE(status)) > + ereport(ERROR, > + (errmsg("sort key generation failed: %s", u_errorName(status)))); > + } Does this really need to be in the strxfrm() loop at all? I don't see why you need to ever retry here. There is an issue of style here: > - /* Just like strcoll(), strxfrm() expects a NUL-terminated string */ > memcpy(sss->buf1, authoritative_data, len); > + /* Just like strcoll(), strxfrm() expects a NUL-terminated string. > + * Not necessary for ICU, but doesn't hurt. */ I see that you have preserved strcoll() comparison caching (or should I say ucol_strcollUTF8() comparison caching?), at the cost of having to keep around a buffer which we must continue to copy every text string into within varstr_abbrev_convert(). That was probably the right trade-off. It doesn't hurt that it required minimal divergence within new ICU code, too. -- Peter Geoghegan
В списке pgsql-hackers по дате отправления: