Re: ICU integration

From	Peter Geoghegan
Subject	Re: ICU integration
Date	September 12, 2016, 18:04:13
Msg-id	CAM3SWZQ1uSbrVqmQAqLCtTTrM4Q47=9QByJALKnsyPAxdxJbcw@mail.gmail.com
In reply to	Re: ICU integration (Dave Page <dpage@pgadmin.org>)
List	pgsql-hackers

On Fri, Sep 9, 2016 at 6:39 AM, Dave Page <dpage@pgadmin.org> wrote:
> Looking back at my old emails, apparently ICU 5.0 and later include
> ucol_strcollUTF8() which avoids the need to convert UTF-8 characters
> to 16 bit before sorting. RHEL 6 has the older 4.2 version of ICU.

At the risk of stating the obvious, there is a reason why ICU
traditionally worked with UTF-16 natively. It's the same reason why
many OSes and application frameworks (e.g., Java) use UTF-16
internally, even though UTF-8 is much more popular on the web: UTF-16
allows certain low-level optimizations that are not possible with
UTF-8.
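
The message does not say which optimizations are meant. One commonly
cited property, offered here only as an illustration and not as the
author's own example, is that every character in the Basic Multilingual
Plane occupies exactly one 16-bit unit in UTF-16, while the same
characters take one to three bytes in UTF-8. A minimal Python sketch
(the helper name code_units and the sample characters are hypothetical,
chosen for this illustration):

```python
# Illustrative only: count the code units each encoding needs per character.
# The helper name and the sample characters are assumptions for this sketch.

def code_units(ch):
    """Return (UTF-8 bytes, UTF-16 16-bit units) needed to encode one character."""
    utf8_units = len(ch.encode("utf-8"))
    # "utf-16-le" is used so that no byte order mark is prepended.
    utf16_units = len(ch.encode("utf-16-le")) // 2
    return utf8_units, utf16_units

samples = ["a", "é", "日", "😀"]
widths = {ch: code_units(ch) for ch in samples}

for ch, (u8, u16) in widths.items():
    print(f"U+{ord(ch):04X}: {u8} UTF-8 byte(s), {u16} UTF-16 unit(s)")
```

The first three samples are BMP characters and need a single UTF-16
unit each; the last lies outside the BMP and needs a surrogate pair, so
UTF-16 is not strictly fixed-width either.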

I'm not saying that it would be just as good if we were to not use the
UTF-8 optimized stuff that ICU now has. My point is that it's not
useful to prejudge whether or not performance will be acceptable based
on a factor like this, which is ultimately just an implementation
detail. The ICU patch either performs acceptably as a substitute for
something like glibc, or it does not.

-- 
Peter Geoghegan
