Re: ICU integration
From | Peter Geoghegan |
---|---|
Subject | Re: ICU integration |
Date | |
Msg-id | CAM3SWZQ1uSbrVqmQAqLCtTTrM4Q47=9QByJALKnsyPAxdxJbcw@mail.gmail.com |
In response to | Re: ICU integration (Dave Page <dpage@pgadmin.org>) |
List | pgsql-hackers |
On Fri, Sep 9, 2016 at 6:39 AM, Dave Page <dpage@pgadmin.org> wrote:
> Looking back at my old emails, apparently ICU 5.0 and later include
> ucol_strcollUTF8() which avoids the need to convert UTF-8 characters
> to 16 bit before sorting. RHEL 6 has the older 4.2 version of ICU.

At the risk of stating the obvious, there is a reason why ICU traditionally worked with UTF-16 natively. It's the same reason why many OSes and application frameworks (e.g., Java) use UTF-16 internally, even though UTF-8 is much more popular on the web. Which is: there are certain low-level optimizations possible that are not possible with UTF-8.

I'm not saying that it would be just as good if we were to not use the UTF-8 optimized stuff that ICU now has. My point is that it's not useful to prejudge whether or not performance will be acceptable based on a factor like this, which is ultimately just an implementation detail. The ICU patch either performs acceptably as a substitute for something like glibc, or it does not.

--
Peter Geoghegan