Re: improve Chinese locale performance
From | Robert Haas |
---|---|
Subject | Re: improve Chinese locale performance |
Date | |
Msg-id | CA+TgmoaaGa3HyYmMKFgc4m2Cps8Vv7L8534-89D_AaA5YS1CqA@mail.gmail.com |
In response to | Re: improve Chinese locale performance (Martijn van Oosterhout <kleptog@svana.org>) |
List | pgsql-hackers |
On Sun, Jul 28, 2013 at 5:39 AM, Martijn van Oosterhout <kleptog@svana.org> wrote:
> On Tue, Jul 23, 2013 at 10:34:21AM -0400, Robert Haas wrote:
>> I pretty much lost interest in ICU upon reading that they use UTF-16
>> as their internal format.
>>
>> http://userguide.icu-project.org/strings#TOC-Strings-in-ICU
>
> The UTF-8 support has been steadily improving:
>
> For example, icu::Collator::compareUTF8() compares two UTF-8 strings
> incrementally, without converting all of the two strings to UTF-16 if
> there is an early base letter difference.
>
> http://userguide.icu-project.org/strings/utf-8
>
> For all other encodings you should be able to use an iterator. As to
> performance I have no idea.
>
> The main issue with strxfrm() is its lame API. If it supported
> returning prefixes you'd be set, but as it is you need >10MB of memory
> just to transform a 10MB string, even if only the first few characters
> would be enough to sort...

Yep, definitely. And by ">10MB" you mean ">90MB", at least on my Mac,
which is really outrageous.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
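
To make the strxfrm() complaint concrete: the standard C API can only report or produce the sort key for the entire input, so a caller has to size and fill a buffer for the whole transformed string even when a short prefix would decide the comparison. Below is a minimal sketch of that pattern. The helper names xfrm_size() and xfrm_dup() are hypothetical, made up for illustration; they are not from this thread or from PostgreSQL.

```c
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: number of bytes (excluding the terminating NUL)
 * that strxfrm() needs for the transformed form of s under the current
 * LC_COLLATE locale. With n == 0 the destination may be NULL. */
size_t xfrm_size(const char *s)
{
    return strxfrm(NULL, s, 0);
}

/* Hypothetical helper: transform the whole string into a freshly
 * allocated buffer. There is no way to request only a prefix of the
 * key, so the allocation always covers the full transformed length.
 * Returns NULL if malloc() fails; the caller frees the result. */
char *xfrm_dup(const char *s)
{
    size_t need = strxfrm(NULL, s, 0);
    char *buf = malloc(need + 1);

    if (buf == NULL)
        return NULL;
    strxfrm(buf, s, need + 1);
    return buf;
}

/* Print input length next to transformed length, to see how much the
 * current locale expands a given string. */
void report_expansion(const char *s)
{
    printf("input: %zu bytes, transformed: %zu bytes\n",
           strlen(s), xfrm_size(s));
}
```

To see the expansion on a particular platform, call setlocale(LC_COLLATE, "") first and then report_expansion() on a sample string. The ratio depends on the C library and the locale, which is why the numbers quoted above differ between systems.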