Re: C locale versus en_US.UTF8. (Was: String comparision in PostgreSQL)
From | Scott Marlowe |
---|---|
Subject | Re: C locale versus en_US.UTF8. (Was: String comparision in PostgreSQL) |
Date | |
Msg-id | CAOR=d=0taNujCoFDQ6g=n-yLD2xOfnE0dsu8AykgNqBK5+gQcQ@mail.gmail.com |
In response to | Re: C locale versus en_US.UTF8. (Was: String comparision in PostgreSQL) (Bruce Momjian <bruce@momjian.us>) |
Responses | Re: C locale versus en_US.UTF8. (Was: String comparision in PostgreSQL) |
List | pgsql-general |
On Wed, Aug 29, 2012 at 11:43 AM, Bruce Momjian <bruce@momjian.us> wrote:
> On Wed, Aug 29, 2012 at 10:31:21AM -0700, Aleksey Tsalolikhin wrote:
>> On Wed, Aug 29, 2012 at 9:45 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> > citext unfortunately doesn't allow for index optimization of LIKE
>> > queries, which IMNSHO defeats the whole purpose. to the best way
>> > remains to use lower() ...
>> > this will be index optimized and fast as long as you specified C
>> > locale for your database.
>>
>> What is the difference between C and en_US.UTF8, please? We see that
>> the same query (that invokes a sort) runs 15% faster under the C
>> locale. The output between C and en_US.UTF8 is identical. We're
>> considering moving our database from en_US.UTF8 to C, but we do deal
>> with internationalized text.
>
> Well, C has reduced overhead for string comparisons, but obviously
> doesn't work well for international characters. The single-byte
> encodings have somewhat less overhead than UTF8. You can try using C
> locales for databases that don't require non-ASCII characters.

I think you're confusing encodings with locales. C is a locale. You
can have a database with a locale of C and UTF-8 encoding.

create database clocale_utf8 encoding='UTF8' LC_COLLATE= 'C' template=template0;

\l
     Name     |  Owner   | Encoding | Collate |    Ctype    | Access privileges
--------------+----------+----------+---------+-------------+-------------------
 clocale_utf8 | smarlowe | UTF8     | C       | en_US.UTF-8 |

SQL_ASCII is the encoding equivalent of C locale, but it also allows
multi-byte characters.
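
To make the locale-versus-encoding point concrete, here is a rough Python sketch (not from the thread) of what a C collation over UTF-8 text amounts to: strings are ordered by their raw bytes rather than by linguistic rules. The word list and the helper name `c_locale_key` are made up for illustration, and the sketch only imitates the comparison behaviour; it does not talk to PostgreSQL.

```python
# Illustrative word list (hypothetical data, not from the thread).
words = ["banana", "Zebra", "apple", "éclair", "Apple"]


def c_locale_key(s: str) -> bytes:
    """Sort key imitating a byte-wise comparison of UTF-8 encoded text."""
    return s.encode("utf-8")


# Byte order: uppercase ASCII < lowercase ASCII < multi-byte UTF-8 sequences.
c_sorted = sorted(words, key=c_locale_key)

# Folding case first, in the spirit of the lower() advice quoted above,
# still leaves non-ASCII letters after every ASCII letter.
ci_sorted = sorted(words, key=lambda s: c_locale_key(s.lower()))

print(c_sorted)   # ['Apple', 'Zebra', 'apple', 'banana', 'éclair']
print(ci_sorted)  # ['apple', 'Apple', 'banana', 'Zebra', 'éclair']
```

The non-ASCII text is stored and returned intact in both cases; only the ordering differs. A linguistic collation such as en_US.UTF-8 would typically interleave upper and lower case and sort "éclair" near the other "e" words, which is the extra work that makes comparisons slower than the plain byte comparison.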