Re: Lower case
From | Dawid Kuroczko |
---|---|
Subject | Re: Lower case |
Date | |
Msg-id | 758d5e7f05012701442fddc68f@mail.gmail.com |
In reply to | Re: Lower case ("Vladimir S. Petukhov" <vladimir@sycore.org>) |
List | pgsql-general |
On Thu, 27 Jan 2005 00:16:14 +0000, Vladimir S. Petukhov <vladimir@sycore.org> wrote:
> > > LC_COLLATE: ru_RU
> > > LC_CTYPE: ru_RU
> > > Name | Owner | Encoding
> > > -----------+----------+----------
> > > testdb | postgres | UNICODE
> > > And LIKE, ILIKE, ~ do not recognize upper/lower case..
> >
> > What character encoding is implied by those LC_ settings on your machine?
> > If it's different from the database encoding (here utf8) these things
> > won't actually work right.
> LANG=ru_RU.koi8r
> LC_ALL=ru_RU.koi8r
> But how it act on lower/upper cases? Client use utf-8 encoding...

The client uses UTF-8 encoding, and so does the server. Text is stored as UTF-8. However, when you call the lower() function in PostgreSQL, it does more or less the following:

-- it retrieves the text row from the database. This text is in UTF-8 encoding.
-- it calls the strxfrm function on this text.
-- strxfrm sees that the current locale is ru_RU.koi8r
-- strxfrm then takes the UTF-8 encoded text and treats it as koi8r
-- strxfrm "skips over" characters it does not recognize (the UTF-8 chars)
-- strxfrm returns the transformed text
-- PostgreSQL takes the resulting text, believing it is still in UTF-8.

In other words, probably only Latin characters were subject to lower(); any "unknown" Russian UTF-8 characters were at best skipped.

Please note that PostgreSQL does not do an implicit utf8->koi8r->utf8 conversion when calling lower(). AFAIK it does not even know (or care) whether the current locale setting ("ru_RU") is for a different encoding than the current database's.

It is the DB admin's duty to make sure the cluster locale (set in initdb) is compatible with the database encoding (set in createdb).

Regards,
Dawid
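
The mismatch described in the message can be imitated outside the database. The following is only a rough Python sketch, not PostgreSQL's actual code path: it assumes a byte-wise case mapping that interprets every byte as a KOI8-R character, which is roughly what a single-byte locale routine would do with UTF-8 data. The function name lower_as_koi8r is hypothetical and exists only for this illustration.

```python
def lower_as_koi8r(raw: bytes) -> bytes:
    """Lowercase each byte as if it were one KOI8-R character.

    Hypothetical helper: it imitates a single-byte, locale-driven case
    mapping applied to data that is really UTF-8. It is an assumption
    made for illustration, not PostgreSQL's implementation.
    """
    out = bytearray()
    for b in raw:
        ch = bytes([b]).decode("koi8_r")      # misread the byte as KOI8-R
        try:
            mapped = ch.lower().encode("koi8_r")
        except UnicodeEncodeError:
            mapped = bytes([b])               # no KOI8-R lowercase: keep byte
        # Keep the original byte unless the mapping is exactly one byte.
        out += mapped if len(mapped) == 1 else bytes([b])
    return bytes(out)


if __name__ == "__main__":
    text = "ABC ПРИВЕТ"
    stored = text.encode("utf-8")             # what the database holds

    # Mismatched path: UTF-8 bytes lowercased under KOI8-R rules.
    mismatched = lower_as_koi8r(stored).decode("utf-8", errors="replace")

    # Matching path: decode with the real encoding first, then lowercase.
    matched = stored.decode("utf-8").lower()

    print(mismatched)
    print(matched)
```

In this sketch the ASCII letters are lowercased in both paths, while the uppercase Cyrillic letters only change when the bytes are decoded with the encoding they were actually stored in.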