Re: lc_collate issue
From | Tatsuo Ishii |
---|---|
Subject | Re: lc_collate issue |
Date | |
Msg-id | 20070825.111811.35679240.t-ishii@sraoss.co.jp |
In reply to | Re: lc_collate issue (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-general |
> Cody Pisto <cpisto@rvweb.com> writes:
> > If initdb was done with a C locale, and thus lc_collate and friends
> > where all C, but the database and client encoding was set to UTF-8,
> > would postgres convert data on the fly from UTF-8(storage) to ASCII for
> > sorting or would things just blow up when a >1 byte character hit the mix?
>
> No, C locale just sorts the bytes. It won't "blow up". Whether it will
> give you a sort ordering you like for multibyte characters is a
> different question.

Yup. For example, the LATIN1 part of UTF-8 (Unicode) is physically ordered the same as ISO 8859-1. So if you consider the order of ISO 8859-1 "natural", then the sort order of UTF-8 is OK as well. However, the order of the CJK part of UTF-8 is totally different from that of the original character sets (almost random), so you need to use convert() to convert UTF-8 to the original encoding to get a "natural" sort order. I don't think you are interested in the CJK part, though.

--
Tatsuo Ishii
SRA OSS, Inc. Japan
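
The byte-ordering point above can be illustrated outside the database. The following is only a sketch in Python, not PostgreSQL code: the helper `byte_sorted` and the sample words are made up for illustration, and sorting by encoded bytes is assumed to stand in for what a C-locale comparison does.

```python
# Illustration only: emulate a byte-wise ("C" locale style) sort by
# comparing the encoded bytes of each string. byte_sorted is a
# hypothetical helper, not part of PostgreSQL.
def byte_sorted(words, encoding):
    """Return words ordered by their bytes in the given encoding."""
    return sorted(words, key=lambda w: w.encode(encoding))

# Characters in the Latin-1 range: UTF-8 byte order follows code point
# order, and so does ISO 8859-1, so both encodings give the same result.
latin = ["zebra", "étude", "apple", "über", "ñandú"]
print(byte_sorted(latin, "utf-8") == byte_sorted(latin, "latin-1"))

# CJK characters: the byte order in UTF-8 need not match the byte order
# in a legacy encoding such as EUC-JP.
cjk = ["東", "日"]
print(byte_sorted(cjk, "utf-8"))
print(byte_sorted(cjk, "euc_jp"))
```

Comparing the last two printed lists shows how the same two kanji can come out in a different order depending on which encoding's bytes are compared, which is why converting to the original encoding changes the sort result.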