Re: Patch for collation using ICU
From | Tatsuo Ishii |
---|---|
Subject | Re: Patch for collation using ICU |
Date | |
Msg-id | 20050508.090845.39153917.t-ishii@sra.co.jp |
In reply to | Re: Patch for collation using ICU ("John Hansen" <john@geeknet.com.au>) |
List | pgsql-hackers |
> Bruce Momjian wrote:
> >
> > There are two reasons for that optimization --- first, some
> > locale support is broken and Unicode encoding with a C locale
> > crashes (not an issue for ICU), and second, it is an
> > optimization for languages like Japanese that want to use
> > unicode, but don't need a locale because upper/lower means
> > nothing in those character sets.
>
> No, upper/lower means nothing in those languages, so why would you need
> to optimize upper/lower if they're not used??
> And if they are, it's obviously because the text contains characters
> from other languages (probably english) and as such they should behave
> correctly.

Yes, the Japanese (and probably Chinese and Korean) languages include
ASCII characters. More precisely, ASCII is part of the Japanese
encodings (LATIN1 is not, however). And we have no problem at all with
the glibc/C locale. See below ("unitest" is a UNICODE database).

    unitest=# create table t1(t text);
    CREATE TABLE
    unitest=# \encoding EUC_JP
    unitest=# insert into t1 values('abcあいう');
    INSERT 1842628 1
    unitest=# select upper(t) from t1;
       upper
    -----------
     ABCあいう
    (1 row)

So Japanese (including ASCII)/UNICODE behavior is perfectly correct at
this moment. So I strongly object to removing that optimization.
--
Tatsuo Ishii
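For readers who want to reproduce the point outside a database, here is a minimal Python sketch of the behavior shown in the session above. It is not PostgreSQL's code: `ascii_upper` is a hypothetical helper that only assumes what the C-locale path amounts to for this sample, namely uppercasing ASCII a-z and leaving every other character alone. Python's built-in `str.upper()` stands in for a Unicode-aware case mapping.

```python
def ascii_upper(text: str) -> str:
    """Uppercase only ASCII a-z; leave every other character untouched.

    Hypothetical helper for illustration, not PostgreSQL's implementation.
    """
    return "".join(
        chr(ord(ch) - 32) if "a" <= ch <= "z" else ch
        for ch in text
    )


sample = "abcあいう"

# ASCII-only mapping: the hiragana pass through unchanged.
print(ascii_upper(sample))  # ABCあいう

# Unicode-aware mapping: hiragana have no case, so the result is the same.
print(sample.upper())       # ABCあいう
```

For mixed ASCII/Japanese text the two mappings agree, which matches the `upper(t)` result above. They would only differ for text containing non-ASCII cased letters, such as "é".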