Re: [HACKERS] Re: locales and MB (was: Postgres 6.5 beta2 and beta3 problem)
From | Tatsuo Ishii |
---|---|
Subject | Re: [HACKERS] Re: locales and MB (was: Postgres 6.5 beta2 and beta3 problem) |
Date | |
Msg-id | 199906111514.AAA00712@ext16.sra.co.jp |
In response to | Re: [HACKERS] Re: locales and MB (was: Postgres 6.5 beta2 and beta3 problem) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: [HACKERS] Re: locales and MB (was: Postgres 6.5 beta2 and beta3 problem)
|
List | pgsql-hackers |
> Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> > Currently the mb support allows several internal
> > encodings including Unicode and mule-internal-code.
> > (yes, you can do regexp/like to Unicode data if mb support is
> > enabled).
>
> One of the things that bothers me about makeIndexable() is that it
> doesn't seem to be multibyte-aware; does it really work in MB case?

Yes. This is because I carefully chose multibyte encodings for the backend that have the following characteristics:

o if the 8th bit of a byte is off, then it is an ASCII character
o otherwise, it is part of a non-ASCII multibyte character

With these assumptions, makeIndexable() works very well with multibyte chars.

Not all multibyte encodings satisfy the above conditions. For example, SJIS (an encoding for Japanese) and Big5 (for traditional Chinese) do not satisfy those requirements. In these encodings the first byte of a double-byte character always has the 8th bit on. However, the second byte sometimes has the 8th bit off: this means we cannot distinguish it from ASCII, since it may accidentally match the bit pattern of an ASCII char. This is why I do not allow SJIS and Big5 as server encodings. Users can use SJIS and Big5 as the client encoding, however.

You might ask why I don't make makeIndexable() multibyte-aware. It is definitely possible. But you should know there are many places that need to be multibyte-aware in this sense. The parser is one good example. Making everything in the backend multibyte-aware is not worth doing, in my opinion.
---
Tatsuo Ishii
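
The two byte-level properties described in the message can be checked with a small sketch. This is only an illustration in Python, not backend code: the helper name has_ascii_looking_byte is hypothetical, and the choice of EUC-JP and UTF-8 as examples of "safe" encodings is an assumption made for the demonstration (the message itself names only Unicode and mule-internal-code on the safe side, and SJIS and Big5 on the unsafe side).

```python
def has_ascii_looking_byte(ch: str, encoding: str) -> bool:
    """Return True if a non-ASCII character encodes to any byte with the 8th bit off."""
    encoded = ch.encode(encoding)
    if len(encoded) == 1 and encoded[0] < 0x80:
        return False  # a plain ASCII character, nothing to check
    # Any byte below 0x80 inside a multibyte character could be mistaken for ASCII.
    return any(b < 0x80 for b in encoded)


sample = "\u8868"  # a kanji whose Shift_JIS second byte falls in the ASCII range
for enc in ("euc_jp", "utf-8", "shift_jis"):
    raw = sample.encode(enc)
    hex_bytes = " ".join(f"{b:02x}" for b in raw)
    print(enc, hex_bytes, has_ascii_looking_byte(sample, enc))
```

For this character the second Shift_JIS byte is 0x5C, the same value as an ASCII backslash, so a routine that scans byte by byte for special characters would misread it. The EUC-JP and UTF-8 encodings of the same character keep the 8th bit on in every byte.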