Re: A rough roadmap for internationalization fixes
From: Kurt Roeckx
Subject: Re: A rough roadmap for internationalization fixes
Date:
Msg-id: 20031125181336.GA13791@ping.be
In reply to: Re: A rough roadmap for internationalization fixes (Tatsuo Ishii <t-ishii@sra.co.jp>)
Responses: Re: A rough roadmap for internationalization fixes
List: pgsql-hackers
On Tue, Nov 25, 2003 at 08:40:57PM +0900, Tatsuo Ishii wrote:
> > On Tue, 25 Nov 2003, Peter Eisentraut wrote:
> >
> > I've always thought unicode was enough to even represent Japanese. Then
> > the client encoding can be something else that we can convert to. In any
> > way, the encoding of the message catalog has to be known to the system so
> > it can be converted to the correct encoding for the client.
>
> I'm tired of telling that Unicode is not that perfect.

Maybe the actual problems should be explained, instead of just saying it "isn't perfect"?

From what I understand, the problem only exists when converting from a "legacy" encoding to Unicode or the other way around; there is no problem if you stop doing the conversion.

The conversion is problematic because what a legacy encoding represents with a single character can correspond to several distinct characters in Unicode. Some examples people might recognize:

- µ: in ISO 8859-1 it is 0xB5; in Unicode it can be U+00B5 (MICRO SIGN) or U+03BC (GREEK SMALL LETTER MU).
- Å: in ISO 8859-1 it is 0xC5; in Unicode it can be U+00C5 (LATIN CAPITAL LETTER A WITH RING ABOVE) or U+212B (ANGSTROM SIGN).
- The ohm sign vs. the Greek letter omega.
- Quotation marks: you have a left double quote, a right double quote, and a few others.

> Another gottcha
> with Unicode is the UTF-8 encoding (currently we use) consumes 3
> bytes for each Kanji character, while other encodings consume only 2
> bytes. IMO 3/2 storage ratio could not be neglected for database use.

Unicode can be encoded in different ways, and UTF-8 is only one of them. Is there a problem with using UCS-2 (other than that it would require more storage for ASCII)?

Kurt
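P.S. In case a concrete illustration helps, here is a small throwaway Python sketch of both points. The code points come from the Unicode charts; the script itself is just a demo I made up, not anything from the PostgreSQL tree.

    import unicodedata

    # ISO 8859-1 byte 0xB5 decodes to U+00B5 MICRO SIGN ...
    micro = b'\xb5'.decode('iso-8859-1')
    print(unicodedata.name(micro))             # MICRO SIGN

    # ... but compatibility normalization maps it to U+03BC,
    # which no longer round-trips back to ISO 8859-1.
    mu = unicodedata.normalize('NFKC', micro)
    print(unicodedata.name(mu))                # GREEK SMALL LETTER MU
    print(mu.encode('iso-8859-1', 'replace'))  # b'?'

    # Same ambiguity for 0xC5: U+212B ANGSTROM SIGN normalizes to U+00C5.
    print(unicodedata.normalize('NFC', '\u212b') == '\u00c5')   # True

    # The storage point: a Kanji character takes 3 bytes in UTF-8
    # but only 2 in UCS-2/UTF-16.
    kanji = '\u65e5'
    print(len(kanji.encode('utf-8')))          # 3
    print(len(kanji.encode('utf-16-be')))      # 2

The normalization step is exactly the kind of choice a conversion routine has to make on its own, which is where the ambiguity bites.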