Re: Unicode problems on IRC
From | Andrew - Supernews |
---|---|
Subject | Re: Unicode problems on IRC |
Date | |
Msg-id | slrnd5jsg1.2ilg.andrew+nonews@trinity.supernews.net |
In reply to | Re: Unicode problems on IRC ("John Hansen" <john@geeknet.com.au>) |
List | pgsql-hackers |
On 2005-04-10, "John Hansen" <john@geeknet.com.au> wrote:

> That's right, dono how I missed that one, but looks correct to me, and
> is in line with the code in ConvertUTF.c from unicode.org, on which I
> based the patch, extended to support 6 byte utf8 characters.

Frankly, you should probably de-extend it back down to 4 bytes. That's enough to encode the Unicode range of 0x000000 - 0x10FFFF, and enough other stuff would break if anyone allocated a character outside that range that I don't think it is worth worrying about. (Even the ISO people have agreed to conform to that limitation.)

Even if insanity struck simultaneously at both standards bodies, 4 bytes is enough to go to 0x1FFFFF, so there is still substantial slack. (A number of other specifications based on utf-8 have removed the 5 and 6 byte sequences too, so there is substantial precedent for this.)

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
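The byte-count arithmetic in the message can be checked with a short sketch. This is only an illustration in Python, not code from the patch or from ConvertUTF.c; the helper names `max_code_point` and `encoded_length` are made up here. It assumes the original UTF-8 layout, where the lead byte of an n-byte sequence (n >= 2) carries 7 - n payload bits and each continuation byte carries 6.

```python
# Illustrative sketch only: helper names are hypothetical, not from the patch.

UNICODE_MAX = 0x10FFFF  # top of the Unicode code space


def max_code_point(num_bytes: int) -> int:
    """Largest value an n-byte sequence can hold in the original 1-6 byte UTF-8 scheme."""
    if not 1 <= num_bytes <= 6:
        raise ValueError("UTF-8 sequences are 1 to 6 bytes in the original scheme")
    if num_bytes == 1:
        payload_bits = 7  # 0xxxxxxx
    else:
        # Lead byte: n one-bits, a zero bit, then 7 - n payload bits.
        # Each continuation byte (10xxxxxx) adds 6 payload bits.
        payload_bits = (7 - num_bytes) + 6 * (num_bytes - 1)
    return (1 << payload_bits) - 1


def encoded_length(code_point: int) -> int:
    """Shortest sequence length (in bytes) that can hold code_point."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    for num_bytes in range(1, 7):
        if code_point <= max_code_point(num_bytes):
            return num_bytes
    raise ValueError("value does not fit in 6 bytes")


if __name__ == "__main__":
    for n in range(1, 7):
        print(f"{n} byte(s): up to 0x{max_code_point(n):X}")
    print("bytes needed for U+10FFFF:", encoded_length(UNICODE_MAX))
    print("unused 4-byte values above U+10FFFF:", max_code_point(4) - UNICODE_MAX)
```

Under these assumptions, 4 bytes reach 0x1FFFFF, so the whole Unicode range fits with room to spare, and 5 and 6 byte sequences are only needed for values that Unicode does not assign.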