Re: Bug in UTF8-Validation Code?
From | Andrew - Supernews |
---|---|
Subject | Re: Bug in UTF8-Validation Code? |
Date | |
Msg-id | slrnf14mfc.2i67.andrew+nonews@atlantis.supernews.net |
In response to | Bug in UTF8-Validation Code? (Mario Weilguni <mweilguni@sime.com>) |
List | pgsql-hackers |
On 2007-04-03, "Albe Laurenz" <all@adv.magwien.gv.at> wrote:
> According to RFC 2279, the Euro,
> Unicode code point 0x20AC = 0010 0000 1010 1100,
> will be encoded to 1110 0010 1000 0010 1010 1100 = 0xE282AC.
>
> IMHO this is the only good and intuitive way for CHR() and ASCII().

It is beyond ludicrous for functions like chr() or ascii() to convert a
Euro sign to 0xE282AC rather than 0x20AC. "Intuitive"? There is _NO SUCH
THING_ as 0xE282AC as a representation of a Unicode character - there is
either the code point, 0x20AC (which is a _number_), or the sequences of
_bytes_ that represent that code point in various encodings, of which the
three-byte sequence 0xE2 0x82 0xAC is the one used in UTF-8.

Functions like chr() and ascii() should be dealing with the _number_ of
the code point, not with its representation in transfer encodings.

--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services
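
The distinction between a code point (a number) and its encoded bytes can be sketched in a few lines of Python. This is only an illustration, not PostgreSQL's implementation: Python's built-in `chr()` and `ord()` happen to work on code points, and the helper `utf8_three_byte` and the variable names are hypothetical, introduced here to mirror the bit layout in the quoted message.

```python
# Code point for the Euro sign: a number, independent of any encoding.
EURO_CODE_POINT = 0x20AC

def utf8_three_byte(code_point: int) -> bytes:
    """Encode a code point in the range U+0800..U+FFFF as three UTF-8 bytes.

    Hypothetical helper for illustration; surrogates are not rejected here.
    """
    if not 0x0800 <= code_point <= 0xFFFF:
        raise ValueError("code point is outside the three-byte UTF-8 range")
    return bytes([
        0xE0 | (code_point >> 12),          # 1110xxxx: top 4 bits
        0x80 | ((code_point >> 6) & 0x3F),  # 10xxxxxx: middle 6 bits
        0x80 | (code_point & 0x3F),         # 10xxxxxx: low 6 bits
    ])

euro = chr(EURO_CODE_POINT)             # number -> character
code_point = ord(euro)                  # character -> number (0x20AC)
utf8_bytes = euro.encode("utf-8")       # three bytes: E2 82 AC
utf16_bytes = euro.encode("utf-16-be")  # two bytes: 20 AC
manual_bytes = utf8_three_byte(EURO_CODE_POINT)

# Reading the UTF-8 bytes as one big-endian integer gives 0xE282AC,
# which is an artifact of one encoding rather than the code point.
packed = int.from_bytes(utf8_bytes, "big")

print(hex(code_point), utf8_bytes.hex(), utf16_bytes.hex(), hex(packed))
```

The same code point yields different byte sequences under UTF-8 and UTF-16, while `ord()` returns 0x20AC in both cases, which is the behaviour argued for above.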