Re: invalidly encoded strings
From | Martijn van Oosterhout |
---|---|
Subject | Re: invalidly encoded strings |
Date | |
Msg-id | 20070910160805.GF16512@svana.org |
In reply to | Re: invalidly encoded strings (Tatsuo Ishii <ishii@postgresql.org>) |
List | pgsql-hackers |
On Tue, Sep 11, 2007 at 12:30:51AM +0900, Tatsuo Ishii wrote:
> Why do you think that employing the Unicode code point as the chr()
> argument could avoid endianness issues? Are you going to represent
> Unicode code point as UCS-4? Then you have to specify the endianness
> anyway. (see the UCS-4 standard for more details)

Because the argument to chr() is an integer, which has no endian-ness.
You only get into endian-ness if you look at how you store the
resulting string.

> Also I'd like to point out all encodings has its own code point
> systems as far as I know. For example, EUC-JP has its corresponding
> code point systems, ASCII, JIS X 0208 and JIS X 0212. So I don't see
> we can't use "code point" as chr()'s argument for othe encodings(of
> course we need optional parameter specifying which character set is
> supposed).

Oh, the last discussion on this didn't answer this question. Is there a
standard somewhere that maps integers to characters in EUC-JP? If so,
how can I find out what character 512 is?

Have a nice day,
-- 
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.
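
To illustrate the point about the chr() argument being a plain integer, here is a minimal sketch in Python rather than PostgreSQL. It assumes Python's built-in chr(), which takes a Unicode code point; this is not the PostgreSQL function under discussion, and the code point 0x3042 is picked purely as an example. Byte order only shows up once the resulting string is encoded into a multi-byte storage format such as UCS-4/UTF-32.

```python
# Illustration only: Python's chr(), not PostgreSQL's.
# The code point is an ordinary integer, so it has no byte order.
code_point = 0x3042  # HIRAGANA LETTER A, chosen arbitrarily for the example

ch = chr(code_point)  # one-character string

# Byte order becomes a question only when choosing a storage encoding.
utf8 = ch.encode("utf-8")          # UTF-8 has a single byte sequence
utf32_be = ch.encode("utf-32-be")  # UCS-4 / UTF-32, big-endian
utf32_le = ch.encode("utf-32-le")  # UCS-4 / UTF-32, little-endian

print(utf8.hex())      # e38182
print(utf32_be.hex())  # 00003042
print(utf32_le.hex())  # 42300000

# Decoding either byte order gives back the same integer code point.
assert ord(utf32_be.decode("utf-32-be")) == code_point
assert ord(utf32_le.decode("utf-32-le")) == code_point
```

The same integer goes in every time; only the two UTF-32 byte strings differ, and they differ only in byte order.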