Re: ERROR: could not convert UTF-8 character 0x00ef to ISO8859-1 possiblesolution
From | Anders Hermansen |
---|---|
Subject | Re: ERROR: could not convert UTF-8 character 0x00ef to ISO8859-1 possiblesolution |
Date | |
Msg-id | 20050427115434.GB30285@online.no |
In reply to | ERROR: could not convert UTF-8 character 0x00ef to ISO8859-1 possiblesolution (Mauricio Hernández Durán <mhernandez@ingenian.com>) |
Responses |
Re: ERROR: could not convert UTF-8 character 0x00ef to ISO8859-1 possiblesolution
Re: ERROR: could not convert UTF-8 character 0x00ef to ISO8859-1 possiblesolution |
List | pgsql-jdbc |
* Guillaume Cottenceau (gc@mnc.ch) wrote:
> Anders Hermansen <anders 'at' yoyo.no> writes:
> > * Guillaume Cottenceau (gc@mnc.ch) wrote:
> > > Isn't there a problem with your UTF-8 data containing 0x00EF?
> >
> > E0 to EF hex (224 to 239): first byte of a three-byte sequence.
>
> Well 00 is first byte here, isn't it?

UTF-8 is a byte sequence, so it's not about the first byte in the whole
sequence, but about the first byte of a three-byte sequence.

A NUL (0x00) byte never appears inside a multi-byte UTF-8 sequence; it can
only stand for U+0000 itself. I believe this is in the specification to
keep UTF-8 compatible with C NUL-terminated strings.

I believe that the byte sequence 0x00 0xEF is illegal UTF-8 because:
1) It contains a NUL (0x00) byte, which cannot be part of another character
2) 0xEF is not followed by two more (continuation) bytes

On the other hand, U+00EF is a valid Unicode code point, which points to:
LATIN SMALL LETTER I WITH DIAERESIS

It is encoded:
As 0xC3AF in UTF-8
As 0x00EF in UTF-16 (and UCS-2 ?)
As 0xEF in ISO-8859-1


Anders Hermansen
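
The encodings listed above can be checked with a few lines of Python. This is only a minimal sketch using Python's built-in codecs, not anything from the JDBC driver or the server; the helper name `is_valid_utf8` is made up for illustration.

```python
# U+00EF, LATIN SMALL LETTER I WITH DIAERESIS
ch = "\u00ef"

# The same code point in the three encodings discussed in the message.
utf8 = ch.encode("utf-8")          # two bytes: 0xC3 0xAF
utf16be = ch.encode("utf-16-be")   # two bytes: 0x00 0xEF (big-endian, no BOM)
latin1 = ch.encode("iso-8859-1")   # one byte: 0xEF


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data decodes as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The bytes 0x00 0xEF read as UTF-8: 0x00 decodes as U+0000, but 0xEF
# announces a three-byte sequence and its two continuation bytes are missing.
suspect = bytes([0x00, 0xEF])

print(utf8.hex(), utf16be.hex(), latin1.hex())
print(is_valid_utf8(utf8), is_valid_utf8(suspect))
```

If the error message's 0x00ef is really the code point U+00EF (or its UTF-16 form) rather than raw UTF-8 bytes, that would be consistent with the distinction drawn above, but that reading is an assumption.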