Re: UTF8 encoding problem
From | Giorgio Valoti |
---|---|
Subject | Re: UTF8 encoding problem |
Date | |
Msg-id | B03E7678-25C9-4D9D-8805-42F59A88E515@mac.com |
In reply to | Re: UTF8 encoding problem (Michael Fuhr <mike@fuhr.org>) |
List | pgsql-general |

On 18/giu/08, at 15:00, Michael Fuhr wrote:

> On Wed, Jun 18, 2008 at 08:25:07AM +0200, Giorgio Valoti wrote:
>> On 18/giu/08, at 03:04, Michael Fuhr wrote:
>>> Is the data UTF-8? If the error is 'invalid byte sequence for
>>> encoding "UTF8": 0xa3' then you probably need to set client_encoding
>>> to latin1, latin9, or win1252.
>>
>> Why?
>
> UTF-8 has rules about what byte values can occur in sequence;
> violations of those rules mean that the data isn't valid UTF-8.
> This particular error says that the database received a byte with
> the value 0xa3 (163) in a sequence of bytes that wasn't valid UTF-8.
>
> The UTF-8 byte sequence for the pound sign (£) is 0xc2 0xa3. If
> Garry got this error (I don't know if he did; I was asking) then
> the byte 0xa3 must have appeared in some other sequence that wasn't
> valid UTF-8. The usual reason for that is that the data is in some
> encoding other than UTF-8.
>
> Common encodings for Western European languages are Latin-1
> (ISO-8859-1), Latin-9 (ISO-8859-15), and Windows-1252. All three
> of these encodings use a lone 0xa3 to represent the pound sign. If
> the data has a pound sign as 0xa3 and the database complains that
> it isn't part of a valid UTF-8 sequence then the data is likely to
> be in one of these other encodings.

Much clearer now, thank you Michael.

--
Giorgio Valoti
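
For anyone who wants to check the byte values Michael describes, here is a minimal sketch in Python. It is not part of the original exchange and does not touch PostgreSQL; it only uses Python's built-in codecs to show that a lone 0xa3 is rejected as UTF-8 but decodes to the pound sign in the three single-byte encodings mentioned. The names `is_valid_utf8` and `single_byte_decodings` are made up for this illustration.

```python
# Illustrative only: how the byte 0xa3 behaves under the encodings named above.
POUND = "\u00a3"  # the pound sign

# Valid UTF-8 for the pound sign is the two-byte sequence 0xc2 0xa3.
utf8_bytes = POUND.encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# A lone 0xa3 is a continuation byte with no lead byte, so UTF-8 rejects it.
lone = b"\xa3"

# The same single byte is a complete character in these encodings.
single_byte_decodings = {
    name: lone.decode(name)
    for name in ("latin-1", "iso-8859-15", "cp1252")
}

if __name__ == "__main__":
    print("UTF-8 bytes for the pound sign:", utf8_bytes)
    print("lone 0xa3 valid as UTF-8?", is_valid_utf8(lone))
    for name, char in single_byte_decodings.items():
        print(f"0xa3 decoded as {name}: {char!r}")
```

If data like this reaches a UTF8 database from a client that is really sending Latin-1, Latin-9, or Windows-1252, setting client_encoding to the matching value (as suggested in the quoted message) lets the server convert it instead of rejecting it.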