Re: [HACKERS] Re: COPY doesn't works when containing ' ' or ' ' characters on db
From | Tatsuo Ishii |
---|---|
Subject | Re: [HACKERS] Re: COPY doesn't works when containing ' ' or ' ' characters on db |
Date | |
Msg-id | 20010228100134Q.t-ishii@sra.co.jp |
In reply to | Re: COPY doesn't works when containing ' ' or ' ' characters on db (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-admin |
> "Oliver Elphick" <olly@lfix.co.uk> writes:
> > I think this happens when the front-end encoding is SQL_ASCII and the
> > database is using UNICODE. Then, there are misunderstandings between
> > front-end and back-end, so that a single character with the eighth bit
> > set may be sent by the front-end and interpreted by the back-end as the
> > first half of a UNICODE two-byte character.
>
> I wondered about that, but his examples had one or more characters
> between the eighth-bit-set character and the '|', so this doesn't seem
> to explain the problem.

No. From Jaume's example:

> SELECT edicion FROM products;
>      edicion
> -----------------
>  España|Nacional      <-------puts on the same cell either there's an '|' in
> the middle!!!

\361 == 0xf1. UTF-8 assumes that:

if (the first byte) & 0xe0 == 0xe0, then the letter consists of 3 bytes.

So PostgreSQL believes that "ña|" is a single UTF-8 letter and eats up
the '|'.

My guess is that Jaume created a UNICODE database but loaded it with
ISO 8859-1 or some similar single-byte Latin-encoded data.

I wonder why so many people use a UTF-8 database even though they do
not understand what UTF-8 is :-)

I hope 7.1 will resolve this kind of confusion by enabling automatic
encoding conversion between UTF-8 and other encodings.
--
Tatsuo Ishii
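
The length rule described in the message can be sketched in Python to show how the delimiter disappears. This is only an illustration of the rule as stated above, not PostgreSQL's actual code: the names `mblen_sketch` and `split_chars` are hypothetical, and treating a stray continuation byte as a one-byte character is an assumption made to keep the loop simple.

```python
def mblen_sketch(first_byte):
    """Guess a character's byte length from its first byte.

    Follows the rule quoted in the message; it is not a full UTF-8 validator.
    """
    if (first_byte & 0x80) == 0x00:
        return 1          # plain 7-bit ASCII
    if (first_byte & 0xE0) == 0xC0:
        return 2          # lead byte of a two-byte sequence
    if (first_byte & 0xE0) == 0xE0:
        return 3          # 0xf1 lands here under the rule in the message
    return 1              # assumption: stray continuation byte counts as one


def split_chars(data):
    """Split a byte string into 'characters' using mblen_sketch."""
    chars = []
    i = 0
    while i < len(data):
        n = mblen_sketch(data[i])
        chars.append(data[i:i + n])
        i += n
    return chars


# The row as single-byte Latin-1 data, which is what was probably loaded.
latin1_chars = split_chars("España|Nacional".encode("latin-1"))

# The same row as genuine UTF-8, which is what a UNICODE database expects.
utf8_chars = split_chars("España|Nacional".encode("utf-8"))

print(latin1_chars)          # the 0xf1 byte absorbs the following 'a' and '|'
print(b"|" in latin1_chars)  # False: the delimiter is no longer a character
print(b"|" in utf8_chars)    # True: the delimiter survives
```

With Latin-1 input the '|' is consumed as the third byte of a bogus character, so a delimiter scan never sees it; with real UTF-8 input the same row splits correctly.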