Re: Bug in UTF8-Validation Code?
From | Tatsuo Ishii |
---|---|
Subject | Re: Bug in UTF8-Validation Code? |
Date | |
Msg-id | 20070405.065206.51277520.t-ishii@sraoss.co.jp |
In reply to | Re: Bug in UTF8-Validation Code? (Mark Dilger <pgsql@markdilger.com>) |
List | pgsql-hackers |
> Tatsuo Ishii wrote:
>
> > <SNIP>. I think we need to continue design discussion, probably
> > targeting for 8.4, not 8.3.
>
> The discussion came about because Andrew - Supernews noticed that chr()
> returns invalid utf8, and we're trying to fix all the bugs with invalid
> utf8 in the system. Something needs to be done, even if we just check
> the result of the current chr() implementation and throw an error on
> invalid results. But do we want to make this minor change for 8.3 and
> then change it again for 8.4?

My opinion was in the part you snipped from my previous mail: limit
chr() to the ASCII range.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

> Here's an example of the current problem. It's an 8.2.3 database with
> utf8.en_US encoding
>
>
> mark=# create table testutf8 (t text);
> CREATE TABLE
> mark=# insert into testutf8 (t) (select chr(gs) from
> generate_series(0,255) as gs);
> INSERT 0 256
> mark=# \copy testutf8 to testutf8.data
> mark=# truncate testutf8;
> TRUNCATE TABLE
> mark=# \copy testutf8 from testutf8.data
> ERROR: invalid byte sequence for encoding "UTF8": 0x80
> HINT: This error can also happen if the byte sequence does not match
> the encoding expected by the server, which is controlled by
> "client_encoding".
> CONTEXT: COPY testutf8, line 129
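
For readers following the thread, here is a rough sketch in C of why the reload fails at 0x80 and what "limiting chr() to ASCII range" could look like. It is not PostgreSQL source: the function names `single_byte_is_valid_utf8`, `chr_ascii_only`, and `first_invalid_single_byte` are hypothetical, and rejecting code 0 is an assumption made here because a NUL byte cannot be stored in a text value. A byte standing alone is valid UTF-8 only when it is below 0x80, so the first failing value in the 0..255 series is 128, which matches the error on line 129 of the COPY file.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not PostgreSQL code: a byte that stands alone is
 * valid UTF-8 only in the ASCII range. 0x80..0xBF are continuation bytes
 * and 0xC0..0xFF are lead bytes (or never valid), so none of them can
 * appear as a complete one-byte sequence.
 */
bool single_byte_is_valid_utf8(unsigned char b)
{
    return b < 0x80;
}

/*
 * Hypothetical stand-in for a chr() limited to ASCII. Writes the byte to
 * *out and returns true for codes 1..127. Returns false otherwise, at the
 * point where a real implementation would raise an error.
 * Assumption: code 0 is rejected because text cannot hold a NUL byte.
 */
bool chr_ascii_only(int code, char *out)
{
    if (code < 1 || code > 0x7F)
        return false;
    *out = (char) code;
    return true;
}

/*
 * Returns the first code in 0..255 whose single-byte form is not valid
 * UTF-8, or -1 if every code passes.
 */
int first_invalid_single_byte(void)
{
    for (int code = 0; code <= 0xFF; code++)
    {
        if (!single_byte_is_valid_utf8((unsigned char) code))
            return code;
    }
    return -1;
}
```

With this restriction, `chr_ascii_only(65, &c)` would store 'A', while `chr_ascii_only(128, &c)` would be refused up front instead of producing a row that later breaks \copy from. Whether codes above 127 should instead be encoded as proper multibyte UTF-8 is the open design question in this thread; the sketch only covers the ASCII-only option.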