Multibyte characters handling bug in varchar()
From | Edward |
---|---|
Subject | Multibyte characters handling bug in varchar() |
Date | |
Msg-id | 001d01c227ce$a28ed0d0$06c6a8c0@ncvillas.com |
List | pgsql-bugs |
Hello,

I am using PostgreSQL 7.1 on Linux (RedHat 7.1). My database encoding is 'EUC_CN'. The application accesses the database through PG JDBC 2.0. I have defined a table like this:

    create table test1 (
        id integer default not null,
        memo varchar(128)
    );

The memo field lets users record comments and the like. They enter Chinese (GB2312 or GBK encoding) mixed with ASCII.

The problem occurs when the input string is longer than 128 bytes and the 128th and 129th bytes together form one Chinese character (Chinese characters take two bytes in GB2312 and GBK).

The insert query runs without any error, but the getString method then returns a zero-length String for the field.

A further complication: when I pg_dump the database and try to restore it, the script produced by pg_dump (with the -D flag, which dumps data as INSERT commands with attribute names) cannot be restored. When I checked the script, I found that the memo field of this record is dumped without the closing single quote. The 128th byte and the single quote that follows it are together read as another, unrecognized Chinese character, and that is why the restore fails. Below is the dump for this record:

    INSERT INTO "test1" ("id","memo") VALUES (5,'=A8=B0=A8=B0=A1=C1??=A1=EC=A8=AA???=A6=CC?=A8=BA?=A8=B0=A8=A4=A8=A6??=A8=AEGH=A6=CC=A3=A4?a??=A8=AC????=A8=A4?=A1=C2=A8=B0a=A8=BA?5??1=A8=A8??=A8=A23=A8=A8?=A8=B0=A8=B0=A6=CC=A8=B2=A8=B0??=A8=AE?a?=A8=AC=A1=E32??=A8=A2?=A8=B0???D??=A1=C01=A1=E8?=A3=A4=A8=AC?=A8=A2???=A8=AC=A8=AC=A1=EA?=A8=AA?=A6=CC?=A8=BA=A1=C0????=A1=C1=A1=E9=A8=B0a?1?=A8=A6??=A8=BA1=A8=B0=A6=CC?=A1=C2=A8=AA???=A8=B0?=A8=B0a?=A8=AE?');

I believe multibyte characters are not handled properly in this case. I look forward to hearing from the dev team.

Finally, I think PostgreSQL is an excellent database, but the name "PostgreSQL" seems very difficult to pronounce, and that is probably one obstacle preventing people from getting to know it better.

Thanks to the dev team for all the hard work; you have done an excellent job!

Best regards,
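
The failure mode described in the report can be sketched outside the database. The following is a minimal Python illustration, not PostgreSQL's actual code: the helper names `clip_bytes` and `clip_chars` are hypothetical, and the sketch assumes valid EUC_CN/GBK-style input in which any byte with the high bit set starts a two-byte character. It compares cutting a value at 128 bytes with cutting it at the last whole character that fits.

```python
# Hypothetical helpers for illustration only; not taken from PostgreSQL.

def clip_bytes(data: bytes, limit: int) -> bytes:
    """Byte-wise truncation that ignores character boundaries."""
    return data[:limit]


def clip_chars(data: bytes, limit: int) -> bytes:
    """Truncate to at most `limit` bytes without splitting a character.

    Assumption: input is valid EUC_CN/GBK-style text, so a byte >= 0x80
    is the first byte of a two-byte character and anything else is ASCII.
    """
    end = 0
    while end < len(data):
        width = 2 if data[end] >= 0x80 else 1
        if end + width > limit:
            break  # the next character would cross the limit
        end += width
    return data[:end]


# 127 ASCII bytes followed by two Chinese characters (U+4E2D, U+6587),
# 131 bytes in total, so byte 128 is the first half of a two-byte character.
memo = ("a" * 127 + "\u4e2d\u6587").encode("gb2312")

broken = clip_bytes(memo, 128)   # ends with a dangling lead byte
intact = clip_chars(memo, 128)   # stops before the split character

try:
    broken.decode("gb2312")
    broken_decodes = True
except UnicodeDecodeError:
    broken_decodes = False

print(len(broken), broken_decodes)
print(len(intact), intact.decode("gb2312") == "a" * 127)
```

In this sketch the byte-wise result keeps a lone lead byte at position 128. A reader that pairs that byte with whatever follows it, such as the closing single quote in a dump script, would consume the quote, which matches the restore failure described in the report.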