Bug #659: lower()/upper() bug on ->multibyte<- DB
From | pgsql-bugs@postgresql.org |
---|---|
Subject | Bug #659: lower()/upper() bug on ->multibyte<- DB |
Date | |
Msg-id | 20020507145112.BE39A476356@postgresql.org |
Responses | Re: Bug #659: lower()/upper() bug on ->multibyte<- DB |
List | pgsql-bugs |
Michael Enke (michael.enke@wincor-nixdorf.com) reports a bug with a severity of 2.
The lower the number, the more severe it is.

Short Description
lower()/upper() bug on ->multibyte<- DB

Long Description
OS: Linux Kernel 2.4.4, PostgreSQL version 7.2.1

lower() and upper() don't work as expected for multibyte databases. They work fine for one-byte encodings. The behaviour can be reproduced as follows:

At initdb, LC_CTYPE was set to de_DE.

    createdb -E UTF-8 name
    export PGCLIENTENCODING=LATIN1
    psql -U name

--------------------------------------------------
=> select lower('Ä'); -- german umlaut A, capital
ERROR: Could not convert UTF-8 to ISO8859-1
-- I expected to see: ä german umlaut a, lower case
--------------------------------------------------
=> select lower('ä'); -- german umlaut a, lower case
ERROR: Could not convert UTF-8 to ISO8859-1
-- I expected to see: ä german umlaut a, lower case
--------------------------------------------------
=> select upper('ä'); -- it doesn't translate
ä
-- I expected to see: Ä
--------------------------------------------------
=> select upper('Ä'); -- this works fine
Ä
--------------------------------------------------

The same happens to Ö and Ü (O umlaut, U umlaut).

If you want to reproduce this and don't have ä/Ä on your keyboard, you can create a table with one column of type varchar(1) (on a MB DB). Create a file with the following input:

    ae is \u00e4
    AE is \u00c4

From Java, use the command:

    native2ascii -reverse -utf8 <this-file> <new-file>

In <new-file> you will see: in the first line 2 bytes, A (with tilde on top) and the Euro symbol; in the second line 2 bytes, A (with tilde on top) and a dotted box.

Unset PGCLIENTENCODING and call psql:

    insert into table values('<copy and paste first two bytes>');
    insert into table values('<copy and paste second two bytes>');

Then export PGCLIENTENCODING=LATIN1 and, in psql:

    select * from table;

will show you the a-umlaut and A-umlaut.

Sample Code
No file was uploaded with this report
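
The report does not name a cause. One hypothesis that would fit all four observations is that the case functions are applied one byte at a time under a single-byte (Latin-1 style) locale, even though the stored text is UTF-8. The Python sketch below only simulates that assumption; it is not PostgreSQL's code, and the helper name `bytewise_case` is made up for this illustration.

```python
# Simulation of the ASSUMED failure mode: case mapping applied per byte,
# treating every byte as a Latin-1 character, on UTF-8 encoded input.
# This is a hypothesis about the behaviour, not PostgreSQL source code.

def bytewise_case(data: bytes, mapper) -> bytes:
    """Apply `mapper` (str.lower or str.upper) to each byte as Latin-1."""
    out = bytearray()
    for b in data:
        mapped = mapper(bytes([b]).decode("latin-1"))
        try:
            enc = mapped.encode("latin-1")
        except UnicodeEncodeError:
            enc = b""  # mapping left Latin-1; treat as "no case mapping"
        # Keep the original byte unless the mapping is exactly one byte.
        out.append(enc[0] if len(enc) == 1 else b)
    return bytes(out)


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    for name, mapper in (("lower", str.lower), ("upper", str.upper)):
        for ch in ("Ä", "ä"):
            result = bytewise_case(ch.encode("utf-8"), mapper)
            print(name, repr(ch), result.hex(), "valid UTF-8:", is_valid_utf8(result))
```

Under this assumption, both lower() calls change the lead byte 0xC3 to 0xE3 and leave an invalid UTF-8 sequence, which would explain a conversion error on the way to the LATIN1 client, while both upper() calls leave the bytes untouched, so 'Ä' looks correct and 'ä' comes back unchanged.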