bug?!: lower() breaks unicode
From | Holger Klawitter |
---|---|
Subject | bug?!: lower() breaks unicode |
Date | |
Msg-id | 200311251322.09770.lists@klawitter.de |
List | pgsql-general |
Hi there,

I have a database with encoding UNICODE. When I run lower() on a column containing non-ASCII characters, the code for "sharp s" gets garbled. This happens with postgres 7.3.[234] under Linux (Debian woody and SuSE 7.2). I have made sure that LC_CTYPE=C for the server:

    create table t ( a text, b text );
    -- my terminal is latin1
    \encoding latin1
    insert into t (a) values( 'Fuß' );
    update t set b = lower(a);
    select * from t;
    ERROR: Could not convert UTF-8 to ISO8859-1

Apparently the UTF-8 special codes get lowercased, as the following selects yield different results:

    -- show me all
    \encoding unicode
    select a from t;
    select b from t;

    Fuà   // select a from t
    fu    // select b from t, should be "fuÃ"

The JDBC code breaks even more badly:

    java.lang.ArrayIndexOutOfBoundsException: 9
        at org.postgresql.core.Encoding.decodeUTF8(Encoding.java:254)

From the release docs of 7.4 it does not seem that this issue has been addressed.

The database has been initialized to "de_DE@euro", but this shouldn't matter, should it?

Mit freundlichem Gruß / With kind regards
Holger Klawitter
--
lists <at> klawitter <dot> de
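
The symptom described in the message can be reproduced outside the database. The following is a minimal sketch in Python, not PostgreSQL's actual code. It assumes (the message does not confirm this) that lower() applied a single-byte ISO 8859-1 case table to each byte of the UTF-8 string. The helper names `latin1_lower_bytewise` and `is_valid_utf8` are hypothetical and exist only for this illustration.

```python
# Sketch: what happens when UTF-8 bytes are lowercased one byte at a time
# with a Latin-1 (ISO 8859-1) case table.
# Assumption (not confirmed by the message): the server's lower() behaved
# like this because its locale used a single-byte character set.


def latin1_lower_bytewise(data: bytes) -> bytes:
    """Treat every byte as one Latin-1 character and lowercase it."""
    # Latin-1 maps bytes 0x00..0xFF to U+0000..U+00FF one-to-one, and every
    # lowercase form of those characters is again inside Latin-1.
    return data.decode("latin-1").lower().encode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form a well-formed UTF-8 sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# "ß" is U+00DF, stored in UTF-8 as the two bytes 0xC3 0x9F.
original = "Fuß".encode("utf-8")

# Byte 0xC3 is "Ã" in Latin-1, and its lowercase form "ã" is byte 0xE3.
mangled = latin1_lower_bytewise(original)

# Lowercasing the decoded text instead leaves "ß" untouched.
correct = "Fuß".lower().encode("utf-8")

print(original, is_valid_utf8(original))
print(mangled, is_valid_utf8(mangled))
print(correct, is_valid_utf8(correct))
```

Under this assumption the lead byte 0xC3 becomes 0xE3, which announces a three-byte sequence although only one continuation byte follows. The result is no longer valid UTF-8, so it cannot be converted to ISO8859-1, and a decoder that trusts the lead byte could read past the end of its buffer, which would be consistent with the JDBC exception quoted in the message.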