bug?!: lower() breaks unicode

From	Holger Klawitter
Subject	bug?!: lower() breaks unicode
Date	25 November 2003, 08:22:52
Msg-id	200311251322.09770.lists@klawitter.de
List	pgsql-general

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi there,

I have a database with encoding UNICODE. When I run lower() on a column
containing non-ASCII characters, the encoding of "sharp s" gets garbled. This
happens with postgres 7.3.[234] under Linux (Debian woody and SuSE 7.2). I
have made sure that LC_CTYPE=C for the server:

 create table t ( a text, b text );
 -- my terminal is latin1
 \encoding latin1
 insert into t (a) values( 'Fuß' );
 update t set b = lower(a);
 select * from t;

ERROR:  Could not convert UTF-8 to ISO8859-1

Apparently the bytes of the UTF-8 multibyte sequences get lowercased, as the
following selects yield different results:

 -- show me all
 \encoding unicode
 select a from t;
 select b from t;

FuÃ     // select a from t
fu      // select b from t, should be "fuÃ"
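
To make the suspected mechanism concrete, here is a minimal Python sketch. It rests on an assumption that the post does not confirm: that the server's lower() lowercases each byte separately under a single-byte, Latin-1-style locale. The helper names are hypothetical and this is not PostgreSQL's code.

```python
def bytewise_latin1_lower(data: bytes) -> bytes:
    """Lowercase every byte as if it were one Latin-1 character.

    ASSUMPTION: this imitates a byte-by-byte tolower() in a single-byte
    locale; it is an illustration, not the server's implementation.
    """
    return data.decode("latin-1").lower().encode("latin-1")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes form well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# "Fu\u00df" is "Fuss" written with sharp s (U+00DF).
original = "Fu\u00df".encode("utf-8")
mangled = bytewise_latin1_lower(original)

print(original, is_valid_utf8(original))
print(mangled, is_valid_utf8(mangled))
```

Under that assumption the lead byte 0xC3 of the sharp s becomes 0xE3, which announces a three-byte sequence that never arrives, so the result is no longer valid UTF-8. That would be consistent with the conversion error above.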


The JDBC code breaks even more badly.

java.lang.ArrayIndexOutOfBoundsException: 9
      at org.postgresql.core.Encoding.decodeUTF8(Encoding.java:254)
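
One plausible reading of this exception is a decoder that trusts each lead byte and reads continuation bytes without a bounds check. The Python sketch below is a hypothetical illustration of that failure mode, not the driver's actual decodeUTF8 code. The byte string b"fu\xe3\x9f" is an assumed example of what a byte-wise lower() might produce from "Fuß".

```python
def naive_decode_utf8(data: bytes) -> str:
    """Decode UTF-8 while trusting every lead byte (no bounds checks).

    Hypothetical decoder for illustration only; it handles 1-, 2- and
    3-byte sequences and assumes the input is well formed.
    """
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            # Single-byte (ASCII) sequence.
            cp = b
            i += 1
        elif b < 0xE0:
            # Two-byte sequence: one continuation byte follows.
            cp = ((b & 0x1F) << 6) | (data[i + 1] & 0x3F)
            i += 2
        else:
            # Treated as a three-byte sequence: two continuation bytes follow.
            cp = ((b & 0x0F) << 12) | ((data[i + 1] & 0x3F) << 6) | (data[i + 2] & 0x3F)
            i += 3
        out.append(chr(cp))
    return "".join(out)


print(naive_decode_utf8(b"Fu\xc3\x9f"))  # well-formed input decodes normally

try:
    naive_decode_utf8(b"fu\xe3\x9f")     # assumed mangled input
except IndexError as exc:
    print("decoder ran past the end of the buffer:", exc)
```

Python's IndexError plays the role that ArrayIndexOutOfBoundsException plays in Java here: the lead byte promises more bytes than the buffer holds.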

From the release docs of 7.4 it does not seem that this issue has been
addressed. The database has been initialized to "de_DE@euro", but this
shouldn't matter, should it?

Mit freundlichem Gruß / With kind regards
    Holger Klawitter
- --
lists <at> klawitter <dot> de
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (GNU/Linux)

iD8DBQE/w0lx1Xdt0HKSwgYRAieWAJ9GHr/CAmh7mXYrM99LNzYimQa+qgCeIlKR
D0+YgVkdlbQtXbAEd9T8/eE=
=NO8s
-----END PGP SIGNATURE-----
