Discussion: BUG #2261: ILIKE seems to be buggy on koi8 input
The following bug has been logged online:
Bug reference: 2261
Logged by: Evgeny Gridasov
Email address: eugrid@fpm.kubsu.ru
PostgreSQL version: 8.1.2
Operating system: Debian Linux
Description: ILIKE seems to be buggy on koi8 input
Details:
my terminal is RU_ru.KOI8-R,
template1's encoding is UTF8.
ILIKE seems to be buggy when comparing russian strings,
while UPPER/LOWER works OK.
template1=# \encoding koi8;
try to get uppercase of some russian letters:
template1=# select upper('фыва');
upper
-------
ФЫВА
(1 row)
result is OK!
next, try to compare uppercase and lowercase using
ILIKE:
template1=# select true where 'фыва' ilike 'ФЫВА';
bool
------
(0 rows)
OOPS! Nothing happened. But why?
try the same but with latin charset letters:
template1=# select true where 'asdf' ilike 'ASDF';
bool
------
t
(1 row)
Try to compare lowercase with lowercase (russian):
template1=# select true where 'фыва' ilike 'фыва';
bool
------
t
(1 row)
it works.
"Evgeny Gridasov" <eugrid@fpm.kubsu.ru> writes:
> my terminal is RU_ru.KOI8-R,
> template1's encoding is UTF8.
> ILIKE seems to be buggy when comparing russian strings,
> while UPPER/LOWER works OK.
I'll bet that the database's locale setting is expecting some encoding
other than UTF8 :-(. You need to have compatible locale and encoding
settings inside the database. You didn't say exactly what the database
LC_COLLATE value is, but if it's RU_ru.KOI8-R, that definitely does not
match UTF8.
regards, tom lane
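
To illustrate why the locale and the database encoding have to agree, here is a minimal Python sketch (not PostgreSQL code). It assumes the test string from the report is 'фыва' and only shows that the same text is a different byte sequence in KOI8-R and in UTF-8, so routines that expect one encoding misread data stored in the other.

```python
# Illustration only: the same Cyrillic text has different bytes
# in KOI8-R (one byte per letter) and UTF-8 (two bytes per letter).
text = "фыва"  # assumed test string from the bug report

koi8_bytes = text.encode("koi8_r")
utf8_bytes = text.encode("utf-8")

print(len(koi8_bytes), koi8_bytes.hex(" "))
print(len(utf8_bytes), utf8_bytes.hex(" "))

# Reading UTF-8 bytes as if they were KOI8-R does not raise an error,
# it silently yields different characters.
misread = utf8_bytes.decode("koi8_r")
print(misread == text)
```

The last comparison is false: a locale that expects KOI8-R bytes cannot classify or case-map UTF-8 data correctly.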
Evgeny Gridasov <eugrid@fpm.kubsu.ru> writes:
> postgresql server starts with environment:
> LC_COLLATE=en_US.UTF-8
> LC_ALL=en_US.UTF-8
> LANG=en_US.UTF-8
Well, that setting shouldn't translate much except A-Z/a-z. If you want
cyrillic upper/lower case conversions you need database's LC_CTYPE to be
ru_RU.something.
regards, tom lane
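
The point about A-Z/a-z can be modelled with a small Python sketch. This is a simplified model, not PostgreSQL's ILIKE implementation: the helper names ascii_lower, ascii_only_iequal and unicode_iequal are hypothetical, and the Cyrillic test strings are assumed to be 'фыва' and 'ФЫВА'. A comparison that folds case only for ASCII letters matches 'asdf' against 'ASDF' but not the Cyrillic pair, which is the pattern seen in the report.

```python
def ascii_lower(s: str) -> str:
    """Lowercase only A-Z; every other character is left untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def ascii_only_iequal(a: str, b: str) -> bool:
    """Case-insensitive equality that knows only the ASCII alphabet."""
    return ascii_lower(a) == ascii_lower(b)


def unicode_iequal(a: str, b: str) -> bool:
    """Case-insensitive equality using Unicode-aware lowercasing."""
    return a.lower() == b.lower()


print(ascii_only_iequal("asdf", "ASDF"))   # Latin letters match
print(ascii_only_iequal("фыва", "ФЫВА"))   # Cyrillic case is not folded
print(ascii_only_iequal("фыва", "фыва"))   # identical strings still match
print(unicode_iequal("фыва", "ФЫВА"))      # Cyrillic-aware folding matches
```

The sketch does not explain why upper() and lower() worked in the same session; it only shows what case-insensitive matching looks like when the case rules cover nothing beyond ASCII.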
postgresql server starts with environment:
LC_COLLATE=en_US.UTF-8
LC_ALL=en_US.UTF-8
LANG=en_US.UTF-8

I've tried to set different LC_COLLATE/LC_ALL/LANG settings but it did
not help. I've tried to change my psql input to unicode russian, but it
did not help, too.

'show all' says I've got lc_collate and other lc_* set to en_US.UTF-8.
initdb was run with this locale. It cannot be modified setting it in
postgresql.conf (creation db constant?)

Should I reinit database to get this working or what?
If I should reinit db, what locale should I choose?

BTW, ~* syntax does not also work with upper/lower case russian letters,
while upper()/lower() still work ok.

On Wed, 15 Feb 2006 12:44:18 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Evgeny Gridasov" <eugrid@fpm.kubsu.ru> writes:
> > my terminal is RU_ru.KOI8-R,
> > template1's encoding is UTF8.
> > ILIKE seems to be buggy when comparing russian strings,
> > while UPPER/LOWER works OK.
>
> I'll bet that the database's locale setting is expecting some encoding
> other than UTF8 :-(. You need to have compatible locale and encoding
> settings inside the database. You didn't say exactly what the database
> LC_COLLATE value is, but if it's RU_ru.KOI8-R, that definitely does not
> match UTF8.
>
> regards, tom lane

--
Evgeny Gridasov
Software Engineer
I-Free, Russia
Evgeny Gridasov wrote:
> It cannot be modified setting it in postgresql.conf (creation db
> constant?) Should I reinit database to get this working or what?

Yes.

> If I should reinit db, what locale should I choose?

Something like ru_RU.utf8.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/