Обсуждение: BUG #2554: ILIKE operator works incorrectly

Поиск
Список
Период
Сортировка

BUG #2554: ILIKE operator works incorrectly

От
"JarosÅaw Bojar"
Дата:
The following bug has been logged online:

Bug reference:      2554
Logged by:          Jarosław Bojar
Email address:      jarek.bojar@gmail.com
PostgreSQL version: 8.1.4
Operating system:   i386-redhat-linux-gnu, compiled by GCC
i386-redhat-linux-gcc (GCC) 4.1.0
Description:        ILIKE operator works incorrectly
Details:

ILIKE operator works incorrectly with UTF8 encoding and Polish characters.
Consider following SQL statements:

CREATE DATABASE test ENCODING='UTF8';
\c test
CREATE TABLE the_table (val VARCHAR(50));
INSERT INTO the_table (val) VALUES ('Świat');
INSERT INTO the_table (val) VALUES ('Łąka');
INSERT INTO the_table (val) VALUES ('Ćma');
INSERT INTO the_table (val) VALUES ('abc');
INSERT INTO the_table (val) VALUES ('ABC');

Without Polish characters ILIKE works correctly:
SELECT * FROM the_table WHERE val ilike 'abc';
 val
-----
 abc
 ABC
(2 rows)

But with Polish characters it does not work correctly. Following queries
should give single row results, but they do not return any rows:

SELECT * FROM the_table WHERE val ilike 'świat';
 val
-----
(0 rows)

SELECT * FROM the_table WHERE val ilike 'łąka';
 val
-----
(0 rows)

SELECT * FROM the_table WHERE val ilike 'ćma';
 val
-----
(0 rows)

On the contrary functions like UPPER work correctly with Polish characters
and following queries produce correct results:

SELECT * FROM the_table WHERE UPPER(val) like UPPER('świat');
  val
-------
 Świat
(1 row)

 SELECT * FROM the_table WHERE UPPER(val) like UPPER('ćma');
 val
-----
 Ćma
(1 row)

SELECT * FROM the_table WHERE UPPER(val) like UPPER('łąka');
 val
------
 Łąka
(1 row)

The bug is also present in PostgreSQL 8.1.0 on Windows XP.

Re: BUG #2554: ILIKE operator works incorrectly

От
Peter Eisentraut
Дата:
JarosÅaw Bojar wrote:
> ILIKE operator works incorrectly with UTF8 encoding and Polish
> characters.

What does SHOW lc_collate say?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

Re: BUG #2554: ILIKE operator works incorrectly

От
Tom Lane
Дата:
Peter Eisentraut <peter_e@gmx.net> writes:
> JarosÅaw Bojar wrote:
>> ILIKE operator works incorrectly with UTF8 encoding and Polish
>> characters.

> What does SHOW lc_collate say?

Even if the locale is set right, iwchareq() is completely broken
for multibyte charsets: it's trying to apply the system's tolower()
to a pg_wchar, which is highly unlikely to work.  This has been a
known problem for awhile, but no one's stepped up to fix it.

            regards, tom lane