Re: Sigh, LIKE indexing is *still* broken in foreign locales
От | Erich Stamberger |
---|---|
Тема | Re: Sigh, LIKE indexing is *still* broken in foreign locales |
Дата | |
Msg-id | Pine.LNX.4.21.0006090101230.6349-100000@gewi.kfunigraz.ac.at обсуждение исходный текст |
Ответ на | Sigh, LIKE indexing is *still* broken in foreign locales (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Sigh, LIKE indexing is *still* broken in foreign locales
|
Список | pgsql-hackers |
On Wed, 7 Jun 2000, Tom Lane wrote: > Moucha Václav <MouchaV@Radiomobil.cz> writes: > > 1. Compilation > > ./configure --enable-locale # not needed for RPMS precompiled binaries > > > 2. Starting postmaster > > export LC_CTYPE=cs_CZ > > export LC_COLLATE=cs_CZ # this setting is important for the > > bug result > > postmaster -S -D /home/pgsql/data -o '-Fe' > > > 3. SQL steps > > create table test (name text); > > insert into test values ('á'); # the first char is E1 from LATIN 2 > > coding > > insert into test values ('áb'); > > create index test_index on test (name); > > set cpu_tuple_cost=1; # force backend to use index > > scanning > > select * from test where name like 'á%'; > > > BUG: Only 1 line is selected with 'á' only instead of both lines. > > The problem here is that given the search pattern '\341%', the planner > generates index limit conditions > name >= '\341' AND name < '\342'; > > Apparently, in CZ locale it is true that '\341' is less than '\342', > but it does not follow from that that all strings starting with '\341' > are less than '\342'. In fact '\341b' is considered greater than '\342'. > Hm. The character that follows 0xe1 in iso-8859-2 order is "a + circumflex" (Oxe2) which is - as far as I know - not part of the Czech alphapet. The successors of 0xe1 in Czech collation order (code points from iso-8859-2) are 0x042 (capital B) and 0x062 (small B). => name >= '0xe1' AND (name < '0x062' OR name < '0x042') provided comparision is done by strcoll(). Another interresting feature of Czech collation is: H < "CH" < I and: B < C < C + CARON < D .. < H < "CH" < I So what happens with "WHERE name like 'Czec%`" ? Regards Erich
В списке pgsql-hackers по дате отправления: