Re: [HACKERS] another locale problem
From | Daniel Kalchev |
---|---|
Subject | Re: [HACKERS] another locale problem |
Date | |
Msg-id | 199906111638.TAA10681@dcave.digsys.bg |
In response to | Re: [HACKERS] another locale problem (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: [HACKERS] another locale problem |
List | pgsql-hackers |
>>>Tom Lane said:
> Daniel Kalchev <daniel@digsys.bg> writes:
> > To summarize the problem. If key contains (equivalent cyrillic
> > letters) 'ABC', 'ABCD', 'DAB' and 'ABX' and the query is:
> >
> > SELECT key FROM t WHERE key ~* '^AB';
> >
> > indexscan will be used and the correct tuples ('ABC', 'ABCD' and
> > 'ABX') will be returned. If the query is
> >
> > SELECT key FROM t WHERE key ~* '^ab';
> >
> > index scan will be used and no tuples will be returned.
>
> Hm. Is it possible that isalpha() is doing the wrong thing on your
> machine? makeIndexable() currently assumes that isalpha() returns true
> for any character that is subject to case conversion, but I wonder
> whether that's a good enough test.

In fact, after giving it some thought... the expression in gram.y

    (strcmp(opname,"~*") == 0 && isalpha(n->val.val.str[pos])))

is wrong. The statement, in my view, decides that a regular expression is not indexable if it contains special characters or if it contains non-alpha characters. Therefore, the statement should be written as:

    (strcmp(opname,"~*") == 0 && !isalpha((unsigned char)n->val.val.str[pos])))

(two fixes :)

This makes indexes work for '^abc' (lowercase ASCII). But the query then does not find anything, which means the regex does not work. It fails for both ASCII and non-ASCII text/patterns. :-(

> The other possibility is that regexp's internal handling of
> case-insensitive matching is not right.

I believe it to be terribly wrong, and some releases ago it worked with 8-bit characters by just compiling it with -funsigned-char. Now this breaks things...

Daniel
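As an illustration of the `(unsigned char)` cast discussed in the message: standard C requires the argument of `isalpha()` to be representable as an `unsigned char` (or equal to `EOF`). On platforms where plain `char` is signed, an 8-bit byte such as a Cyrillic letter in a single-byte encoding would otherwise be passed as a negative value. The sketch below is not the gram.y code; `byte_is_alpha` and `pattern_has_alpha` are hypothetical helper names used only for this example.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from gram.y): returns 1 if the byte at s[pos]
 * is alphabetic in the current locale, 0 otherwise. The cast matters
 * because plain char may be signed, so a byte >= 0x80 would reach
 * isalpha() as a negative int, which is undefined unless it equals EOF.
 */
int byte_is_alpha(const char *s, size_t pos)
{
    return isalpha((unsigned char) s[pos]) != 0;
}

/*
 * Hypothetical helper: returns 1 if any byte of the pattern is alphabetic,
 * i.e. the pattern could be affected by case folding under "~*".
 */
int pattern_has_alpha(const char *pattern)
{
    for (size_t i = 0; pattern[i] != '\0'; i++)
    {
        if (byte_is_alpha(pattern, i))
            return 1;
    }
    return 0;
}
```

With the default "C" locale only ASCII letters count as alphabetic; whether 8-bit letters are recognized depends on the locale selected with `setlocale()`, which this sketch leaves to the caller.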