Re: Sigh, LIKE indexing is *still* broken in foreign locales
From | Giles Lean |
---|---|
Subject | Re: Sigh, LIKE indexing is *still* broken in foreign locales |
Date | |
Msg-id | 2958.960446519@nemeton.com.au |
In reply to | Sigh, LIKE indexing is *still* broken in foreign locales (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Sigh, LIKE indexing is *still* broken in foreign locales |
List | pgsql-hackers |
On Wed, 07 Jun 2000 22:22:06 -0400 Tom Lane wrote:

> Since '\341' and '\342' are two different accented forms of 'a'
> (if I'm looking at the right character set), this is perhaps not so
> improbable as all that. Evidently the collation rule is that different
> accent forms sort the same unless the strings would otherwise be
> considered equal, in which case an ordering is assigned to them.

I thought that was common, but while I've worked on internationalisation
issues sometimes, I'm no linguist.

> So, the rule we thought we had for generating index bounds falls flat,
> and we're back to the same old question: given a proposed prefix string,
> how can we generate bounds that are certain to be considered <= and >=
> all strings starting with that prefix?

To confess ignorance, why does PostgreSQL need to generate such bounds?

Complete string comparisons with a locale-aware function such as
strcoll() are safe. Using less than a full string is tricky indeed, and
I'm not sure it is possible in general, although it might be.

Other problematic cases are likely to include one-to-two collations
(ß in German, for example) and two-to-one collations (the reverse, but
I've forgotten my example. Anyone?)

Then there are wide characters, including some encodings that are
stateful.

Regards,

Giles
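
To make the quoted problem concrete, here is a minimal sketch of how a prefix bound built by incrementing the last character can miss a matching row once accents are only a tie-breaker. Everything in it is an assumption for illustration: `toy_key` is a hypothetical two-level collation (base letters first, the original string second), not the behaviour of strcoll() in any real locale, and `naive_bounds` is not PostgreSQL's actual code.

```python
import unicodedata


def toy_key(s: str):
    """Hypothetical two-level collation key (an assumption, not a real locale):
    compare base letters first, and use the accented form only to break ties."""
    base = "".join(
        c for c in unicodedata.normalize("NFD", s)
        if not unicodedata.combining(c)
    )
    return (base, s)


def naive_bounds(prefix: str):
    """Illustrative bounds for LIKE 'prefix%': the prefix itself as the lower
    bound, and the prefix with its last character incremented as the upper."""
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, upper


def in_range(s: str, lo: str, hi: str, key) -> bool:
    """True if lo <= s < hi under the ordering defined by `key`."""
    return key(lo) <= key(s) < key(hi)


# 'á' is \341 in Latin-1; incrementing it gives 'â' (\342).
lo, hi = naive_bounds("\u00e1")
candidate = "\u00e1z"  # starts with the prefix, so LIKE 'á%' should match it

# Plain code-point ordering: the candidate falls inside the bounds.
codepoint_hit = in_range(candidate, lo, hi, key=lambda s: s)

# Toy accent-as-tie-breaker ordering: 'áz' compares like 'az', which sorts
# after 'â' (compares like 'a'), so the candidate falls outside the bounds.
collated_hit = in_range(candidate, lo, hi, key=toy_key)

print(codepoint_hit, collated_hit)
```

Under this toy ordering the full-string comparison is still well defined for every pair of strings; it is only the derived upper bound that stops being an upper bound.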