Re: 7.3 no longer using indexes for LIKE queries
От | Matthew Gabeler-Lee |
---|---|
Тема | Re: 7.3 no longer using indexes for LIKE queries |
Дата | |
Msg-id | ABABFB80F35AD311848B0090279918EF010B9B66@ZYCOSNT2.hq.zycos.com обсуждение исходный текст |
Ответ на | 7.3 no longer using indexes for LIKE queries (Matthew Gabeler-Lee <mgabelerlee@zycos.com>) |
Список | pgsql-general |
What I was referring to there was the idea that when a user searched for 'cote', you'd transform that into a condition along the lines of "WHERE foldaccents(textcol) LIKE 'cote%'" or such so that failing to enter accents wouldn't make the query not hit, which is especially desirable for someone who wants to query something with accents, but can't type them easily on their keyboard. In the case of optimizing the LIKE operator, if you ask for txtcol LIKE 'CÔTE%', then (assuming the charset is 'sane'), it would do an index scan with a filter something along the lines of (txtcol >= 'CÔTE' AND txtcol <'CÔTF') before applying the LIKE operator. Assuming the locale is set right in the database, the indexing stuff will know how to colate CÔTE vs. COTÉ; that's the whole point of setting the COLLATE thing, so that the index *will* have them in the correct order. My point was mostly that expecting txtcol LIKE 'CÔTE%' to match 'COTÉ' is probably bad logic. Now, if there is a good argument why that isn't so, I'm all ears, but it seems to me that 'CÔTE' and 'COTÉ' are different words and thus shouldn't match. -Matt -----Original Message----- From: Jean-Luc Lachance [mailto:jllachan@nsd.ca] Sent: Wednesday, December 04, 2002 15:25 To: Matthew Gabeler-Lee Cc: 'pgsql-general@postgresql.org' Subject: Re: [GENERAL] 7.3 no longer using indexes for LIKE queries C_COLLATE is what is involved with accents. How would you sort: CÔTE CÔTES COTÉ COTÉS COTE You can't fold accented character into non accented because unique index would barf on CÔTE if COTE is already in. You still need to know if 'CÔTE' < 'COTÉ' or not when you do a sort. Collating in french, for example, is not a byte to byte compare. If you compare words based only on their binary representation, the sort will be wrong CRIME before CÔTE. JLL Matthew Gabeler-Lee wrote: > 4) accent folding; I'm not entirely sure like is supposed to do this. I'm > going to pretend for the rest of this that the like operator shouldn't fold > accented characters. > > [...] > > It seems to me that the most common place one wants to think about this is > in full text searching and the like. In this case, maybe I'm daft, but > perhaps the thing to do is to create a functional index, where the function > being indexed strips all the accents off characters. > > Does the SQL spec have anything to say on accent folding in comparisons?
В списке pgsql-general по дате отправления: