Re: LIKE search and performance
От | Richard Huxton |
---|---|
Тема | Re: LIKE search and performance |
Дата | |
Msg-id | 4657273A.3080005@archonet.com обсуждение исходный текст |
Ответ на | Re: LIKE search and performance (Gregory Stark <stark@enterprisedb.com>) |
Список | pgsql-performance |
Gregory Stark wrote: > "Richard Huxton" <dev@archonet.com> writes: > >> Now you and I can look at a substring and probably make a good guess how common >> it is (assuming we know the targets are British surnames or Japanese towns). PG >> needs one number - or rather, it picks one number for each length of >> search-string (afaik). > > I don't think that's true. Postgres calculates the lower and upper bound > implied by the search pattern and then uses the histogram to estimate how > selective that range is. It's sometimes surprisingly good but obviously it's > not perfect. Sorry - I'm obviously picking my words badly today. I meant for the "contains" substring match. It gives different (goes away and checks...yes) predictions based on string length. So it guesses that LIKE '%aaa%' will match more than LIKE '%aaaa%'. Of course, if we were matching surnames you and I could say that this is very unlikely, but without some big statistics table I guess there's not much more PG can do. For a trailing wildcard LIKE 'aaa%' it can and does as you say convert this into something along the lines of (>= 'aaa' AND < 'aab'). Although IIRC that depends if your locale allows such (not sure, I don't really use non-C/non-English locales enough). -- Richard Huxton Archonet Ltd
В списке pgsql-performance по дате отправления: