Re: Are statistics gathered on function indexes?
От | Ray Ontko |
---|---|
Тема | Re: Are statistics gathered on function indexes? |
Дата | |
Msg-id | 200206281240.HAA03045@shire.ontko.com обсуждение исходный текст |
Ответ на | Re: Are statistics gathered on function indexes? (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Are statistics gathered on function indexes?
|
Список | pgsql-admin |
Tom, > > In other words, if I only give one letter, then I might expect > > to get about 1/10 of the table, and a full scan might make sense. > > But the cost should continue to decline as I give longer and longer > > strings, up to the length of the field. > > Would this be a reasonable improvement to the optimizer? > > It's there already; what did you think was making the difference > between W% and WI% ? Yes, but the cost doesn't continue to decline if I make the LIKE more and more restrictive by going from WI% to WIL% to WILL%, etc. The current approach assumes, perhaps correctly, that with only one letter, you might as well do a full table scan, but with 2 or more letters, you might as well use an index and make a reasonable guess at the cost. The limitation with this approach is that the optimizer is considering a number of different options if the join includes a number of tables and constraints. My query suffers from a one-size-fits-all approach. If the optimizer had a better guess on cost, it could choose to drive my query using this index instead of another. By allowing the cost to decline as the length of the string increases, we're making the bet that longer strings are more selective and require fewer random pages to be read, among other improvements. Note that this would behave badly in situations where all the values in the index begin with the same 10 characters and the LIKE string is less than 10 characters long. This is already a problem for the current approach, and I think could only be solved by adding statistics for the selectivity of increasingly longer strings. BTW, I think that this discussion probably belongs in a different list (e.g., the one for hacking the optimizer). Since I'm not (yet) planning to jump in to the code, my purpose for raising the question here is to help me (and others on the list) understand the capabilities and limitations of indexes and the optimizer so that we can make better use of what we have currently, and help identify areas for improvement. Ray ---------------------------------------------------------------------- Ray Ontko rayo@ontko.com Phone 1.765.935.4283 Fax 1.765.962.9788 Ray Ontko & Co. Software Consulting Services http://www.ontko.com/
В списке pgsql-admin по дате отправления: