Re: RFC: planner statistics in 7.2
От | Tom Lane |
---|---|
Тема | Re: RFC: planner statistics in 7.2 |
Дата | |
Msg-id | 12651.988035033@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: RFC: planner statistics in 7.2 (Philip Warner <pjw@rhyme.com.au>) |
Ответы |
Re: RFC: planner statistics in 7.2
|
Список | pgsql-hackers |
Philip Warner <pjw@rhyme.com.au> writes: >> All that we're discussing here is one specific parameter in the cost >> estimation for an indexscan, viz, the extent to which the table ordering >> agrees with the index ordering. > This does not necessarily follow. A table ordering need not follow the sort > order of an index for the index to have a low indexscan cost. All that is > required is that most of the rows referred to by an index node must reside > in a page or pages that will be read by one IO. eg. a table that has a > sequence based ID, with, say 20% of rows updated, will work nicely with an > indexscan on the ID, even though it has never been clustered. Right, what matters is the extent of correlation between table ordering and index ordering, not how it got to be that way. > What I'm suggesting is that if you look at a random sample of index nodes, > you should be able to get a statistically valid estimate of the 'clumping' > of the data pointed to by the index. And I'm saying that you don't actually have to look at the index in order to compute the very same estimate. The only property of the index that matters is its sort order; if you assume you know the right sort order (and in practice there's usually only one interesting possibility for a column) then you can compute the correlation just by looking at the table. Andreas correctly points out that this approach doesn't extend very well to multi-column or functional indexes, but I'm willing to punt on those for the time being ... regards, tom lane
В списке pgsql-hackers по дате отправления: