question about index cost estimates
From | Jeff Hoffmann |
---|---|
Subject | question about index cost estimates |
Date | |
Msg-id | 3922F6C6.63483AF1@propertykey.com |
Responses | Re: question about index cost estimates
RE: question about index cost estimates |
List | pgsql-hackers |
i know tom lane is probably sick of me, but i'm trying to figure out how the cost estimators work for an index scan. i'm logging a lot of the calculations that go into the cost estimate for some sample queries to see which factors matter most. in the queries i'm running, the vast majority of the cost comes from reading the tuples from the relation. the cost of nonsequential access seems reasonable (it's about 2), but the calculation of pages fetched from the relation seems high. i'm not sure what the actual number should be, but some queries have it at 2-4x the number of pages in the relation, which seemed high, so i started looking at that. i don't understand why the function that is used would be a good model of what is actually happening. here's the comment & the function:

 * Estimate number of main-table tuples and pages fetched.
 *
 * If the number of tuples is much smaller than the number of pages in
 * the relation, each tuple will cost a separate nonsequential fetch.
 * If it is comparable or larger, then probably we will be able to
 * avoid some fetches.  We use a growth rate of log(#tuples/#pages +
 * 1) --- probably totally bogus, but intuitively it gives the right
 * shape of curve at least.

    pages_fetched = ceil(baserel->pages *
                         log(tuples_fetched / baserel->pages + 1.0));

i'm at a total loss to explain how this works. for all i know, it's correct and index scans really are that costly. it just seems to me that there should be some notion of tuple size figured in to know how many tuples fit in a page. can somebody explain it to me?

thanks,

jeff
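to make the shape of that curve concrete, here's a standalone sketch (my own test program, not backend code) that plugs some made-up numbers into the same formula -- the 1000-page relation and the tuples_fetched values are just assumptions for illustration:

#include <math.h>
#include <stdio.h>

/*
 * standalone sketch, not backend code: evaluate the pages_fetched
 * formula for a hypothetical 1000-page relation at a few made-up
 * tuples_fetched values, to see how fast the estimate grows.
 */
int
main(void)
{
    double      pages = 1000.0;
    double      tuples[] = {100.0, 1000.0, 10000.0, 100000.0};
    int         i;

    for (i = 0; i < 4; i++)
    {
        double      pages_fetched;

        pages_fetched = ceil(pages * log(tuples[i] / pages + 1.0));
        printf("tuples_fetched = %6.0f -> pages_fetched = %5.0f (%.1fx pages)\n",
               tuples[i], pages_fetched, pages_fetched / pages);
    }
    return 0;
}

that prints estimates of roughly 0.1x, 0.7x, 2.4x, and 4.6x the pages in the relation. so once tuples_fetched passes about 1.7x the page count (the point where log(#tuples/#pages + 1) = 1), the estimate exceeds the relation size, and at 10x the page count it's already the 2.4x i'm seeing in my queries.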