Re: Sort and index
От | Jim C. Nasby |
---|---|
Тема | Re: Sort and index |
Дата | |
Msg-id | 20050511211516.GX31103@decibel.org обсуждение исходный текст |
Ответ на | Re: Sort and index (Manfred Koizar <mkoi-pg@aon.at>) |
Ответы |
Re: Sort and index
|
Список | pgsql-performance |
First, I've got some updated numbers up at http://stats.distributed.net/~decibel/ timing2.log shows that the planner actually under-estimates an index scan by several orders of magnitude. Granted, random_page_cost is set to an unrealistic 1.1 (otherwise I can't force the index scan), but that alone isn't enough to explain the difference. On Wed, May 11, 2005 at 05:59:10PM +0200, Manfred Koizar wrote: > On Sun, 24 Apr 2005 17:01:46 -0500, "Jim C. Nasby" <decibel@decibel.org> > wrote: > >> >> Feel free to propose better cost equations. > > I did. More than once. > > >estimated index scan cost for (project_id, id, date) is > >0.00..100117429.34 while the estimate for work_units is > >0.00..103168408.62; almost no difference, > > ~3% > > > even though project_id correlation is .657 > > This is divided by the number of index columns, so the index correlation > is estimated to be 0.219. That seems like a pretty bad assumption to make. Is there any eta on having statistics for multi-column indexes? > >you'll see that the cost of the index scan is way overestimated. Looking > >at the code, the runcost is calculated as > > > > run_cost += max_IO_cost + csquared * (min_IO_cost - max_IO_cost); > > > >where csquared is indexCorrelation^2. Why is indexCorrelation squared? > >The comments say a linear interpolation between min_IO and max_IO is > >used, but ISTM that if it was linear then instead of csquared, > >indexCorrelation would just be used. > > In my tests I got much more plausible results with > > 1 - (1 - abs(correlation))^2 What's the theory behind that? And I'd still like to know why correlation squared is used. > Jim, are you willing to experiment with one or two small patches of > mine? What version of Postgres are you running? It depends on the patches, since this is a production machine. Currently it's running 7.4.*mumble*, though I need to upgrade to 8, which I was intending to do via slony. Perhaps the best thing would be for me to get that setup and we can experiment against version 8.0.3. -- Jim C. Nasby, Database Consultant decibel@decibel.org Give your computer some brain candy! www.distributed.net Team #1828 Windows: "Where do you want to go today?" Linux: "Where do you want to go tomorrow?" FreeBSD: "Are you guys coming, or what?"
В списке pgsql-performance по дате отправления: