Re: query slows down with more accurate stats
От | Tom Lane |
---|---|
Тема | Re: query slows down with more accurate stats |
Дата | |
Msg-id | 25556.1081883922@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | query slows down with more accurate stats (Robert Treat <xzilla@users.sourceforge.net>) |
Ответы |
Re: query slows down with more accurate stats
Re: query slows down with more accurate stats |
Список | pgsql-performance |
Robert Treat <xzilla@users.sourceforge.net> writes: > live=# analyze cl; > ANALYZE > live=# select reltuples from pg_class where relname = 'cl'; > reltuples > ----------- > 53580 > (1 row) > live=# vacuum cl; > VACUUM > live=# select reltuples from pg_class where relname = 'cl'; > reltuples > ------------- > 1.14017e+06 > (1 row) Well, the first problem is why is ANALYZE's estimate of the total row count so bad :-( ? I suspect you are running into the situation where the initial pages of the table are thinly populated and ANALYZE mistakenly assumes the rest are too. Manfred is working on a revised sampling method for ANALYZE that should fix this problem in 7.5 and beyond, but for now it seems like a VACUUM FULL might be in order. > so i guess i am wondering if there is something I should be doing to > help get the better plan at the more accurate stats levels and/or why it > doesn't stick with the original plan (I noticed disabling merge joins > does seem to push it back to the original plan). With the larger number of estimated rows it's figuring the nestloop will be too expensive. The row estimate for the cl scan went up from 1248 to 10546, so the estimated cost for the nestloop plan would go to about 240000 units vs 80000 for the mergejoin plan. This is obviously off rather badly when the true runtimes are 1.7 vs 8.1 seconds :-(. I think this is an example of a case where we really need better estimation of nestloop costs --- it's drastically overestimating the relative cost of the nestloop because it's not accounting for the cache benefits of the repeated index searches. You could probably force the nestloop to be chosen by lowering random_page_cost, but that's just a kluge solution ... the real problem is the model is wrong. I have a to-do item to work on this, and will try to bump up its priority a bit. regards, tom lane
В списке pgsql-performance по дате отправления: