Re: Optimizer improvements: to do or not to do?
From | Gregory Stark |
---|---|
Subject | Re: Optimizer improvements: to do or not to do? |
Date | |
Msg-id | 877j07txmo.fsf@enterprisedb.com |
In reply to | Re: Optimizer improvements: to do or not to do? (Ron Mayer <rm_pg@cheapcomplexdevices.com>) |
Responses |
Re: Optimizer improvements: to do or not to do?
Re: Optimizer improvements: to do or not to do? |
List | pgsql-hackers |
Ron Mayer <rm_pg@cheapcomplexdevices.com> writes:

> It's common here for queries to vastly overestimate the
> number of pages that would need to be read because
> postgresql's guess at the correlation being practically 0
> despite the fact that the distinct values for any given
> column are closely packed on a few pages.

I think we need a serious statistics jock to pipe up with some standard metrics that do what we need. Otherwise we'll never have a solid footing for the predictions we make and will never know how much we can trust them.

That said, I'm now going to do exactly what I just said we should stop doing and brainstorm about an ad-hoc metric that might help:

I wonder if what we need is something like: sort the sampled values by value and count up the average number of distinct blocks per value. That might let us predict how many pages a fetch of a specific value would retrieve. Or perhaps we need a second histogram where the quantities are of distinct pages rather than total records.

We might also need a separate "average number of n-block spans per value" metric to predict how sequential the i/o will be in addition to how many pages will be fetched.

-- 
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
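
To make the two ad-hoc metrics above concrete, here is a rough Python sketch. It is not PostgreSQL code and not anything ANALYZE does; the function name `per_value_block_stats` and the input shape (a list of `(value, block_number)` pairs, one per sampled row) are assumptions made up for illustration. It also assumes that a "span" means a run of consecutive block numbers holding the same value, which is only one possible reading of the "n-block spans" idea.

```python
from collections import defaultdict


def per_value_block_stats(sample):
    """Return (avg distinct blocks per value, avg block spans per value).

    `sample` is a hypothetical list of (value, block_number) pairs,
    one pair per sampled row. A "span" is taken here to mean a run of
    consecutive block numbers, which is an assumption.
    """
    # Group the sampled rows by value, keeping only distinct blocks.
    blocks_by_value = defaultdict(set)
    for value, block in sample:
        blocks_by_value[value].add(block)

    if not blocks_by_value:
        return 0.0, 0.0

    total_blocks = 0
    total_spans = 0
    for blocks in blocks_by_value.values():
        ordered = sorted(blocks)
        total_blocks += len(ordered)
        # Every gap between neighbouring block numbers starts a new span.
        gaps = sum(1 for a, b in zip(ordered, ordered[1:]) if b != a + 1)
        total_spans += 1 + gaps

    n_values = len(blocks_by_value)
    return total_blocks / n_values, total_spans / n_values


# Each value is packed onto two adjacent blocks.
clustered = [(1, 0), (1, 0), (1, 1), (2, 1), (2, 2), (2, 2)]
# Each value is spread over three blocks, mostly non-adjacent.
scattered = [(1, 0), (1, 5), (1, 9), (2, 2), (2, 3), (2, 7)]

print(per_value_block_stats(clustered))
print(per_value_block_stats(scattered))
```

A low blocks-per-value figure would suggest that fetching one value touches few pages even when the overall correlation is near 0, and a spans-per-value figure close to 1 would suggest that those pages are mostly contiguous. Whether either number is statistically sound when computed from a sample is exactly the open question raised above.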