Re: [PERFORM] Bad n_distinct estimation; hacks suggested?
From | Markus Schaber |
---|---|
Subject | Re: [PERFORM] Bad n_distinct estimation; hacks suggested? |
Date | |
Msg-id | 4277774F.7040205@logix-tt.com |
In response to | Re: [PERFORM] Bad n_distinct estimation; hacks suggested? (Josh Berkus <josh@agliodbs.com>) |
Responses | Re: [PERFORM] Bad n_distinct estimation; hacks suggested? |
List | pgsql-hackers |
Hi, Josh,

Josh Berkus wrote:
> Yes, actually. We need 3 different estimation methods:
> 1 for tables where we can sample a large % of pages (say, >= 0.1)
> 1 for tables where we sample a small % of pages but are "easily estimated"
> 1 for tables which are not easily estimated but we can't afford to sample a
> large % of pages.
>
> If we're doing sampling-based estimation, I really don't want people to lose
> sight of the fact that page-based random sampling is much less expensive than
> row-based random sampling. We should really be focusing on methods which
> are page-based.

Would it make sense to have a sample method that scans indices? I think that, at least for tree-based indices (btree, gist), rather good estimates could be derived.

And the presence of a unique index should lead to a 100% distinct values estimation without any scan at all.

Markus
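For reference, here is a minimal sketch of the kind of sample-based n_distinct estimate being discussed, using a Duj1-style estimator in the spirit of Haas and Stokes (which is, as far as I know, close to what analyze.c does). The function name and example figures below are illustrative only, not PostgreSQL code:

# Sketch of a Duj1-style n_distinct estimator (Haas & Stokes).
# Illustrative only; not PostgreSQL's actual analyze.c implementation.
from collections import Counter

def estimate_ndistinct(sample, total_rows):
    """Estimate the number of distinct values in a column of
    total_rows rows, given a random row sample."""
    n = len(sample)                  # sample size
    counts = Counter(sample)
    d = len(counts)                  # distinct values seen in the sample
    f1 = sum(1 for c in counts.values() if c == 1)  # values seen exactly once
    if n == 0 or d == 0:
        return 0
    if f1 == n:
        # Every sampled value was unique: treat the column as fully
        # distinct, analogous to the "unique index => 100% distinct"
        # shortcut suggested above.
        return total_rows
    # Duj1 estimator: n*d / (n - f1 + f1*n/N)
    return int(n * d / (n - f1 + f1 * n / total_rows))

# Example: a 1000-row sample drawn from a 1,000,000-row table
# print(estimate_ndistinct(sample_values, 1_000_000))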