Re: Odd statistics behaviour in 7.2
From | Tom Lane |
---|---|
Subject | Re: Odd statistics behaviour in 7.2 |
Date | |
Msg-id | 20368.1013882239@sss.pgh.pa.us |
In response to | Re: Odd statistics behaviour in 7.2 ("Gordon A. Runkle" <gar@integrated-dynamics.com>) |
List | pgsql-hackers |
BTW, while we're thinking about this, there's another aspect of the number-of-distinct-values estimator that could use some peer review. That's the decision whether to assume that the number of distinct values in a column is fixed, or will vary with the size of the table. (For example, in a boolean column, ndistinct should clearly be 2 no matter how large the table gets; but in any unique column ndistinct should equal the table size.)

This is important since there are times when we update the table size estimate (pg_class.reltuples) without recomputing the statistics in pg_statistic. The "negative stadistinct" convention in pg_statistic is used to signal which case ANALYZE thinks applies.

Presently the decision is pretty simplistic: if the estimated number of distinct values is more than 10% of the number of rows, then assume the number of distinct values scales with the number of rows. I believe that some rule of this form is reasonable, but the 10% threshold was just picked out of the air. Can anyone suggest an argument in favor of some other value, or a better way to look at it?

regards, tom lane
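
The threshold rule and the "negative stadistinct" convention described above can be illustrated with a small sketch. This is not the ANALYZE source code: the function names `encode_stadistinct` and `decode_ndistinct` and the `threshold` parameter are hypothetical, chosen only for illustration. It assumes a positive stored value means a fixed count and a negative value means the negated fraction of the table's rows.

```python
def encode_stadistinct(est_distinct: float, total_rows: float,
                       threshold: float = 0.10) -> float:
    """Hypothetical helper: pick a stadistinct-style value.

    If the estimated distinct count exceeds `threshold` of the rows,
    assume it scales with table size and store it as a negative
    fraction; otherwise store it as a fixed positive count.
    """
    if total_rows <= 0:
        # No rows to reason about; this sketch returns 0 for "unknown".
        return 0.0
    if est_distinct > threshold * total_rows:
        return -(est_distinct / total_rows)
    return float(est_distinct)


def decode_ndistinct(stadistinct: float, reltuples: float) -> float:
    """Hypothetical helper: recover ndistinct from the stored value
    and the current row-count estimate."""
    if stadistinct < 0:
        # Negative: a multiplier on the (possibly updated) row count.
        return -stadistinct * reltuples
    # Positive: a fixed count, independent of table size.
    return stadistinct


# Boolean-like column: 2 distinct values in 1000 rows stays fixed at 2.
flag = encode_stadistinct(2, 1000)
# Unique column: 1000 distinct values in 1000 rows scales with the table.
key = encode_stadistinct(1000, 1000)

# Later the row estimate grows to 5000 without statistics being recomputed.
print(decode_ndistinct(flag, 5000))
print(decode_ndistinct(key, 5000))
```

With this encoding, an updated row estimate changes the derived ndistinct only for columns judged to scale with the table, which is why the choice of threshold matters.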