Re: ANALYZE sampling is too good
From | Albe Laurenz |
---|---|
Subject | Re: ANALYZE sampling is too good |
Date | |
Msg-id | A737B7A37273E048B164557ADEF4A58B17C7DCEF@ntex2010i.host.magwien.gv.at |
In reply to | Re: ANALYZE sampling is too good (Greg Stark <stark@mit.edu>) |
Responses | Re: ANALYZE sampling is too good |
List | pgsql-hackers |
Greg Stark wrote:
>> It's also applicable for the other stats; histogram buckets constructed
>> from a 5% sample are more likely to be accurate than those constructed
>> from a 0.1% sample. Same with nullfrac. The degree of improved
>> accuracy, would, of course, require some math to determine.
>
> This "some math" is straightforward basic statistics. The 95th
> percentile confidence interval for a sample consisting of 300 samples
> from a population of a 1 million would be 5.66%. A sample consisting
> of 1000 samples would have a 95th percentile confidence interval of
> +/- 3.1%.

Doesn't all that assume a normally distributed random variable?
I don't think it can be applied to database table contents
without further analysis.

Yours,
Laurenz Albe
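
For reference, the quoted figures can be reproduced with the usual margin-of-error formula for a sample proportion. This is only a sketch under assumptions not stated in the thread: a worst-case proportion p = 0.5, z = 1.96 for 95% confidence, and a finite population correction. The function name `margin_of_error` is made up for illustration. The formula rests on the normal approximation to the sampling distribution of the proportion, so it says nothing by itself about how well it fits actual table contents.

```python
import math


def margin_of_error(n, population=None, p=0.5, z=1.96):
    """Half-width of the normal-approximation confidence interval
    for a proportion estimated from a simple random sample of size n.

    p = 0.5 (worst case) and z = 1.96 (95% confidence) are assumptions.
    """
    # Standard error of a sample proportion.
    se = math.sqrt(p * (1.0 - p) / n)
    if population is not None:
        # Finite population correction for sampling without replacement.
        se *= math.sqrt((population - n) / (population - 1))
    return z * se


if __name__ == "__main__":
    for n in (300, 1000):
        moe = margin_of_error(n, population=1_000_000)
        print(f"n = {n}: +/- {100 * moe:.2f}%")
```

With these assumptions the results round to about 5.66% for 300 rows and about 3.1% for 1000 rows, which matches the numbers quoted above.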