Re: Simple postgresql.conf wizard
От | Gregory Stark |
---|---|
Тема | Re: Simple postgresql.conf wizard |
Дата | |
Msg-id | 87wseqppji.fsf@oxford.xeocode.com обсуждение исходный текст |
Ответ на | Re: Simple postgresql.conf wizard (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Simple postgresql.conf wizard
|
Список | pgsql-hackers |
Tom Lane <tgl@sss.pgh.pa.us> writes: > "Dann Corbit" <DCorbit@connx.com> writes: >> I also do not believe that there is any value that will be the right >> answer. But a table of data might be useful both for people who want to >> toy with altering the values and also for those who want to set the >> defaults. I guess that at one time such a table was generated to >> produce the initial estimates for default values. > > Sir, you credit us too much :-(. The actual story is that the current > default of 10 was put in when we first implemented stats histograms, > replacing code that kept track of only a *single* most common value > (and not very well, at that). So it was already a factor of 10 more > stats than we had experience with keeping, and accordingly conservatism > suggested not boosting the default much past that. I think that's actually too little credit. The sample size is chosen quite carefully based on solid mathematics to provide a specific confidence interval estimate for queries covering ranges the size of a whole bucket. The actual number of buckets more of an arbitrary choice. It depends entirely on how your data is distributed and how large a range your queries are covering. A uniformly distributed data set should only need a single bucket to generate good estimates. Less evenly distributed data sets need more. I wonder actually if there are algorithms for estimating the number of buckets needed for a histogram to achieve some measurable goal. That would close the loop. It would be much more reassuring to base the size of the sample on solid statistics than on hunches. > So we really don't have any methodically-gathered evidence about the > effects of different stats settings. It wouldn't take a lot to convince > us to switch to a different default, I think, but it would be nice to > have more than none. I think the difficulty (aside from testing being laborious at the best of times) is that it's heavily dependent on data sets which are hard to generate good examples for. Offhand I would think the census data might make a good starting point -- it should have columns which range from perfectly uniform to highly skewed. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning
В списке pgsql-hackers по дате отправления: