Re: Why won't it index scan?
From | Peter Kovacs
Subject | Re: Why won't it index scan?
Date |
Msg-id | 446C19A7.5000104@chemaxon.hu
In reply to | Re: Why won't it index scan? (Greg Stark <gsstark@mit.edu>)
Responses | Re: Why won't it index scan?
List | pgsql-general
Sorry for the naive question, but: is there a problem with analyze doing
full table scans? Analyze will not lock anything, will it?

Peter

Greg Stark wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>
>> "Ed L." <pgsql@bluepolka.net> writes:
>>
>>> So, does this sound like we just happened to get repeatedly
>>> horribly unrepresentative random samples with stats target at
>>> 10? Are we at the mercy of randomness here? Or is there a
>>> better preventive procedure we can follow to systematically
>>> identify this kind of situation?
>>
>> I think the real issue is that stats target 10 is too small for large
>> tables: the samples are just not large enough to support a decent
>> numdistinct estimate, which is the critical stat for cases such as this
>> (ie, estimating the number of hits on a value that's not in the
>> most-common-values list).
>
> There's been some discussion on -hackers about this area. Sadly the idea of
> using samples to calculate numdistinct estimates is fundamentally on pretty
> shaky ground.
>
> Whereas a fixed sample size works fine for calculating the distribution of
> values, in order to generate consistent precision for numdistinct estimates
> the samples would have to be a constant fraction of the table -- and
> unfortunately a pretty large fraction at that.
>
> So sadly I think "at the mercy of randomness" is pretty accurate. You'll have
> to raise the statistics target as the table grows, and I expect you'll
> eventually run into some downsides of large stats targets.
>
> Some better algorithms were posted, but they would require full table scans
> during analyze, not just samples.
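The point about fixed-size samples being on shaky ground for numdistinct can be illustrated with a small simulation (a sketch, not PostgreSQL's actual estimator: the table layout, sample size, and 10-rows-per-value ratio are all assumptions chosen for the demo). A fixed sample sees nearly all distinct values in a small table, but only a tiny fraction of them once the table, and its distinct count, grows:

```python
import random

def observed_distinct(table_size, n_distinct, sample_size, seed=0):
    """Build a table whose values cycle through n_distinct IDs
    (so each value occurs table_size / n_distinct times), draw a
    fixed-size random sample, and count the distinct values seen."""
    rng = random.Random(seed)
    table = [i % n_distinct for i in range(table_size)]
    sample = rng.sample(table, sample_size)
    return len(set(sample))

# A fixed sample (like a fixed stats target) against growing tables
# whose distinct count grows with them: the fraction of distinct
# values the sample can ever see collapses.
for size in (10_000, 100_000, 1_000_000):
    n_distinct = size // 10  # assume 10 rows per value
    seen = observed_distinct(size, n_distinct, sample_size=3_000)
    print(f"rows={size:>9}  distinct={n_distinct:>7}  "
          f"seen in sample={seen}  ({seen / n_distinct:.1%})")
```

The same sample of 3,000 rows covers most of 1,000 distinct values but only a few percent of 100,000, which is why any fixed-size sample yields inconsistent numdistinct precision as tables grow. The knob Greg refers to is the per-column statistics target, e.g. `ALTER TABLE t ALTER COLUMN c SET STATISTICS 1000;` followed by `ANALYZE t;`.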