Re: Improving N-Distinct estimation by ANALYZE
From | Greg Stark |
---|---|
Subject | Re: Improving N-Distinct estimation by ANALYZE |
Date | |
Msg-id | 87irsy4t1o.fsf@stark.xeocode.com |
In response to | Re: Improving N-Distinct estimation by ANALYZE (Josh Berkus <josh@agliodbs.com>) |
Responses |
Re: Improving N-Distinct estimation by ANALYZE
|
List | pgsql-hackers |
Josh Berkus <josh@agliodbs.com> writes:

> > Only if your sample is random and independent. The existing mechanism tries
> > fairly hard to ensure that every record has an equal chance of being selected.
> > If you read the entire block and not appropriate samples then you'll introduce
> > systematic sampling errors. For example, if you read an entire block you'll be
> > biasing towards smaller records.
>
> Did you read any of the papers on block-based sampling? These sorts of issues
> are specifically addressed in the algorithms.

We *currently* use a block based sampling algorithm that addresses this issue by
taking care to select rows within the selected blocks in an unbiased way. You
were proposing reading *all* the records from the selected blocks, which throws
away that feature.

--
greg
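For readers following the thread, here is a rough sketch of the two approaches being contrasted. It is not PostgreSQL's code: the toy table layout, the function names, and the use of the simple reservoir method (Algorithm R) for the row stage are all assumptions made for illustration. Stage one picks blocks at random; stage two either keeps a fixed-size sample in which every row of the visited blocks is equally likely, or simply keeps every row it reads.

```python
import random


def make_table(n_blocks, rng):
    """Toy table (hypothetical layout): block b holds 1..50 rows, each row is (block_no, offset)."""
    return [[(b, i) for i in range(rng.randint(1, 50))] for b in range(n_blocks)]


def sample_blocks(n_blocks, n_pick, rng):
    """Stage one: choose distinct block numbers uniformly at random."""
    return sorted(rng.sample(range(n_blocks), min(n_pick, n_blocks)))


def reservoir_rows(table, block_ids, target_rows, rng):
    """Stage two: Algorithm R over the rows of the chosen blocks.

    Every row in the visited blocks ends up in the result with the same
    probability, however many rows its block happens to hold.
    """
    reservoir = []
    seen = 0
    for b in block_ids:
        for row in table[b]:
            seen += 1
            if len(reservoir) < target_rows:
                reservoir.append(row)
            else:
                j = rng.randrange(seen)  # uniform in [0, seen)
                if j < target_rows:
                    reservoir[j] = row
    return reservoir


def all_rows(table, block_ids):
    """The alternative under discussion: keep every row of the chosen blocks."""
    return [row for b in block_ids for row in table[b]]


if __name__ == "__main__":
    rng = random.Random(0)
    table = make_table(200, rng)
    blocks = sample_blocks(len(table), 30, rng)
    print(len(reservoir_rows(table, blocks, 30, rng)), "rows kept by the reservoir")
    print(len(all_rows(table, blocks)), "rows kept by reading whole blocks")
```

With the reservoir, the sample size is fixed and independent of how densely the chosen blocks are packed. With whole-block reads, the number of rows kept depends on row density, and the rows arrive in per-block clusters rather than as individually selected records.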