Re: Improving N-Distinct estimation by ANALYZE

From	Greg Stark
Subject	Re: Improving N-Distinct estimation by ANALYZE
Date	5 January 2006, 11:02:18
Msg-id	87irsy4t1o.fsf@stark.xeocode.com
In reply to	Re: Improving N-Distinct estimation by ANALYZE (Josh Berkus <josh@agliodbs.com>)
Responses	Re: Improving N-Distinct estimation by ANALYZE
List	pgsql-hackers

Josh Berkus <josh@agliodbs.com> writes:

> > Only if your sample is random and independent. The existing mechanism tries
> > fairly hard to ensure that every record has an equal chance of being selected.
> > If you read the entire block and not appropriate samples then you'll introduce
> > systematic sampling errors. For example, if you read an entire block you'll be
> > biasing towards smaller records.
> 
> Did you read any of the papers on block-based sampling?   These sorts of issues
> are specifically addressed in the algorithms.

We *currently* use a block-based sampling algorithm that addresses this issue
by taking care to select rows within the selected blocks in an unbiased way.
You were proposing reading *all* the records from the selected blocks, which
throws away that feature.
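
To make the distinction concrete, here is a minimal Python sketch of the two
approaches. It is an illustration only, not the actual ANALYZE code: the
function names, the list-of-lists "table", and the use of plain reservoir
sampling (Algorithm R) for the row stage are all assumptions made for the
example.

```python
import random


def sample_blocks(num_blocks, num_to_pick, rng):
    """Stage 1: pick block numbers uniformly at random, without replacement."""
    return sorted(rng.sample(range(num_blocks), min(num_to_pick, num_blocks)))


def reservoir_sample_rows(blocks, block_ids, target_rows, rng):
    """Stage 2: reservoir-sample rows across the chosen blocks (Algorithm R).

    Every row in the chosen blocks has the same chance of ending up in the
    reservoir, no matter how many rows its block holds.
    """
    reservoir = []
    seen = 0
    for block_id in block_ids:
        for row in blocks[block_id]:
            seen += 1
            if len(reservoir) < target_rows:
                reservoir.append(row)
            else:
                # Keep the new row with probability target_rows / seen.
                j = rng.randrange(seen)
                if j < target_rows:
                    reservoir[j] = row
    return reservoir


def read_whole_blocks(blocks, block_ids):
    """The proposed alternative: keep every row from every chosen block."""
    return [row for block_id in block_ids for row in blocks[block_id]]


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical table: 1000 blocks holding between 1 and 5 rows each.
    table = [[(b, i) for i in range(1 + b % 5)] for b in range(1000)]
    chosen = sample_blocks(len(table), 100, rng)
    print(len(reservoir_sample_rows(table, chosen, 30, rng)), "rows sampled")
    print(len(read_whole_blocks(table, chosen)), "rows read from whole blocks")
```

In the second approach the sample is fixed entirely by which blocks were
picked, so there is no row-level selection step left in which to correct for
anything.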

-- 
greg
