Re: strange row count estimates with conditions on multiple column

From	Tom Lane
Subject	Re: strange row count estimates with conditions on multiple column
Date	17 November 2010, 01:58:50
Msg-id	16724.1289973519@sss.pgh.pa.us
In reply to	Re: strange row count estimates with conditions on multiple column (Tomas Vondra <tv@fuzzy.cz>)
Responses	Re: strange row count estimates with conditions on multiple column
List	pgsql-general

Tomas Vondra <tv@fuzzy.cz> writes:
> Yes, I understand why MCV is not used in case of col_b, and I do
> understand that the estimate may not be precise. But I'm wondering
> what's a better estimate in such cases - 1, 5000, any constant, or
> something related to the histogram?

It is doing it off the histogram.  The logic is actually quite good
I think for cases where the data granularity is small compared to the
histogram bucket width.  For cases like we have here, the assumption
of a continuous distribution fails rather badly --- but it's pretty
hard to see how to improve it without inserting a lot of type-specific
assumptions.
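
To make that failure mode concrete, here is a minimal sketch of a
histogram-based range estimate that interpolates linearly inside each
bucket, i.e. it assumes a continuous distribution.  It is a simplified
illustration, not the code in selfuncs.c; the function names, the
synthetic data, and the bucket counts are all assumptions chosen for
the example.

```python
from bisect import bisect_right

def build_histogram(values, buckets):
    """Equi-depth histogram: boundaries at evenly spaced positions of the sorted data."""
    s = sorted(values)
    last = len(s) - 1
    return [s[(k * last) // buckets] for k in range(buckets + 1)]

def frac_below(bounds, x):
    """Estimated fraction of rows with value < x, interpolating linearly within a bucket."""
    if x <= bounds[0]:
        return 0.0
    if x >= bounds[-1]:
        return 1.0
    i = bisect_right(bounds, x) - 1          # bounds[i] <= x < bounds[i + 1]
    lo, hi = bounds[i], bounds[i + 1]
    return (i + (x - lo) / (hi - lo)) / (len(bounds) - 1)

def range_selectivity(bounds, low, high):
    """Estimated fraction of rows with low < value < high."""
    return max(0.0, frac_below(bounds, high) - frac_below(bounds, low))

def true_fraction(values, low, high):
    """Actual fraction of rows with low < value < high."""
    return sum(low < v < high for v in values) / len(values)

# Hypothetical data: granularity 0.01 with buckets about 1.0 wide (fine),
# versus granularity 1 with buckets about 2 wide (coarse).
fine = [i / 100 for i in range(10000)]
coarse = [i % 10 for i in range(10000)]

fine_bounds = build_histogram(fine, 100)
coarse_bounds = build_histogram(coarse, 5)

for name, data, bounds, low, high in [
    ("fine", fine, fine_bounds, 10.0, 20.0),
    ("coarse, range around 5", coarse, coarse_bounds, 4.9, 5.1),
    ("coarse, empty range", coarse, coarse_bounds, 5.1, 5.9),
]:
    est = range_selectivity(bounds, low, high)
    act = true_fraction(data, low, high)
    print(f"{name}: estimated {est:.4f}, actual {act:.4f}")
```

On the fine-grained data the estimate tracks the actual fraction
closely.  On the coarse data a narrow range around the value 5 is
underestimated several times over (about 0.02 against an actual 0.1),
while a range that contains no rows at all still gets a nonzero
estimate, because the interpolation spreads each bucket's rows evenly
across values that do not exist.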

> BTW I think the default estimate used to be 1000, so it was changed in
> one of the 8.x releases? Can you point me to the docs? I've even tried
> to find that in the sources, but unsuccessfully.

It's DEFAULT_RANGE_INEQ_SEL, and AFAIR it hasn't changed in quite a while.
But I wouldn't be surprised if the behavior of this example changed when
we boosted the default statistics target.
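
For reference, DEFAULT_RANGE_INEQ_SEL is defined as 0.005 in
src/include/utils/selfuncs.h, and the default statistics target went
from 10 to 100 in 8.4.  A fallback row estimate is just that
selectivity multiplied by the table's row count, as the small sketch
below shows.  The helper name and the row counts are hypothetical, and
the clamping only approximates what the planner does.

```python
DEFAULT_RANGE_INEQ_SEL = 0.005  # value from PostgreSQL's selfuncs.h

def estimated_rows(reltuples, selectivity=DEFAULT_RANGE_INEQ_SEL):
    """Rows expected from a table of `reltuples` rows at the given selectivity.

    The result is rounded and never drops below one row, which roughly
    mirrors how the planner clamps its row estimates.
    """
    return max(1, round(reltuples * selectivity))

# Hypothetical table sizes, chosen only to show the arithmetic.
for reltuples in (100, 200_000, 1_000_000):
    print(reltuples, "->", estimated_rows(reltuples))
```

So a constant estimate such as 5000 is what the default selectivity
would yield on a table of a million rows; the same constant on a
different table size gives a proportionally different number.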

            regards, tom lane
