Re: [HACKERS] PATCH: multivariate histograms and MCV lists

From	Dean Rasheed
Subject	Re: [HACKERS] PATCH: multivariate histograms and MCV lists
Date	16 July 2018, 15:54:04
Msg-id	CAEZATCWORz=bXEPVJNxK49Ws9zeBHqsEdd3JsrrZWMvQ8oXuSA@mail.gmail.com
In reply to	Re: [HACKERS] PATCH: multivariate histograms and MCV lists (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses	Re: [HACKERS] PATCH: multivariate histograms and MCV lists
List	pgsql-hackers

On 16 July 2018 at 13:23, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>> The top-level clauses allow us to make such deductions, with deeper
>>> clauses it's much more difficult (perhaps impossible). Because for
>>> example with (a=1 AND b=1) there can be just a single match, so if we
>>> find it in MCV we're done. With clauses like ((a=1 OR a=2) AND (b=1 OR
>>> b=2)) it's not that simple, because there may be multiple combinations
>>> and so a match in MCV does not guarantee anything.
>>
>> Actually, it guarantees a lower bound on the overall selectivity, and
>> maybe that's the best that we can do in the absence of any other
>> stats.
>>
> Hmmm, is that actually true? Let's consider a simple example, with two
> columns, each with just 2 values, and a "perfect" MCV list:
>
>     a | b | frequency
>    -------------------
>     1 | 1 | 0.5
>     2 | 2 | 0.5
>
> And let's estimate sel(a=1 & b=2).

OK. In this case, there are no MCV matches, so there is no lower bound (it's 0).

What we could do though is also impose an upper bound, based on the
sum of non-matching MCV frequencies. In this case, the upper bound is
also 0, so we could actually say the resulting selectivity is 0.
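
To make the two bounds concrete, here is a minimal sketch in Python rather
than planner code. It assumes the upper bound is taken as 1 minus the sum of
the non-matching MCV frequencies, which is one reading of the idea above. The
function name and the tuple representation of the MCV list are hypothetical
and purely illustrative.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical representation: each MCV item is ((a, b, ...), frequency).
McvItem = Tuple[Tuple[int, ...], float]


def mcv_selectivity_bounds(
    mcv: Sequence[McvItem],
    matches: Callable[[Tuple[int, ...]], bool],
) -> Tuple[float, float]:
    """Return (lower, upper) selectivity bounds implied by an MCV list."""
    # Rows covered by matching MCV items certainly satisfy the clauses.
    matched = sum((freq for values, freq in mcv if matches(values)), 0.0)
    # Rows covered by non-matching MCV items certainly do not.
    unmatched = sum((freq for values, freq in mcv if not matches(values)), 0.0)

    lower = matched
    # Assumption: everything not ruled out by the MCV list might still match.
    upper = 1.0 - unmatched
    # Clamp so that lower <= upper <= 1 even with rounding in the frequencies.
    upper = min(1.0, max(lower, upper))
    return lower, upper


# The "perfect" MCV list from the example above.
perfect_mcv = [((1, 1), 0.5), ((2, 2), 0.5)]

# sel(a=1 AND b=2): no item matches and the list covers the whole table.
print(mcv_selectivity_bounds(perfect_mcv, lambda v: v[0] == 1 and v[1] == 2))
# prints (0.0, 0.0)
```

When the MCV list covers only part of the table, the two bounds no longer
coincide, and the estimate has to fall somewhere in between.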

Regards,
Dean
