Re: [HACKERS] PATCH: multivariate histograms and MCV lists
From | Dean Rasheed |
---|---|
Subject | Re: [HACKERS] PATCH: multivariate histograms and MCV lists |
Date | |
Msg-id | CAEZATCWORz=bXEPVJNxK49Ws9zeBHqsEdd3JsrrZWMvQ8oXuSA@mail.gmail.com |
In response to | Re: [HACKERS] PATCH: multivariate histograms and MCV lists (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses | Re: [HACKERS] PATCH: multivariate histograms and MCV lists |
List | pgsql-hackers |
On 16 July 2018 at 13:23, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>>> The top-level clauses allow us to make such deductions, with deeper
>>> clauses it's much more difficult (perhaps impossible). Because for
>>> example with (a=1 AND b=1) there can be just a single match, so if we
>>> find it in MCV we're done. With clauses like ((a=1 OR a=2) AND (b=1 OR
>>> b=2)) it's not that simple, because there may be multiple combinations
>>> and so a match in MCV does not guarantee anything.
>>
>> Actually, it guarantees a lower bound on the overall selectivity, and
>> maybe that's the best that we can do in the absence of any other
>> stats.
>>
> Hmmm, is that actually true? Let's consider a simple example, with two
> columns, each with just 2 values, and a "perfect" MCV list:
>
>  a | b | frequency
> -------------------
>  1 | 1 | 0.5
>  2 | 2 | 0.5
>
> And let's estimate sel(a=1 & b=2).

OK. In this case, there are no MCV matches, so there is no lower bound
(it's 0).

What we could do though is also impose an upper bound, based on the
sum of non-matching MCV frequencies. In this case, the upper bound is
also 0, so we could actually say the resulting selectivity is 0.

Regards,
Dean
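
The two bounds discussed in this message can be illustrated with a small Python sketch. It is not taken from the patch; the function name `mcv_selectivity_bounds`, the list-of-tuples MCV representation, and the sample data are all assumptions made for illustration. It reads "based on the sum of non-matching MCV frequencies" as: upper bound = 1 minus that sum, and lower bound = the sum of matching MCV frequencies.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: each MCV item is (column values, frequency).
McvItem = Tuple[Dict[str, int], float]


def mcv_selectivity_bounds(
    mcv: List[McvItem],
    clause: Callable[[Dict[str, int]], bool],
) -> Tuple[float, float]:
    """Return (lower, upper) selectivity bounds derived from the MCV list alone."""
    matching = sum((freq for values, freq in mcv if clause(values)), 0.0)
    non_matching = sum((freq for values, freq in mcv if not clause(values)), 0.0)
    # Rows known to match give a floor on the selectivity.
    lower = matching
    # Rows known NOT to match cannot contribute, so they cap the selectivity.
    # max() only guards against tiny negative values from float rounding.
    upper = max(0.0, 1.0 - non_matching)
    return lower, upper


# The "perfect" MCV list from the quoted example.
perfect_mcv: List[McvItem] = [
    ({"a": 1, "b": 1}, 0.5),
    ({"a": 2, "b": 2}, 0.5),
]

# sel(a=1 AND b=2): no item matches, every item is a non-match.
print(mcv_selectivity_bounds(perfect_mcv, lambda r: r["a"] == 1 and r["b"] == 2))

# sel((a=1 OR a=2) AND (b=1 OR b=2)): both items match.
print(mcv_selectivity_bounds(
    perfect_mcv,
    lambda r: r["a"] in (1, 2) and r["b"] in (1, 2),
))
```

With the perfect MCV list, both bounds coincide, so the estimate is fully determined by the list. When the MCV list covers only part of the table, the two bounds leave a gap that would have to be filled from other statistics.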