Re: planner chooses incremental but not the best one
От | Andrei Lepikhov |
---|---|
Тема | Re: planner chooses incremental but not the best one |
Дата | |
Msg-id | 96501eed-b7a0-4d15-841c-662d81ea5f95@postgrespro.ru обсуждение исходный текст |
Ответ на | Re: planner chooses incremental but not the best one (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Ответы |
Re: planner chooses incremental but not the best one
|
Список | pgsql-hackers |
On 15/2/2024 18:10, Tomas Vondra wrote: > > > On 2/15/24 07:50, Andrei Lepikhov wrote: >> On 18/12/2023 19:53, Tomas Vondra wrote: >>> On 12/18/23 11:40, Richard Guo wrote: >>> The challenge is where to get usable information about correlation >>> between columns. I only have a couple very rought ideas of what might >>> try. For example, if we have multi-column ndistinct statistics, we might >>> look at ndistinct(b,c) and ndistinct(b,c,d) and deduce something from >>> >>> ndistinct(b,c,d) / ndistinct(b,c) >>> >>> If we know how many distinct values we have for the predicate column, we >>> could then estimate the number of groups. I mean, we know that for the >>> restriction "WHERE b = 3" we only have 1 distinct value, so we could >>> estimate the number of groups as >>> >>> 1 * ndistinct(b,c) >> Did you mean here ndistinct(c,d) and the formula: >> ndistinct(b,c,d) / ndistinct(c,d) ? > > Yes, I think that's probably a more correct ... Essentially, the idea is > to estimate the change in number of distinct groups after adding a > column (or restricting it in some way). Thanks, I got it. I just think how to implement such techniques with extensions just to test the idea in action. In the case of GROUP-BY we can use path hook, of course. But what if to invent a hook on clauselist estimation? >> Do you implicitly bear in mind here the necessity of tracking clauses >> that were applied to the data up to the moment of grouping? >> > > I don't recall what exactly I considered two months ago when writing the > message, but I don't see why we would need to track that beyond what we > already have. Shouldn't it be enough for the grouping to simply inspect > the conditions on the lower levels? Yes, exactly. I've thought about looking into baserestrictinfos and, if group-by references a subquery targetlist, into subqueries too. -- regards, Andrei Lepikhov Postgres Professional
В списке pgsql-hackers по дате отправления: