Re: planner chooses incremental but not the best one
От | Tomas Vondra |
---|---|
Тема | Re: planner chooses incremental but not the best one |
Дата | |
Msg-id | 66431c21-5172-4e17-9422-8ec8b97a4efd@enterprisedb.com обсуждение исходный текст |
Ответ на | Re: planner chooses incremental but not the best one (Andrei Lepikhov <a.lepikhov@postgrespro.ru>) |
Ответы |
Re: planner chooses incremental but not the best one
|
Список | pgsql-hackers |
On 2/15/24 07:50, Andrei Lepikhov wrote: > On 18/12/2023 19:53, Tomas Vondra wrote: >> On 12/18/23 11:40, Richard Guo wrote: >> The challenge is where to get usable information about correlation >> between columns. I only have a couple very rought ideas of what might >> try. For example, if we have multi-column ndistinct statistics, we might >> look at ndistinct(b,c) and ndistinct(b,c,d) and deduce something from >> >> ndistinct(b,c,d) / ndistinct(b,c) >> >> If we know how many distinct values we have for the predicate column, we >> could then estimate the number of groups. I mean, we know that for the >> restriction "WHERE b = 3" we only have 1 distinct value, so we could >> estimate the number of groups as >> >> 1 * ndistinct(b,c) > Did you mean here ndistinct(c,d) and the formula: > ndistinct(b,c,d) / ndistinct(c,d) ? Yes, I think that's probably a more correct ... Essentially, the idea is to estimate the change in number of distinct groups after adding a column (or restricting it in some way). > > Do you implicitly bear in mind here the necessity of tracking clauses > that were applied to the data up to the moment of grouping? > I don't recall what exactly I considered two months ago when writing the message, but I don't see why we would need to track that beyond what we already have. Shouldn't it be enough for the grouping to simply inspect the conditions on the lower levels? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
В списке pgsql-hackers по дате отправления: