Re: planner chooses incremental but not the best one
| От | Tomas Vondra |
|---|---|
| Тема | Re: planner chooses incremental but not the best one |
| Дата | |
| Msg-id | 575d0bca-d235-4bb1-8901-3866e5f3c250@enterprisedb.com обсуждение исходный текст |
| Ответ на | Re: planner chooses incremental but not the best one (Andrei Lepikhov <a.lepikhov@postgrespro.ru>) |
| Список | pgsql-hackers |
On 2/15/24 13:45, Andrei Lepikhov wrote: > On 15/2/2024 18:10, Tomas Vondra wrote: >> >> >> On 2/15/24 07:50, Andrei Lepikhov wrote: >>> On 18/12/2023 19:53, Tomas Vondra wrote: >>>> On 12/18/23 11:40, Richard Guo wrote: >>>> The challenge is where to get usable information about correlation >>>> between columns. I only have a couple very rought ideas of what might >>>> try. For example, if we have multi-column ndistinct statistics, we >>>> might >>>> look at ndistinct(b,c) and ndistinct(b,c,d) and deduce something from >>>> >>>> ndistinct(b,c,d) / ndistinct(b,c) >>>> >>>> If we know how many distinct values we have for the predicate >>>> column, we >>>> could then estimate the number of groups. I mean, we know that for the >>>> restriction "WHERE b = 3" we only have 1 distinct value, so we could >>>> estimate the number of groups as >>>> >>>> 1 * ndistinct(b,c) >>> Did you mean here ndistinct(c,d) and the formula: >>> ndistinct(b,c,d) / ndistinct(c,d) ? >> >> Yes, I think that's probably a more correct ... Essentially, the idea is >> to estimate the change in number of distinct groups after adding a >> column (or restricting it in some way). > Thanks, I got it. I just think how to implement such techniques with > extensions just to test the idea in action. In the case of GROUP-BY we > can use path hook, of course. But what if to invent a hook on clauselist > estimation? Maybe. I have thought about introducing such hook to alter estimation of clauses, so I'm not opposed to it. Ofc, it depends on where would the hook be, what would it be allowed to do etc. And as it doesn't exist yet, it'd be more a "local" improvement to separate the changes into an extension. >>> Do you implicitly bear in mind here the necessity of tracking clauses >>> that were applied to the data up to the moment of grouping? >>> >> >> I don't recall what exactly I considered two months ago when writing the >> message, but I don't see why we would need to track that beyond what we >> already have. Shouldn't it be enough for the grouping to simply inspect >> the conditions on the lower levels? > Yes, exactly. I've thought about looking into baserestrictinfos and, if > group-by references a subquery targetlist, into subqueries too. > True. Something like that. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
В списке pgsql-hackers по дате отправления: