Re: planner chooses incremental but not the best one

Поиск
Список
Период
Сортировка
От Tomas Vondra
Тема Re: planner chooses incremental but not the best one
Дата
Msg-id 575d0bca-d235-4bb1-8901-3866e5f3c250@enterprisedb.com
обсуждение исходный текст
Ответ на Re: planner chooses incremental but not the best one  (Andrei Lepikhov <a.lepikhov@postgrespro.ru>)
Список pgsql-hackers

On 2/15/24 13:45, Andrei Lepikhov wrote:
> On 15/2/2024 18:10, Tomas Vondra wrote:
>>
>>
>> On 2/15/24 07:50, Andrei Lepikhov wrote:
>>> On 18/12/2023 19:53, Tomas Vondra wrote:
>>>> On 12/18/23 11:40, Richard Guo wrote:
>>>> The challenge is where to get usable information about correlation
>>>> between columns. I only have a couple very rought ideas of what might
>>>> try. For example, if we have multi-column ndistinct statistics, we
>>>> might
>>>> look at ndistinct(b,c) and ndistinct(b,c,d) and deduce something from
>>>>
>>>>       ndistinct(b,c,d) / ndistinct(b,c)
>>>>
>>>> If we know how many distinct values we have for the predicate
>>>> column, we
>>>> could then estimate the number of groups. I mean, we know that for the
>>>> restriction "WHERE b = 3" we only have 1 distinct value, so we could
>>>> estimate the number of groups as
>>>>
>>>>       1 * ndistinct(b,c)
>>> Did you mean here ndistinct(c,d) and the formula:
>>> ndistinct(b,c,d) / ndistinct(c,d) ?
>>
>> Yes, I think that's probably a more correct ... Essentially, the idea is
>> to estimate the change in number of distinct groups after adding a
>> column (or restricting it in some way).
> Thanks, I got it. I just think how to implement such techniques with
> extensions just to test the idea in action. In the case of GROUP-BY we
> can use path hook, of course. But what if to invent a hook on clauselist
> estimation?

Maybe.

I have thought about introducing such hook to alter estimation of
clauses, so I'm not opposed to it. Ofc, it depends on where would the
hook be, what would it be allowed to do etc. And as it doesn't exist
yet, it'd be more a "local" improvement to separate the changes into an
extension.

>>> Do you implicitly bear in mind here the necessity of tracking clauses
>>> that were applied to the data up to the moment of grouping?
>>>
>>
>> I don't recall what exactly I considered two months ago when writing the
>> message, but I don't see why we would need to track that beyond what we
>> already have. Shouldn't it be enough for the grouping to simply inspect
>> the conditions on the lower levels?
> Yes, exactly. I've thought about looking into baserestrictinfos and, if
> group-by references a subquery targetlist, into subqueries too.
> 

True. Something like that.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Geoghegan
Дата:
Сообщение: Re: index prefetching
Следующее
От: Bertrand Drouvot
Дата:
Сообщение: Re: Synchronizing slots from primary to standby