Re: POC: GROUP BY optimization
From | Teodor Sigaev
Subject | Re: POC: GROUP BY optimization
Date |
Msg-id | 736c5ece-7ac4-241b-4a11-52797691238f@sigaev.ru
In reply to | Re: POC: GROUP BY optimization (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses | Re: POC: GROUP BY optimization
List | pgsql-hackers
> So the costing was fairly trivial, we simply do something like
>
>     comparison_cost = 2.0 * cpu_operator_cost;
>
>     sort_cost = comparison_cost * tuples * LOG2(tuples);
>
> which essentially ignores that there might be multiple columns, or that
> the columns may have sort operators with different costs.

Agree. And the distribution of keys.

> The question is how reliable the heuristics can be. The current patch
> uses just plain ndistinct, but that seems rather unreliable, and I don't
> have a clear idea how to improve that - we may have MCVs for the columns
> and perhaps some extended statistics, but I'm not sure how far we can
> run with that.

v8 already uses another algorithm.

> Essentially what we need is to estimate the number of comparisons for
> each column, to compute a better comparison_cost.

Exactly.

>> Prioritization of the user-provided order can be as simple as giving
>> that comparison_cost a small handicap.
>
> I see no point in doing that, and I don't recall a single place in the
> planner where we do that. If the user specified ORDER BY, we'll slap an
> explicit Sort on top when needed (which acts as the handicap, but in a
> clear way). Otherwise we don't do such things - it'd be just plain
> confusing (consider "ORDER BY a,b" vs. "ORDER BY b,c" with the same data
> types, ndistinct etc. but unexpectedly different costs). Also, what
> would be a good value for the handicap?

Again agree. If we have a fixed order of columns (ORDER BY), then we
should not try to reorder it. The current patch follows that, if I didn't
make a mistake.

-- 
Teodor Sigaev                      E-mail: teodor@sigaev.ru
                                   WWW: http://www.sigaev.ru/