Re: cost_sort vs cost_agg

From	Andy Fan
Subject	Re: cost_sort vs cost_agg
Date	February 8, 2021, 11:04:46
Msg-id	CAKU4AWoiHnE8B7BTJ9JCarJv7_b+bgr5Le=yYnuk48HtJ6swSg@mail.gmail.com
In reply to	Re: cost_sort vs cost_agg (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
List	pgsql-hackers

Thank you Ashutosh.

On Fri, Jan 15, 2021 at 7:18 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:

On Thu, Jan 14, 2021 at 7:12 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>
> Currently cost_sort doesn't consider the number of columns to sort, which
> means the cost of SELECT * FROM t ORDER BY a; equals that of SELECT *
> FROM t ORDER BY a, b; which is obviously wrong. This matters when we
> choose the plan for SELECT DISTINCT * FROM t ORDER BY c between:
>
> Sort
>   Sort Key: c
>   -> HashAggregate
>        Group Key: c, a, b, d, e, f, g, h, i, j, k, l, m, n
>
> and
>
> Unique
>   -> Sort
>        Sort Key: c, a, b, d, e, f, g, h, i, j, k, l, m, n
>
>
> Since "Sort (c)" has the same cost as "Sort (c, a, b, d, e, f, g, h, i, j, k,
> l, m, n)", and a Unique node on sorted input is usually cheaper than
> HashAggregate, the latter plan will usually win, which might be bad in many
> cases.

I can imagine that HashAggregate + Sort will perform better if there
are very few distinct rows but otherwise, Unique on top of Sort would
be a better strategy since it doesn't need two operations.

Thanks for the hint. I will consider the number of distinct rows as a factor in
the next patch.
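
To make the trade-off concrete, here is a toy cost model in Python. It is only a sketch under stated assumptions, not PostgreSQL's costing code: the function names and the per_key_factor knob are hypothetical, the formulas are deliberately simplified, and only the two constants mirror PostgreSQL's default cpu_operator_cost and cpu_tuple_cost. With per_key_factor = 0 the sort cost ignores the number of keys, which is the behaviour discussed above.

```python
import math

CPU_OPERATOR_COST = 0.0025  # PostgreSQL's default cpu_operator_cost
CPU_TUPLE_COST = 0.01       # PostgreSQL's default cpu_tuple_cost


def sort_cost(rows, num_keys=1, per_key_factor=0.0):
    """Toy in-memory sort cost: about rows * log2(rows) comparisons.

    per_key_factor = 0.0 makes the cost blind to the number of sort keys;
    a positive value is a hypothetical adjustment that charges extra for
    every additional key.
    """
    if rows < 2:
        return 0.0
    comparison_cost = 2.0 * CPU_OPERATOR_COST * (1.0 + per_key_factor * (num_keys - 1))
    return comparison_cost * rows * math.log2(rows)


def unique_cost(rows, num_keys):
    """Toy Unique cost: one comparison per key per input row."""
    return CPU_OPERATOR_COST * num_keys * rows


def hashagg_cost(rows, num_keys, groups):
    """Toy HashAggregate cost: hash every key of every row, emit each group."""
    return CPU_OPERATOR_COST * num_keys * rows + CPU_TUPLE_COST * groups


def plan_costs(rows, groups, num_keys, per_key_factor=0.0):
    """Return (HashAggregate + Sort, Sort + Unique) costs under the toy model."""
    hash_then_sort = hashagg_cost(rows, num_keys, groups) + sort_cost(groups, 1, per_key_factor)
    sort_then_unique = sort_cost(rows, num_keys, per_key_factor) + unique_cost(rows, num_keys)
    return hash_then_sort, sort_then_unique
```

Under these assumptions, when every row is distinct the key-blind model always prefers Sort + Unique, while with only a handful of distinct rows HashAggregate + Sort becomes cheaper because the sort runs over far fewer tuples.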

>
> The optimizer chose HashAggregate with my patch, but it takes 6s. After setting
> enable_hashagg = off, it takes 2s.

This example actually shows that using Unique is better than
HashAggregate + Sort. Maybe you want to try with some data which has
very few distinct rows.
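
As a rough way to experiment with the "few distinct rows" case outside the database, the two strategies can be mimicked on in-memory tuples. This is only an analogy, not how the executor works: make_rows, hash_then_sort and sort_then_unique are hypothetical helpers, a Python set stands in for HashAggregate, and adjacent-duplicate removal stands in for Unique.

```python
import random
from itertools import groupby


def make_rows(n_rows, n_distinct, n_cols=3, seed=0):
    """Build n_rows tuples drawn from a pool of at most n_distinct distinct rows."""
    rng = random.Random(seed)
    pool = [tuple(rng.randrange(1000) for _ in range(n_cols)) for _ in range(n_distinct)]
    return [rng.choice(pool) for _ in range(n_rows)]


def hash_then_sort(rows, sort_col=0):
    """De-duplicate with a hash set first, then sort the survivors on one column."""
    distinct = set(rows)
    return sorted(distinct, key=lambda r: r[sort_col])


def sort_then_unique(rows, sort_col=0):
    """Sort on all columns (sort_col first), then drop adjacent duplicates."""
    n_cols = len(rows[0]) if rows else 0
    order = [sort_col] + [i for i in range(n_cols) if i != sort_col]
    ordered = sorted(rows, key=lambda r: tuple(r[i] for i in order))
    return [row for row, _ in groupby(ordered)]
```

Both functions return the same set of rows ordered by the chosen column. Timing them (for example with timeit) while varying n_distinct shows how the amount of data reaching the sort changes between the two approaches.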

Best Regards

Andy Fan (https://www.aliyun.com/)
