Re: cost_sort vs cost_agg
From | Ashutosh Bapat |
---|---|
Subject | Re: cost_sort vs cost_agg |
Date | |
Msg-id | CAExHW5uCesfayyXeyncH1yKL8kK8rBuPsd0YoU04B2Usaw0xaQ@mail.gmail.com |
In response to | cost_sort vs cost_agg (Andy Fan <zhihui.fan1213@gmail.com>) |
Responses |
Re: cost_sort vs cost_agg
|
List | pgsql-hackers |
On Thu, Jan 14, 2021 at 7:12 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>
> Currently the cost_sort doesn't consider the number of columns to sort, which
> means the cost of SELECT * FROM t ORDER BY a; equals that of SELECT *
> FROM t ORDER BY a, b; which is obviously wrong. The impact of this is when we
> choose the plan for SELECT DISTINCT * FROM t ORDER BY c between:
>
> Sort
>   Sort Key: c
>   -> HashAggregate
>        Group Key: c, a, b, d, e, f, g, h, i, j, k, l, m, n
>
> and
>
> Unique
>   -> Sort
>        Sort Key: c, a, b, d, e, f, g, h, i, j, k, l, m, n
>
>
> Since "Sort (c)" has the same cost as "Sort (c, a, b, d, e, f, g, h, i, j, k,
> l, m, n)", and a Unique node on a sorted input is usually cheaper than
> HashAggregate, the latter plan will usually win, which might be bad in many
> places.

I can imagine that HashAggregate + Sort will perform better if there are
very few distinct rows; otherwise, Unique on top of Sort would be a
better strategy since it doesn't need two operations.

>
> The optimizer chose HashAggregate with my patch, but it takes 6s. After
> setting enable_hashagg = off, it takes 2s.

This example actually shows that using Unique is better than
HashAggregate + Sort. Maybe you want to try with some data that has
very few distinct rows.

--
Best Wishes,
Ashutosh Bapat
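
To make the trade-off between the two plan shapes concrete, here is a toy Python sketch. It is not PostgreSQL code and does not reproduce the planner's cost model. It only counts per-column comparisons done while sorting, and ignores hashing cost, memory, and I/O. The function names (`make_rows`, `sort_then_unique`, `hashagg_then_sort`) and the data sizes are assumptions made up for illustration.

```python
import functools
import random


def make_rows(nrows, ncols, ndistinct_per_col, seed=0):
    """Build nrows tuples whose columns each take few distinct values (illustrative data)."""
    rng = random.Random(seed)
    return [tuple(rng.randrange(ndistinct_per_col) for _ in range(ncols))
            for _ in range(nrows)]


def counting_sort_by(rows, key_cols):
    """Sort rows on key_cols; return (sorted rows, number of column comparisons made)."""
    count = 0

    def compare(r1, r2):
        nonlocal count
        for c in key_cols:
            count += 1  # one column comparison
            if r1[c] < r2[c]:
                return -1
            if r1[c] > r2[c]:
                return 1
        return 0

    return sorted(rows, key=functools.cmp_to_key(compare)), count


def sort_then_unique(rows):
    """Shape 'Unique -> Sort (all columns)': sort every input row, then drop adjacent duplicates."""
    ncols = len(rows[0])
    ordered, ncmp = counting_sort_by(rows, tuple(range(ncols)))
    out = []
    for r in ordered:
        if not out or out[-1] != r:
            out.append(r)
    return out, ncmp


def hashagg_then_sort(rows):
    """Shape 'Sort (first column) -> HashAggregate': deduplicate by hashing, then sort only the survivors."""
    distinct = list(set(rows))  # hashing cost is deliberately not counted here
    return counting_sort_by(distinct, (0,))


if __name__ == "__main__":
    rows = make_rows(10_000, 3, 3)  # at most 27 distinct rows
    uniq, cmp_sort_first = sort_then_unique(rows)
    agg, cmp_hash_first = hashagg_then_sort(rows)
    print("distinct rows:", len(uniq))
    print("column comparisons, sort then unique:", cmp_sort_first)
    print("column comparisons, hash then sort:  ", cmp_hash_first)
```

With very few distinct rows, the hash-first shape sorts only a handful of rows, so its comparison count is far smaller. With mostly distinct rows the sort input barely shrinks, and the extra hashing step (not modelled here) is pure overhead, which is consistent with the 6s versus 2s observation quoted above.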