Re: Aggregate ORDER BY patch

From	Andrew Gierth
Subject	Re: Aggregate ORDER BY patch
Date	15 November 2009, 19:24:19
Msg-id	871vjzl82a.fsf@news-spur.riddles.org.uk
In reply to	Re: Aggregate ORDER BY patch (Andrew Gierth <andrew@tao11.riddles.org.uk>)
Responses	Re: Aggregate ORDER BY patch
List	pgsql-hackers

>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

 Andrew> Performance.

 Andrew> tuplesort_getdatum etc. seems to be substantially faster than
 Andrew> tuplesort_gettupleslot especially for the case where you're
 Andrew> sorting a pass-by-value datum such as an integer (since the
 Andrew> datum is then stored only in the sort tuple header and
 Andrew> doesn't require a separate space allocation for
 Andrew> itself). Using a slot in all cases would have slowed down
 Andrew> some common cases like count(distinct id) by a measurable
 Andrew> amount.

 Andrew> Cases like array_agg(x order by x) benefit from the faster
 Andrew> code path too.

 Andrew> The memory management between the two cases is sufficiently
 Andrew> different that combining them into one function while still
 Andrew> maintaining the slot vs. datum distinction would be ugly and
 Andrew> probably error-prone. The relatively minor duplication of
 Andrew> logic seemed much clearer to me.

Just to quantify this, using a production-quality build (optimized and
without assertions), it turns out that the fast code path
(process_ordered_aggregate_single) is faster by 300% for pass-by-value
types, and by approximately 20% for short values of pass-by-reference
types, as compared to disabling that code path and forcing even the
one-arg case to use the slot interface.

So using the slot interface for everything would have constituted a
300% slowdown over the older code for count(distinct id), obviously
undesirable.
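
To make the pass-by-value point concrete, here is a toy C sketch. It is
not PostgreSQL code: InlineEntry, BoxedEntry, sort_inline and sort_boxed
are hypothetical names made up for this illustration. It only models the
difference between keeping the value inside the sort entry and giving
each value its own allocation, and says nothing about the real tuplesort
implementation or the figures measured above.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy model only; these are not PostgreSQL's tuplesort structures. */

/* Datum-style entry: a pass-by-value key is stored in the entry itself. */
typedef struct { long key; } InlineEntry;

/* Slot-style entry: the key lives in a separately allocated object. */
typedef struct { long *key; } BoxedEntry;

static int cmp_inline(const void *a, const void *b)
{
    long x = ((const InlineEntry *) a)->key;
    long y = ((const InlineEntry *) b)->key;
    return (x > y) - (x < y);
}

static int cmp_boxed(const void *a, const void *b)
{
    long x = *((const BoxedEntry *) a)->key;
    long y = *((const BoxedEntry *) b)->key;
    return (x > y) - (x < y);
}

/* Sorts vals ascending using one allocation for the whole entry array. */
void sort_inline(long *vals, size_t n)
{
    if (n == 0)
        return;
    InlineEntry *e = malloc(n * sizeof *e);
    assert(e != NULL);
    for (size_t i = 0; i < n; i++)
        e[i].key = vals[i];
    qsort(e, n, sizeof *e, cmp_inline);
    for (size_t i = 0; i < n; i++)
        vals[i] = e[i].key;
    free(e);
}

/* Same result, but with one extra allocation (and free) per value. */
void sort_boxed(long *vals, size_t n)
{
    if (n == 0)
        return;
    BoxedEntry *e = malloc(n * sizeof *e);
    assert(e != NULL);
    for (size_t i = 0; i < n; i++)
    {
        e[i].key = malloc(sizeof *e[i].key);
        assert(e[i].key != NULL);
        *e[i].key = vals[i];
    }
    qsort(e, n, sizeof *e, cmp_boxed);
    for (size_t i = 0; i < n; i++)
    {
        vals[i] = *e[i].key;
        free(e[i].key);
    }
    free(e);
}
```

Both functions return the same ordering; the boxed variant just pays for
n extra allocations and an extra pointer dereference in every comparison,
which is the kind of overhead the single-datum path avoids.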

As it stands, I can't detect any performance regression over the
previous code.

This means that agg(x order by y) is rather noticeably slower than
agg(x order by x), but this is pretty much unavoidable given how the
sorting code works.

Future performance enhancements (which I have no particular plans to
tackle) would involve having the planner consult the desired aggregate
orderings and estimating the cost of sorting as opposed to the cost of
producing a plan with the input already ordered. Another would be
combining the sort step for aggregates that share a single ordering.
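
As a rough picture of the shared-sort idea, the following toy C sketch
sorts the input once and feeds two order-sensitive aggregates from the
same sorted run. Everything in it (Row, SharedResult,
aggregate_shared_order, and the two example aggregates) is hypothetical
and invented for illustration; it is not how the executor or the patch
is structured.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical input row: aggregate over x, ordered by y. */
typedef struct { int x; int y; } Row;

/* Results of two toy aggregates that both depend on the y ordering. */
typedef struct
{
    int  first_x;   /* x of the row with the smallest y */
    long digits;    /* x values folded in y order: acc = acc * 10 + x */
} SharedResult;

static int cmp_by_y(const void *a, const void *b)
{
    int p = ((const Row *) a)->y;
    int q = ((const Row *) b)->y;
    return (p > q) - (p < q);
}

/* Sort rows by y once (in place), then run both aggregates in one pass. */
SharedResult aggregate_shared_order(Row *rows, size_t n)
{
    SharedResult r = {0, 0};

    assert(rows != NULL || n == 0);
    if (n == 0)
        return r;

    qsort(rows, n, sizeof *rows, cmp_by_y);

    for (size_t i = 0; i < n; i++)
    {
        if (i == 0)
            r.first_x = rows[i].x;              /* transition step, aggregate 1 */
        r.digits = r.digits * 10 + rows[i].x;   /* transition step, aggregate 2 */
    }
    return r;
}
```

For the rows (x, y) = (1, 3), (2, 1), (3, 2), the y ordering visits x as
2, 3, 1, so first_x is 2 and digits is 231, with only one sort performed
for both aggregates.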

-- 
Andrew (irc:RhodiumToad)
