Re: Aggregate ORDER BY patch
От | Andrew Gierth |
---|---|
Тема | Re: Aggregate ORDER BY patch |
Дата | |
Msg-id | 871vjzl82a.fsf@news-spur.riddles.org.uk обсуждение исходный текст |
Ответ на | Re: Aggregate ORDER BY patch (Andrew Gierth <andrew@tao11.riddles.org.uk>) |
Ответы |
Re: Aggregate ORDER BY patch
|
Список | pgsql-hackers |
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes: Andrew> Performance. Andrew> tuplesort_getdatum etc. seems to be substantially faster thanAndrew> tuplesort_gettupleslot especially for the casewhere you'reAndrew> sorting a pass-by-value datum such as an integer (since theAndrew> datum is then stored only in thesort tuple header andAndrew> doesn't require a separate space allocation forAndrew> itself). Using a slot in all caseswould have slowed downAndrew> some common cases like count(distinct id) by a measurableAndrew> amount. Andrew> Cases like array_agg(x order by x) benefit from the fasterAndrew> code path too. Andrew> The memory management between the two cases is sufficientlyAndrew> different that combining them into one functionwhile stillAndrew> maintaining the slot vs. datum distinction would be ugly andAndrew> probably error-prone. Therelatively minor duplication ofAndrew> logic seemed much clearer to me. Just to quantify this, using a production-quality build (optimized and without assertions), it turns out that the fast code path (process_ordered_aggregate_single) is faster by 300% for pass-by-value types, and by approximately 20% for short values of pass-by-reference types, as compared to disabling that code path and forcing even the one-arg case to use the slot interface. So using the slot interface for everything would have constituted a 300% slowdown over the older code for count(distinct id), obviously undesirable. As it stands, I can't detect any performance regression over the previous code. This means that agg(x order by y) is rather noticably slower than agg(x order by x), but this is pretty much unavoidable given how the sorting code works. Future performance enhancements (which I have no particular plans to tackle) would involve having the planner consult the desired aggregate orderings and estimating the cost of sorting as opposed to the cost of producing a plan with the input already ordered. Also combining the sort step for aggregates that share a single ordering. -- Andrew (irc:RhodiumToad)
В списке pgsql-hackers по дате отправления: