Optimizing aggregates
От | Heikki Linnakangas |
---|---|
Тема | Optimizing aggregates |
Дата | |
Msg-id | 5561626c-4851-a717-79b5-bed3a192574e@iki.fi обсуждение исходный текст |
Ответы |
Re: Optimizing aggregates
|
Список | pgsql-hackers |
I've been profiling simple aggregate queries, looking for any low-hanging fruit. For this query: -- setup create table floats as select g::float8 as a, g::float8 as b, g::float8 as c from generate_series(1, 10000000) g; vacuum freeze floats; -- query select sum(a), sum(b+c) from floats; perf report says: # Children Self Command Shared Object Symbol # ........ ........ .......... ................. ........................................ # 25.70% 0.00% postmaster [unknown] [k] 0000000000000000 14.23% 13.75% postmaster postgres [.] ExecProject 11.18% 10.57% postmaster postgres [.] slot_deform_tuple 9.58% 9.04% postmaster postgres [.] advance_aggregates 8.96% 0.00% postmaster [unknown] [.] 0x00000000000298d4 8.77% 8.42% postmaster postgres [.] ExecMakeFunctionResultNoSets 7.78% 0.00% postmaster [unknown] [.] 0x0000000001d38260 6.63% 6.15% postmaster postgres [.] advance_transition_function 6.61% 0.00% postmaster [unknown] [.] 0x0000000001e99e40 6.47% 0.00% postmaster libc-2.23.so [.] __GI___libc_read 6.24% 5.88% postmaster postgres [.] heap_getnext 4.62% 4.62% postmaster [kernel.kallsyms] [k] copy_user_enhanced_fast_string 3.91% 3.82% postmaster postgres [.] slot_getsomeattrs 3.29% 3.18% postmaster postgres [.] slot_getattr 3.06% 3.00% postmaster postgres [.] ExecClearTuple 2.59% 0.00% postmaster [unknown] [.] 0x0000000001e9a370 2.57% 2.45% postmaster postgres [.] ExecScan 2.56% 2.37% postmaster postgres [.] float8pl 2.54% 2.43% postmaster postgres [.] heapgetpage 2.25% 2.17% postmaster postgres [.] ExecAgg 2.10% 1.96% postmaster postgres [.] ExecStoreTuple 2.00% 1.91% postmaster postgres [.] ExecProcNode ExecProject stands out. I find that pretty surprising. We're using ExecProject to extract the arguments from the input tuples, to pass to the aggregate transition functions. It looks like that's a pretty expensive way of doing it, for a typical aggregate that takes only one argument. We actually used to call ExecEvalExpr() directly for each argument, but that was changed by the patch that added support for ordered set aggregates. It looks like that was a bad idea, from a performance point of view. I propose that we go back to calling ExecEvalExpr() directly, for non-ordered aggregates, per the attached patch. That makes that example query about 10% faster on my laptop, which is in line with the fact that ExecProject() accounted for about 13% of the CPU time. Another idea is that maybe we should add a fast-path to ExecProject(), for these trivial cases. - Heikki
Вложения
В списке pgsql-hackers по дате отправления: