[WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates
От | Andreas Karlsson |
---|---|
Тема | [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates |
Дата | |
Msg-id | 544BB5F1.50709@proxel.se обсуждение исходный текст |
Ответы |
Re: [WIP Patch] Using 128-bit integers for sum, avg and
statistics aggregates
Re: [WIP Patch] Using 128-bit integers for sum, avg and statistics aggregates Re: Using 128-bit integers for sum, avg and statistics aggregates |
Список | pgsql-hackers |
Hi, There was recently talk about if we should start using 128-bit integers (where available) to speed up the aggregate functions over integers which uses numeric for their internal state. So I hacked together a patch for this to see what the performance gain would be. Previous thread: http://www.postgresql.org/message-id/20141017182500.GF2075@alap3.anarazel.de What the patch does is switching from using numerics in the aggregate state to int128 and then convert the type from the 128-bit integer in the final function. The functions where we can make use of int128 states are: - sum(int8) - avg(int8) - var_*(int2) - var_*(int4) - stdev_*(int2) - stdev_*(int4) The initial benchmark results look very promising. When summing 10 million int8 I get a speedup of ~2.5x and similarly for var_samp() on 10 million int4 I see a speed up of ~3.7x. To me this indicates that it is worth the extra code. What do you say? Is this worth implementing? The current patch still requires work. I have not written the detection of int128 support yet, and the patch needs code cleanup (for example: I used an int16_ prefix on the added functions, suggestions for better names are welcome). I also need to decide on what estimate to use for the size of that state. The patch should work and pass make check on platforms where __int128_t is supported. The simple benchmarks: CREATE TABLE test_int8 AS SELECT x::int8 FROM generate_series(1, 10000000) x; Before: # SELECT sum(x) FROM test_int8; sum ---------------- 50000005000000 (1 row) Time: 2521.217 ms After: # SELECT sum(x) FROM test_int8; sum ---------------- 50000005000000 (1 row) Time: 1022.811 ms CREATE TABLE test_int4 AS SELECT x::int4 FROM generate_series(1, 10000000) x; Before: # SELECT var_samp(x) FROM test_int4; var_samp -------------------- 8333334166666.6667 (1 row) Time: 3808.546 ms After: # SELECT var_samp(x) FROM test_int4; var_samp -------------------- 8333334166666.6667 (1 row) Time: 1033.243 ms Andreas
Вложения
В списке pgsql-hackers по дате отправления: