Re: Reduce palloc's in numeric operations.
From | Heikki Linnakangas |
---|---|
Subject | Re: Reduce palloc's in numeric operations. |
Date | |
Msg-id | 5059B87A.2070305@vmware.com |
In reply to | Reduce palloc's in numeric operations. (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Responses | Re: Reduce palloc's in numeric operations. |
List | pgsql-hackers |
On 14.09.2012 11:25, Kyotaro HORIGUCHI wrote:
> Hello, I will propose reduce palloc's in numeric operations.
>
> The numeric operations are slow by nature, but usually it is not
> a problem for on-disk operations. Although the slowdown is
> enhanced on on-memory operations.
>
> I inspected them and found some very short term pallocs. These
> palloc's are used for temporary storage for digits of unpacked
> numerics.
>
> The formats of numeric digits in packed and unpacked forms are
> same. So we can kicked out a part of palloc's using digits in
> packed numeric in-place to make unpacked one.
>
> In this patch, I added new function set_var_from_num_nocopy() to
> do this. And make use of it for operands which won't modified.

We have to be careful to really not modify the operands. numeric_out() and numeric_out_sci() are wrong; they call get_str_from_var(), which modifies the argument. Same with numeric_expr(): it passes the argument to numericvar_to_double_no_overflow(), which passes it to get_str_from_var(). numericvar_to_int8() also modifies its argument, so all the functions that use that, directly or indirectly, must make a copy.

Perhaps get_str_from_var(), and the other functions that currently scribble on the arguments, should be modified to not do so. They could easily make a copy of the argument within the function. Then the callers could safely use set_var_from_num_nocopy(). The performance would be the same, you would have the same number of pallocs, but you would get rid of the surprising argument-modifying behavior of those functions.

> The performance gain seems quite moderate....
>
> 'SELECT SUM(numeric_column) FROM on_memory_table' for ten million
> rows and about 8 digits numeric runs for 3480 ms against original
> 3930 ms. It's 11% gain. 'SELECT SUM(int_column) FROM
> on_memory_table' needed 1570 ms.
>
> Similarly 8% gain for about 30 - 50 digits numeric. Performance of
> avg(numeric) made no gain in contrast.
>
> Do you think this worth doing?
Yes, I think this is worthwhile. I'm seeing an even bigger gain, with smaller numerics. I created a table with this:

    CREATE TABLE numtest AS SELECT a::numeric AS col FROM generate_series(1, 10000000) a;

And repeated this query with \timing:

    SELECT SUM(col) FROM numtest;

The execution time of that query fell from about 5300 ms to 4300 ms, i.e. about 20%.

- Heikki
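
For illustration only, the suggestion above (let functions that need scratch space copy their argument internally, so that callers can safely borrow digits in place) could be sketched roughly as follows. This is not code from the patch or from numeric.c: DemoVar, demo_set_nocopy(), demo_sum() and demo_sum_truncated() are hypothetical, heavily simplified stand-ins, and malloc()/free() stand in for palloc()/pfree().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much-simplified stand-in for an unpacked numeric variable. */
typedef struct DemoVar
{
    int     ndigits;    /* number of digits in use */
    short  *buf;        /* owned allocation, or NULL if digits are borrowed */
    short  *digits;     /* first digit; may point into the caller's storage */
} DemoVar;

/* Borrow the packed digits in place: no allocation and no copy. */
static void
demo_set_nocopy(DemoVar *var, const short *packed, int ndigits)
{
    var->ndigits = ndigits;
    var->buf = NULL;                    /* nothing to free later */
    var->digits = (short *) packed;     /* must be treated as read-only */
}

/* Read-only consumer: safe to call on a borrowed variable. */
static long
demo_sum(const DemoVar *var)
{
    long    sum = 0;

    for (int i = 0; i < var->ndigits; i++)
        sum += var->digits[i];
    return sum;
}

/*
 * Consumer that needs to modify digits (here: zero everything after the
 * first 'keep' digits). Instead of scribbling on its argument, it works
 * on a private copy, so borrowed digits are never touched.
 */
static long
demo_sum_truncated(const DemoVar *var, int keep)
{
    DemoVar tmp;
    size_t  nbytes = (size_t) var->ndigits * sizeof(short);
    int     start = (keep < 0) ? 0 : keep;
    long    sum;

    tmp.ndigits = var->ndigits;
    tmp.buf = malloc(nbytes > 0 ? nbytes : 1);
    assert(tmp.buf != NULL);
    if (nbytes > 0)
        memcpy(tmp.buf, var->digits, nbytes);
    tmp.digits = tmp.buf;

    for (int i = start; i < tmp.ndigits; i++)
        tmp.digits[i] = 0;              /* modifies only the private copy */

    sum = demo_sum(&tmp);
    free(tmp.buf);
    return sum;
}
```

In this sketch the allocation happens inside the one function that needs it, so the total number of allocations matches the copy-in-the-caller approach, while read-only paths such as demo_sum() pay for none.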