Re: Memory-Bounded Hash Aggregation
От | Jeff Davis |
---|---|
Тема | Re: Memory-Bounded Hash Aggregation |
Дата | |
Msg-id | 3bed775951a72dbdf8446c6750936e38dcf107b9.camel@j-davis.com обсуждение исходный текст |
Ответ на | Re: Memory-Bounded Hash Aggregation (Jeff Davis <pgsql@j-davis.com>) |
Ответы |
Re: Memory-Bounded Hash Aggregation
Re: Memory-Bounded Hash Aggregation |
Список | pgsql-hackers |
On Mon, 2020-02-10 at 15:57 -0800, Jeff Davis wrote: > Attaching latest version (combined logtape changes along with main > HashAgg patch). I ran a matrix of small performance tests to look for regressions. The goal was to find out if the refactoring or additional branches introduced by this patch caused regressions in in-memory HashAgg, Sort, or the JIT paths. Fortunately, I didn't find any. This is *not* supposed to represent the performance benefits of the patch, only to see if I regressed somewhere else. The performance benefits will be shown in the next round of tests. I tried with JIT on/off, work_mem='4MB' and also a value high enough to fit the entire working set, enable_hashagg on/off, and 4 different tables. The 4 tables are (each containing 20 million tuples): t1k_20k_int4: 1K groups of 20K tuples each (randomly generated and ordered) t20m_1_int4: 20M groups of 1 tuple each (randomly generated and ordered) t1k_20k_text: the same as t1k_20k_int4 but cast to text (collation C.UTF-8) t20m_1_text: the same as t20m_1_int4 but cast to text (collation C.UTF-8) The query is: select count(*) from (select i, count(*) from $TABLE group by i) s; I just did 3 runs in psql and took the median result. I ran against master (cac8ce4a, slightly older, before any of my patches went in) and my dev branch (attached patch applied against 0973f560). Results were pretty boring, in a good way. All results within the noise, and about as many results were better on dev than master as there were better on master than dev. I also did some JIT-specific tests against only t1k_20k_int4. For that, the hash table fits in memory anyway, so I didn't vary work_mem. The query I ran included more aggregates to better test JIT: select i, sum(i), avg(i), min(i) from t1k_20k_int4 group by i offset 1000000; -- offset so it doesn't return result I know these tests are simplistic, but I also think they represent a lot of areas where regressions could have potentially been introduced. If someone else can find a regression, please let me know. The new patch is basically just rebased -- a few other very minor changes. Regards, Jeff Davis
Вложения
В списке pgsql-hackers по дате отправления: