Re: Memory-Bounded Hash Aggregation
From | Adam Lee |
---|---|
Subject | Re: Memory-Bounded Hash Aggregation |
Date | |
Msg-id | 20191220091626.GD45905@mars.local |
In reply to | Re: Memory-Bounded Hash Aggregation (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
List | pgsql-hackers |
On Sat, Dec 14, 2019 at 06:32:25PM +0100, Tomas Vondra wrote:
> I've done a bit more testing on this, after resolving a couple of minor
> conflicts due to recent commits (rebased version attached).
>
> In particular, I've made a comparison with different dataset sizes,
> group sizes, GUC settings etc. The script and results from two different
> machines are available here:
>
> The script essentially runs a simple grouping query with different
> number of rows, groups, work_mem and parallelism settings. There's
> nothing particularly magical about it.

Nice!

> I did run it both on master and patched code, allowing us to compare
> results and assess impact of the patch. Overall, the changes are
> expected and either neutral or beneficial, i.e. the timing are the same
> or faster.
>
> The number of cases that regressed is fairly small, but sometimes the
> regressions are annoyingly large - up to 2x in some cases. Consider for
> example this trivial example with 100M rows:

I suppose this is because the patch has no costing changes yet. I hacked
in a spilling penalty for hash agg, just some value based on
(groups_in_hashtable * num_of_input_tuples)/num_groups_from_planner, and
with it the planner would not choose hash aggregate in this case.

However, that penalty is wrong, because compared to the external sort
algorithm, hash aggregate has respilling, which involves even more I/O,
especially with a very large number of groups but a very small number of
tuples in each group, like the test you did. It would be a challenge.

BTW, Jeff, Greenplum has a test for hash agg spill. I modified it a little
to check how many batches a query uses; it's attached, not sure if it
would help.

--
Adam Lee
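
For illustration only, the expression quoted in the message can be sketched as below. This is not the hack described in the message, whose code is not shown. The function names and the reading of the expression as "input tuples absorbed by groups that fit in the hash table" (assuming uniformly sized groups) are assumptions made here.

```python
def in_memory_tuple_estimate(groups_in_hashtable, num_input_tuples,
                             num_groups_from_planner):
    """Evaluate the expression quoted in the message:
    (groups_in_hashtable * num_of_input_tuples) / num_groups_from_planner.
    """
    if num_groups_from_planner <= 0:
        raise ValueError("planner group estimate must be positive")
    return groups_in_hashtable * num_input_tuples / num_groups_from_planner


def spilled_tuple_estimate(groups_in_hashtable, num_input_tuples,
                           num_groups_from_planner):
    """Hypothetical first-pass spill estimate.

    Assumption (not from the message): groups are uniformly sized, so the
    expression above approximates the tuples absorbed by in-memory groups,
    and the remainder is written to disk once. Respilling is ignored.
    """
    in_memory = min(
        num_input_tuples,
        in_memory_tuple_estimate(groups_in_hashtable, num_input_tuples,
                                 num_groups_from_planner),
    )
    return num_input_tuples - in_memory


if __name__ == "__main__":
    # 100M input rows, 10,000 estimated groups, room for 1,000 groups.
    print(spilled_tuple_estimate(1_000, 100_000_000, 10_000))
```

Such a one-pass estimate has the weakness the message points out: it does not account for respilling, so it understates I/O when there are very many groups with very few tuples each.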
Attachments