Re: Memory-Bounded Hash Aggregation
From:        Jeff Davis
Subject:     Re: Memory-Bounded Hash Aggregation
Date:
Msg-id:      2370ca86d4c3b46befc031d8421de02a58df20f1.camel@j-davis.com
In reply to: Re: Memory-Bounded Hash Aggregation (Adam Lee <ali@pivotal.io>)
Responses:   Re: Memory-Bounded Hash Aggregation
List:        pgsql-hackers
On Tue, 2019-12-10 at 13:34 -0800, Adam Lee wrote:
> Melanie and I tried this and have a patch that passes installcheck.
> The way we verify it is by composing a wide table with long,
> unnecessary text columns, then checking the size it writes on every
> iteration.
>
> Please check out the attachment; it's based on your 1204 version.

Thank you. Attached is a new patch that incorporates your projection
work. A few comments:

* You are only nulling out attributes up to tts_nvalid, which means you
can still end up storing more on disk if a wide column comes at the end
of the table and hasn't been deserialized yet. I fixed this by copying
the needed attributes into the hash_spill_slot and making it virtual.

* aggregated_columns does not need to be a member of AggState, nor does
it need to be computed inside the perhash loop. Aside: if adding a
field to AggState is necessary, you need to bump the field numbers of
the later fields that are labeled for JIT use; otherwise it will break
JIT.

* I used an array rather than a Bitmapset. It makes it easier to find
the highest needed column (to do a slot_getsomeattrs), and it might be
a little more efficient for wide tables with mostly useless columns.

* Style nitpick: don't mix code and declarations.

The updated patch also saves the transitionSpace calculation in the Agg
node for better hash table size estimation. This is a good way to
choose an initial number of buckets for the hash table, and also to cap
the number of groups we permit in the hash table when we expect the
groups to grow.

Regards,
	Jeff Davis
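P.S. For anyone following along, here is a rough sketch of the
projection idea, not the code in the attached patch: copy only the
needed attributes into a virtual spill slot and null out the rest, so
that wide unreferenced columns never reach disk. The AggState fields
max_needed_col and col_needed are made-up names for this example, and
hash_spill_slot is assumed to be a TTSOpsVirtual slot as described
above.

    #include "postgres.h"

    #include "executor/tuptable.h"
    #include "nodes/execnodes.h"

    /*
     * Sketch only: project the input tuple into a narrow virtual slot
     * before spilling, nulling out every column the aggregate does not
     * need (including columns past tts_nvalid that were never
     * deserialized).
     */
    static TupleTableSlot *
    prepare_hash_spill_slot(AggState *aggstate, TupleTableSlot *inputslot)
    {
        TupleTableSlot *spillslot = aggstate->hash_spill_slot;
        int         natts = spillslot->tts_tupleDescriptor->natts;
        int         maxcol = aggstate->max_needed_col;  /* highest needed attnum */
        int         i;

        /* deserialize only as far as the highest column we actually need */
        slot_getsomeattrs(inputslot, maxcol);

        ExecClearTuple(spillslot);
        for (i = 0; i < natts; i++)
        {
            if (i < maxcol && aggstate->col_needed[i])
            {
                spillslot->tts_values[i] = inputslot->tts_values[i];
                spillslot->tts_isnull[i] = inputslot->tts_isnull[i];
            }
            else
            {
                /* unneeded column: store a NULL instead of the value */
                spillslot->tts_values[i] = (Datum) 0;
                spillslot->tts_isnull[i] = true;
            }
        }
        ExecStoreVirtualTuple(spillslot);

        return spillslot;
    }

Because the slot is virtual, forming the tuple for the spill file only
serializes tts_values/tts_isnull, and the nulled-out columns cost no
more than a bit in the null bitmap.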
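P.P.S. And a sketch of the transitionSpace-based sizing, illustrative
arithmetic only rather than the patch's exact formula: estimate the
memory one group will occupy, then divide the memory limit by it to
pick an initial bucket count and to cap the number of groups before we
start spilling. The names entry_overhead and mem_limit are made up for
this example.

    #include "postgres.h"

    /*
     * Sketch only: per-group memory is roughly the stored grouping key
     * plus the planner-estimated transition state (transitionSpace)
     * plus hash entry and allocator overhead.
     */
    static long
    hashagg_max_groups(Size groupKeyWidth, Size transitionSpace,
                       Size entry_overhead, Size mem_limit)
    {
        Size        per_group;

        per_group = MAXALIGN(groupKeyWidth) +   /* stored grouping key tuple */
                    transitionSpace +           /* per-group transition state(s) */
                    entry_overhead;             /* hash entry + chunk overhead */

        return (long) (mem_limit / per_group);
    }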
Attachments