Re: Spilling hashed SetOps and aggregates to disk
From | David Rowley
---|---
Subject | Re: Spilling hashed SetOps and aggregates to disk
Date |
Msg-id | CAKJS1f9VHga59dyU3tARyhYt-XYA899TzrzfqADGAoiKviSBUA@mail.gmail.com
In reply to | Re: Spilling hashed SetOps and aggregates to disk (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses | Re: Spilling hashed SetOps and aggregates to disk; Re: Spilling hashed SetOps and aggregates to disk
List | pgsql-hackers
On 7 June 2018 at 08:11, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> On 06/06/2018 04:11 PM, Andres Freund wrote:
>> Consider e.g. a scheme where we'd switch from hashed aggregation to
>> sorted aggregation due to memory limits, but already have a number of
>> transition values in the hash table. Whenever the size of the transition
>> values in the hashtable exceeds memory size, we write one of them to the
>> tuplesort (with serialized transition value). From then on further input
>> rows for that group would only be written to the tuplesort, as the group
>> isn't present in the hashtable anymore.
>>
>
> Ah, so you're suggesting that during the second pass we'd deserialize
> the transition value and then add the tuples to it, instead of building
> a new transition value. Got it.

Having to deserialize every time we add a new tuple sounds terrible from
a performance point of view.

Can't we just:

1. HashAgg until the hash table reaches work_mem.
2. Spill the entire table to disk.
3. Destroy the table and create a new one.
4. If more tuples: goto 1
5. Merge sort and combine each dumped set of tuples.

--
David Rowley                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
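For illustration, here is a minimal stand-alone C sketch of the five-step loop above, counting occurrences of integer keys. It is not PostgreSQL code: MAX_GROUPS stands in for work_mem, each spilled run is a sorted in-memory array standing in for a tuplesort tape, and the final pass merges the runs while combining the partial counts for equal keys. All names and the fixed-size arrays are illustrative assumptions.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

/*
 * Hypothetical sketch of the spill-and-merge loop, counting integer keys.
 * MAX_GROUPS stands in for work_mem; each "run" is a sorted in-memory
 * array standing in for a tuplesort/tape.  Not PostgreSQL code.
 */
#define MAX_GROUPS 4            /* pretend work_mem holds only 4 groups */
#define MAX_RUNS   16

typedef struct { int key; long count; } Group;

static Group table[MAX_GROUPS];
static int   ngroups = 0;
static Group *runs[MAX_RUNS];
static int   runlen[MAX_RUNS];
static int   nruns = 0;

static int cmp_group(const void *a, const void *b)
{
    return ((const Group *) a)->key - ((const Group *) b)->key;
}

/* Steps 2 and 3: dump the whole table as one sorted run, then reset it. */
static void spill_table(void)
{
    Group *run = malloc(sizeof(Group) * (ngroups > 0 ? ngroups : 1));

    memcpy(run, table, sizeof(Group) * ngroups);
    qsort(run, ngroups, sizeof(Group), cmp_group);
    runs[nruns] = run;
    runlen[nruns++] = ngroups;
    ngroups = 0;
}

/* Step 1: aggregate one tuple; spill first if the table is already full. */
static void aggregate_tuple(int key)
{
    for (int i = 0; i < ngroups; i++)
    {
        if (table[i].key == key)    /* linear scan stands in for a hash lookup */
        {
            table[i].count++;
            return;
        }
    }
    if (ngroups == MAX_GROUPS)
        spill_table();
    table[ngroups].key = key;
    table[ngroups++].count = 1;
}

/* Step 5: merge the sorted runs, combining partial results for equal keys. */
static void merge_runs(void)
{
    int pos[MAX_RUNS] = {0};

    for (;;)
    {
        bool found = false;
        int  best = 0;
        long total = 0;

        for (int i = 0; i < nruns; i++)     /* smallest key among run heads */
            if (pos[i] < runlen[i] && (!found || runs[i][pos[i]].key < best))
            {
                best = runs[i][pos[i]].key;
                found = true;
            }
        if (!found)
            break;                          /* all runs exhausted */
        for (int i = 0; i < nruns; i++)     /* combine matching partials */
            if (pos[i] < runlen[i] && runs[i][pos[i]].key == best)
                total += runs[i][pos[i]++].count;
        printf("key %d -> count %ld\n", best, total);
    }
}

int main(void)
{
    int input[] = {5, 1, 5, 9, 3, 7, 1, 5, 8, 2, 9, 5, 3, 3, 7};

    for (size_t i = 0; i < sizeof(input) / sizeof(input[0]); i++)
        aggregate_tuple(input[i]);          /* step 4: keep going until EOF */
    spill_table();                          /* flush whatever is still in memory */
    merge_runs();                           /* second pass over the spilled runs */
    return 0;
}

The key property of the scheme is that a transition value is never deserialized per input tuple; each spilled run already holds combined partial results, and deserialization happens only once per group per run during the final merge.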