Re: Add bump memory context type and use it for tuplesorts
From | Tomas Vondra |
---|---|
Subject | Re: Add bump memory context type and use it for tuplesorts |
Date | |
Msg-id | 4932ce3f-78f2-4612-b0f2-93d58ac53b84@enterprisedb.com |
In reply to | Re: Add bump memory context type and use it for tuplesorts (David Rowley <dgrowleyml@gmail.com>) |
Responses | Re: Add bump memory context type and use it for tuplesorts |
List | pgsql-hackers |
On 3/12/24 00:40, David Rowley wrote:
> On Tue, 12 Mar 2024 at 12:25, Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>> (b) slab is considerably slower
>
> It would be interesting to modify SlabReset() to, instead of free()ing
> the blocks, push the first SLAB_MAXIMUM_EMPTY_BLOCKS of them onto the
> emptyblocks list.
>
> That might give us an idea of how much overhead comes from malloc/free.
>
> Having something like this as an option when creating a context might
> be a good idea. generation.c now keeps 1 "freeblock" which currently
> does not persist during context resets. Some memory context usages
> might suit having an option like this. Maybe something like the
> executor's per-tuple context, which perhaps (c|sh)ould be a generation
> context... However, saying that, I see you measure it to be slightly
> slower than aset.
>

IIUC you're suggesting it may be a problem that we free the blocks
during context reset, only to allocate them again shortly after, paying
the malloc overhead.

This reminded me of the mempool idea I recently shared in the nearby
"scalability bottlenecks" thread [1]. So I decided to give this a try
and see how it affects this benchmark.

Attached is an updated version of the mempool patch, modifying all the
memory contexts (not just AllocSet), including the bump context. Also
attached is a PDF with results from the two machines, comparing results
without and with the mempool.

There's very little impact on small reset values (128kB, 1MB), but
pretty massive improvements on the 8MB test (where it's a 2x
improvement).

Nevertheless, it does not affect the relative performance very much.
The bump context is still the fastest, but the gap is much smaller.

Considering the mempool serves as a cache in between memory contexts
and glibc, eliminating most of the malloc/free calls, and essentially
keeping the blocks allocated, I doubt slab is slow because of malloc
overhead - at least in the "small" tests (but I haven't looked closer).
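To illustrate the idea being discussed (retaining blocks across resets instead of handing them back to malloc/free), here is a minimal standalone sketch. It is not the mempool patch and not code from slab.c; BlockCache, cache_get_block, cache_put_block, simulate_resets, BLOCK_SIZE and MAX_CACHED_BLOCKS are all hypothetical names chosen for this example, with MAX_CACHED_BLOCKS only loosely playing the role of SLAB_MAXIMUM_EMPTY_BLOCKS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* All names here are hypothetical; this is not PostgreSQL code. */
#define BLOCK_SIZE        (8 * 1024)
#define MAX_CACHED_BLOCKS 10    /* cap on retained empty blocks */

typedef struct Block
{
    struct Block *next;         /* link used only while the block is cached */
} Block;

typedef struct BlockCache
{
    Block  *empty;              /* singly linked list of retained blocks */
    int     nempty;             /* number of blocks on that list */
    long    nmalloc;            /* real malloc() calls, counted for the demo */
} BlockCache;

/* Hand out a block, preferring a retained one over calling malloc(). */
static void *
cache_get_block(BlockCache *cache)
{
    Block  *block = cache->empty;

    if (block != NULL)
    {
        cache->empty = block->next;
        cache->nempty--;
        return block;
    }
    cache->nmalloc++;
    return malloc(BLOCK_SIZE);
}

/* On "reset", retain the block if there is room, otherwise free() it. */
static void
cache_put_block(BlockCache *cache, void *ptr)
{
    Block  *block = ptr;

    if (cache->nempty < MAX_CACHED_BLOCKS)
    {
        block->next = cache->empty;
        cache->empty = block;
        cache->nempty++;
        return;
    }
    free(block);
}

/*
 * Run `cycles` allocate-then-reset cycles using `nblocks` blocks each
 * (at most 64) and return how many times malloc() was actually called.
 */
static long
simulate_resets(int cycles, int nblocks)
{
    BlockCache  cache = {0};
    void       *blocks[64];

    assert(nblocks <= 64);
    for (int c = 0; c < cycles; c++)
    {
        for (int i = 0; i < nblocks; i++)
        {
            blocks[i] = cache_get_block(&cache);
            assert(blocks[i] != NULL);
        }
        for (int i = 0; i < nblocks; i++)
            cache_put_block(&cache, blocks[i]);
    }

    /* Release whatever is still retained before returning. */
    while (cache.empty != NULL)
    {
        Block  *next = cache.empty->next;

        free(cache.empty);
        cache.empty = next;
    }
    return cache.nmalloc;
}
```

With a working set that fits under the cap, only the first cycle pays for malloc; once the working set exceeds the cap, every cycle re-allocates the overflow. That is only meant to show the mechanism, not to say anything about the measured numbers.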
regards

[1] https://www.postgresql.org/message-id/510b887e-c0ce-4a0c-a17a-2c6abb8d9a5c%40enterprisedb.com

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company