Re: Adjusting hash join memory limit to handle batch explosion
From | Tomas Vondra
Subject | Re: Adjusting hash join memory limit to handle batch explosion
Date |
Msg-id | 36af890e-464d-41ae-88a8-d06317f36a98@vondra.me
In reply to | Re: Adjusting hash join memory limit to handle batch explosion (Melanie Plageman <melanieplageman@gmail.com>)
List | pgsql-hackers
On 1/9/25 17:17, Melanie Plageman wrote:
> On Tue, Dec 31, 2024 at 6:07 PM Tomas Vondra <tomas@vondra.me> wrote:
>>
>> So I decided to revisit the three patches from 2019. Attached are
>> rebased and cleaned up versions. A couple of comments on each one:
>>
>>
>> 1) v20241231-adjust-limit-0001-Account-for-batch-files-in-ha.patch
>>
>> I believe this is the way to go, for now. The basic idea is to keep the
>> overall behavior, but "relax" the memory limit as the number of batches
>> increases, to minimize the total memory use.
>>
>> This may seem a bit weird, but as the number of batches grows there's
>> no way not to violate the limit. And the current code simply ignores
>> this and allocates arbitrary amounts of memory.
>
> I'm just catching up on this thread and haven't read all the mails
> yet. I started by looking at the patches in the first email and got
> a bit confused.
>
> In this patch (v20241231-adjust-limit-0001-Account-for-batch-files-in-ha.patch),
> I see that you've started accounting for the spill files in
> hashtable->spaceUsed -- in the same way that is done for the tuples in
> the hashtable. I know the other memory contexts (hashCxt and batchCxt)
> in hashtable aren't appropriate for figuring out spaceUsed, but I was
> wondering if the hashtable->spillCxt accurately reflects how much
> memory is being used for these spill files at one time? Perhaps it
> doesn't make sense to use this, but when we added it in 8c4040edf45, I
> thought we might one day be able to use it for determining peak space
> usage. Or perhaps you are imagining it only being used for
> observability?

Good question. Yes, the patch from 12/31 does look at all the memory,
including the batch files. I thought about using the spillCxt too, but I
don't think it would work, because the context tracks *current* memory
usage, and we're interested in how much memory would be used *after*
doubling the number of batches.

regards

--
Tomas Vondra