Re: cost based vacuum (parallel)
From: Amit Kapila
Subject: Re: cost based vacuum (parallel)
Date:
Msg-id: CAA4eK1+x0PRyG95fhbbSukZP+=im6OYANDJ0KPf74iujLq_F4w@mail.gmail.com
In reply to: Re: cost based vacuum (parallel) (Darafei "Komяpa" Praliaskouski <me@komzpa.net>)
List: pgsql-hackers
On Mon, Nov 4, 2019 at 1:03 PM Darafei "Komяpa" Praliaskouski <me@komzpa.net> wrote:
>>
>> This is somewhat similar to a memory usage problem with a
>> parallel query where each worker is allowed to use up to work_mem of
>> memory. We can say that the users using parallel operation can expect
>> more system resources to be used as they want to get the operation
>> done faster, so we are fine with this. However, I am not sure if that
>> is the right thing, so we should try to come up with some solution for
>> it and if the solution is too complex, then probably we can think of
>> documenting such behavior.
>
> In cloud environments (Amazon + gp2) there's a budget on input/output operations. If you cross it for a long time, everything starts looking like you work with a floppy disk.
>
> For ease of configuration, I would need a "max_vacuum_disk_iops" that would limit the number of input/output operations by all of the vacuums in the system. If I set it to less than the value of the budget refill, I can be sure that no vacuum runs too fast to impact any sibling query.
>
> There's also value in non-throttled VACUUM for smaller tables. On gp2 such things will be consumed out of the surge budget, and its size is known to the sysadmin. Let's call it "max_vacuum_disk_surge_iops" - if a relation has fewer blocks than this value and it's a blocking situation in any way (antiwraparound, interactive console, ...) - go on and run without throttling.
>

I think the need for these things can be addressed by the current cost-based-vacuum parameters. See docs [1]. For example, if you set vacuum_cost_delay to zero, it will allow the operation to be performed without throttling.

> For how to balance the cost: if we know the number of vacuum processes that were running in the previous second, we can just divide the slot for this iteration by that previous number.
>
> To correct for overshoots, we can subtract the previous second's overshoot from the next one's.
> That would also allow to account for surge budget usage and let it refill, pausing all autovacuum after a manual one for some time.
>
> Precision of accounting: limiting the count of operations more than once a second isn't beneficial for this use case.
>

I think it is better if we find a way to rebalance the cost on some worker's exit rather than every second, as anyway it won't change unless a worker exits.

[1] - https://www.postgresql.org/docs/devel/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-VACUUM-COST

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
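[Editorial note: the rebalance-on-worker-exit idea discussed above can be sketched as follows. This is an illustrative Python sketch under assumed names (`CostBalancer`, `per_worker_limit`), not PostgreSQL's actual C implementation: a shared cost limit is split evenly among active workers, and the per-worker share is recomputed only when a worker starts or exits, since it cannot change otherwise.]

```python
# Sketch: divide a shared vacuum I/O cost limit among active workers and
# rebalance only when the set of workers changes (all names hypothetical;
# this is not PostgreSQL's actual implementation).

class CostBalancer:
    def __init__(self, total_cost_limit):
        self.total_cost_limit = total_cost_limit
        self.active_workers = set()

    def per_worker_limit(self):
        # Each active worker gets an equal share of the total budget.
        if not self.active_workers:
            return self.total_cost_limit
        return self.total_cost_limit // len(self.active_workers)

    def worker_start(self, worker_id):
        self.active_workers.add(worker_id)
        return self.per_worker_limit()

    def worker_exit(self, worker_id):
        # Rebalance here, on worker exit, rather than on a timer: the
        # share cannot change unless the set of workers changes.
        self.active_workers.discard(worker_id)
        return self.per_worker_limit()


balancer = CostBalancer(total_cost_limit=200)
balancer.worker_start("w1")
balancer.worker_start("w2")   # each worker now throttles at a share of 100
assert balancer.per_worker_limit() == 100
balancer.worker_exit("w2")    # remaining worker regains the full budget
assert balancer.per_worker_limit() == 200
```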