Re: cost based vacuum (parallel)
From: Amit Kapila
Subject: Re: cost based vacuum (parallel)
Date:
Msg-id: CAA4eK1+x0PRyG95fhbbSukZP+=im6OYANDJ0KPf74iujLq_F4w@mail.gmail.com
In reply to: Re: cost based vacuum (parallel) (Darafei "Komяpa" Praliaskouski <me@komzpa.net>)
List: pgsql-hackers
On Mon, Nov 4, 2019 at 1:03 PM Darafei "Komяpa" Praliaskouski <me@komzpa.net> wrote:
>>
>> This is somewhat similar to a memory usage problem with a
>> parallel query where each worker is allowed to use up to work_mem of
>> memory. We can say that the users using parallel operation can expect
>> more system resources to be used as they want to get the operation
>> done faster, so we are fine with this. However, I am not sure if that
>> is the right thing, so we should try to come up with some solution for
>> it and if the solution is too complex, then probably we can think of
>> documenting such behavior.
>
> In cloud environments (Amazon + gp2) there's a budget on input/output operations. If you cross it for a long time, everything starts looking like you work with a floppy disk.
>
> For ease of configuration, I would need a "max_vacuum_disk_iops" that would limit the number of input/output operations by all of the vacuums in the system. If I set it to less than the value of the budget refill, I can be sure that no vacuum runs too fast to impact any sibling query.
>
> There's also value in non-throttled VACUUM for smaller tables. On gp2 such things will be consumed out of the surge budget, and its size is known to the sysadmin. Let's call it "max_vacuum_disk_surge_iops" - if a relation has fewer blocks than this value and it's a blocking situation in any way (antiwraparound, interactive console, ...) - go on and run without throttling.
>

I think the need for these things can be addressed by the current cost-based-vacuum parameters. See docs [1]. For example, if you set vacuum_cost_delay to zero, it will allow the operation to be performed without throttling.

> For how to balance the cost: if we know the number of vacuum processes that were running in the previous second, we can just divide the slot for this iteration by that previous number.
>
> To correct for overshoots, we can subtract the previous second's overshoot from the next one's.
> That would also allow to account for surge budget usage and let it refill, pausing all autovacuum after a manual one for some time.
>
> Precision of accounting: limiting the count of operations more than once a second isn't beneficial for this use case.
>

I think it is better if we find a way to rebalance the cost on some worker's exit rather than every second, as anyway it won't change unless a worker exits.

[1] - https://www.postgresql.org/docs/devel/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-VACUUM-COST

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
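[Editorial note: the rebalance-on-worker-exit idea discussed above can be sketched as follows. This is an illustrative Python sketch under assumed names (`CostBalancer`, `per_worker_limit`), not PostgreSQL's actual C implementation: a shared cost limit is split evenly among active workers, and the per-worker share is recomputed only when a worker starts or exits, since it cannot change otherwise.]

```python
# Sketch: divide a shared vacuum I/O cost limit among active workers and
# rebalance only when the set of workers changes (all names hypothetical;
# this is not PostgreSQL's actual implementation).

class CostBalancer:
    def __init__(self, total_cost_limit):
        self.total_cost_limit = total_cost_limit
        self.active_workers = set()

    def per_worker_limit(self):
        # Each active worker gets an equal share of the total budget.
        if not self.active_workers:
            return self.total_cost_limit
        return self.total_cost_limit // len(self.active_workers)

    def worker_start(self, worker_id):
        self.active_workers.add(worker_id)
        return self.per_worker_limit()

    def worker_exit(self, worker_id):
        # Rebalance here, on worker exit, rather than on a timer: the
        # share cannot change unless the set of workers changes.
        self.active_workers.discard(worker_id)
        return self.per_worker_limit()


balancer = CostBalancer(total_cost_limit=200)
balancer.worker_start("w1")
balancer.worker_start("w2")   # each worker now throttles at a share of 100
assert balancer.per_worker_limit() == 100
balancer.worker_exit("w2")    # remaining worker regains the full budget
assert balancer.per_worker_limit() == 200
```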