Block level parallel vacuum WIP
From | Masahiko Sawada |
---|---|
Subject | Block level parallel vacuum WIP |
Date | |
Msg-id | CAD21AoD1xAqp4zK-Vi1cuY3feq2oO8HcpJiz32UDUfe0BE31Xw@mail.gmail.com |
Replies | Re: Block level parallel vacuum WIP (4 replies) |
List | pgsql-hackers |
Hi all,

I'd like to propose block level parallel VACUUM. This feature lets VACUUM use multiple CPU cores.

Vacuum Processing Logic
===================

PostgreSQL VACUUM processing logic consists of 2 phases:

1. Collecting dead tuple locations on the heap.
2. Reclaiming dead tuples from the heap and indexes.

Phases 1 and 2 are executed alternately: once the amount of dead tuple locations collected in phase 1 reaches maintenance_work_mem, phase 2 is executed.

Basic Design
==========

As a PoC, I implemented parallel vacuum so that each worker processes both phases 1 and 2 for a particular block range. Suppose we vacuum a 1000-block table with 4 workers: each worker processes 250 consecutive blocks in phase 1 and then reclaims dead tuples from the heap and indexes (phase 2). To use the visibility map efficiently, each worker scans a particular block range of the relation and collects dead tuple locations. After each worker has finished its task, the leader process gathers the vacuum statistics and updates relfrozenxid if possible.

I also changed the buffer lock infrastructure so that multiple processes can wait for a cleanup lock on a buffer. The new GUC parameter vacuum_parallel_workers controls the number of vacuum workers.

Performance (PoC)
=========

I ran parallel vacuum on a 13GB table (pgbench scale 1000) with several workers (on my poor virtual machine). The results are:

1. Vacuum whole table without index (disable page skipping)
  1 worker  : 33 sec
  2 workers : 27 sec
  3 workers : 23 sec
  4 workers : 22 sec

2. Vacuum table and index (after 10000 transactions executed)
  1 worker  : 12 sec
  2 workers : 49 sec
  3 workers : 54 sec
  4 workers : 53 sec

In my test, since multiple processes could frequently try to acquire the cleanup lock on the same index buffer, the execution time of parallel vacuum got worse when indexes were involved. So far it seems to be effective only for table vacuum, and even there it did not improve as much as expected (maybe a disk bottleneck).
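To make the block-range split from Basic Design concrete, here is a minimal Python sketch. It is not taken from the attached patch: the names split_block_ranges and vacuum_range are hypothetical, dead tuples are modelled as a plain per-block count, and max_dead_tuples is an assumed stand-in for the maintenance_work_mem limit. It only simulates how each worker would alternate phase 1 and phase 2 over its own consecutive range.

```python
from typing import List, Sequence, Tuple


def split_block_ranges(nblocks: int, nworkers: int) -> List[Tuple[int, int]]:
    """Split blocks [0, nblocks) into nworkers consecutive half-open ranges."""
    if nworkers < 1:
        raise ValueError("nworkers must be at least 1")
    base, extra = divmod(nblocks, nworkers)
    ranges = []
    start = 0
    for worker in range(nworkers):
        # The first `extra` workers each take one additional block.
        length = base + (1 if worker < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


def vacuum_range(dead_per_block: Sequence[int], start: int, end: int,
                 max_dead_tuples: int) -> Tuple[int, int]:
    """Simulate one worker over blocks [start, end).

    Returns (number of phase-2 passes, total dead tuples reclaimed).
    max_dead_tuples is an assumed stand-in for maintenance_work_mem.
    """
    collected = 0
    passes = 0
    reclaimed = 0
    for blk in range(start, end):
        # Phase 1: collect dead tuple locations from this heap block.
        collected += dead_per_block[blk]
        if collected >= max_dead_tuples:
            # Phase 2: reclaim what has been collected, then resume phase 1.
            reclaimed += collected
            collected = 0
            passes += 1
    if collected > 0:
        # Final phase 2 for whatever is left at the end of the range.
        reclaimed += collected
        passes += 1
    return passes, reclaimed


if __name__ == "__main__":
    ranges = split_block_ranges(1000, 4)
    print(ranges)
    dead = [2] * 1000  # assumed: 2 dead tuples in every block
    for start, end in ranges:
        print((start, end), vacuum_range(dead, start, end, max_dead_tuples=100))
```

With 1000 blocks and 4 workers this yields four ranges of 250 consecutive blocks, matching the example above. Because every worker in this design runs its own phase 2, every worker also touches the indexes.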
Another Design
============

ISTM that processing index vacuum with multiple processes is not a good idea in most cases, because many index items can be stored in a page and multiple vacuum workers could try to acquire the cleanup lock on the same index buffer. It would be better for multiple workers to each process a particular block range of the heap, and then for one worker per index to process index vacuum.

There is still lots of work to do, but I have attached the PoC patch. Feedback and suggestions are very welcome.

Regards,

--
Masahiko Sawada