RE: Recommendation to run vacuum FULL in parallel
От | Kevin Brannen |
---|---|
Тема | RE: Recommendation to run vacuum FULL in parallel |
Дата | |
Msg-id | MN2PR19MB2575026251A1DF43BEDC5066A4500@MN2PR19MB2575.namprd19.prod.outlook.com обсуждение исходный текст |
Ответ на | Re: Recommendation to run vacuum FULL in parallel (Perumal Raj <perucinci@gmail.com>) |
Ответы |
Re: Recommendation to run vacuum FULL in parallel
|
Список | pgsql-general |
From: Perumal Raj <perucinci@gmail.com>
So conclude the requirement here , The only way to parallelism is multiple script. And no need to do REINDEX exclusively.
Question : Do we need to consider Table dependencies while preparing script in order to avoid table locks during vacuum full ?
We have a small bash script (see below) that get the list of tables and their sizes, sorted smallest to largest, and do “vacuum full” one at a time because (as someone else pointed out) this is very I/O intensive. That order also helps to ensure we finish because some of our installs are at the edge of running out of space (an issue we’re dealing with). I probably wouldn’t have a problem doing 2 at a time, but we do this in the middle of the night when activity is lowest and it only takes 1-2 hours, so we’re good with it. It sounds like you have a lot more data though.
You might also consider putting the data into different tablespaces which are spread over multiple disks to help I/O. If you can, use SSD drives, they help with speed quite a bit. 😊
Don’t worry about table dependencies. This is a physical operation, not a data operation.
HTH,
Kevin
$PGPATH/psql -t -c "
WITH s AS (SELECT nspname || '.' || relname AS TABLE_NAME, pg_total_relation_size(c.oid) AS total_bytes
FROM pg_class c
LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE relkind = 'r' AND nspname NOT IN ( 'pg_catalog', 'information_schema' )
ORDER BY 2 )
SELECT table_name FROM s
" |
while read t ; do echo "" ; echo $t; $PGPATH/vacuumdb -w -z -f -t $t ; done
###
В списке pgsql-general по дате отправления: