Обсуждение: autovacuum for large periodic deletes
Hi
Every night, a delete of about 50 million rows occurs on a table. Do we need to explicitly vacuum the table or will autovacuum handle the large deletes without affecting performance. I am trying to determine whether or not I should completely rely on autovacuum or also tinker with manual vacuums.
Sriram
Sriram Dandapani wrote: > Every night, a delete of about 50 million rows occurs on a table. Do we > need to explicitly vacuum the table or will autovacuum handle the large > deletes without affecting performance. I am trying to determine whether > or not I should completely rely on autovacuum or also tinker with manual > vacuums. How many rows stay in the table after the massive deletion? It may be that it's best for you to issue a CLUSTER or VACUUM FULL instead of VACUUM. (In any case, autovacuum would vacuum the table eventually, but maybe you can do better.) -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
About 5-10 million rows stay after deletion. There are a few other tables where the daily deletion totals about 3-6 million. Would a vacuum full/cluster affect other operations. These tables have a 24x7 high data insertion rate. -----Original Message----- From: Alvaro Herrera [mailto:alvherre@commandprompt.com] Sent: Wednesday, May 17, 2006 1:36 PM To: Sriram Dandapani Cc: pgsql-admin@postgresql.org Subject: Re: [ADMIN] autovacuum for large periodic deletes Sriram Dandapani wrote: > Every night, a delete of about 50 million rows occurs on a table. Do we > need to explicitly vacuum the table or will autovacuum handle the large > deletes without affecting performance. I am trying to determine whether > or not I should completely rely on autovacuum or also tinker with manual > vacuums. How many rows stay in the table after the massive deletion? It may be that it's best for you to issue a CLUSTER or VACUUM FULL instead of VACUUM. (In any case, autovacuum would vacuum the table eventually, but maybe you can do better.) -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
sdandapani@counterpane.com ("Sriram Dandapani") writes: > Every night, a delete of about 50 million rows occurs on a table. Do > we need to explicitly vacuum the table or will autovacuum handle the > large deletes without affecting performance. I am trying to > determine whether or not I should completely rely on autovacuum or > also tinker with manual vacuums.:p> I'd be inclined to submit a vacuum on the table as part of the process. Moreover, I'd be inclined to consider this situation to perhaps be something of a bug. Deleting enormous amounts of data from a table, on a regular basis, is not something I would expect would work out well. -- (reverse (concatenate 'string "moc.enworbbc" "@" "enworbbc")) http://cbbrowne.com/info/postgresql.html We are MICROS~1. You will be assimilated. Resistance is futile. (Attributed to B.G., Gill Bates)
On Wed, May 17, 2006 at 01:39:35PM -0700, Sriram Dandapani wrote: > About 5-10 million rows stay after deletion. There are a few other > tables where the daily deletion totals about 3-6 million. The default autovac settings will only vacuum a table after 40% of the rows are dead. That seems pretty high to me, so I regularly recommend cutting all the thresholds and scale factors in half. > Would a vacuum full/cluster affect other operations. These tables have a > 24x7 high data insertion rate. VACUUM FULL and CLUSTER both aquire exclusive locks. BTW, it sounds like table partitioning might be useful for you. It would allow you to setup a partition for each day, and then drop one day's worth of data very quickly. -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
sdandapani@counterpane.com ("Sriram Dandapani") writes: > About 5-10 million rows stay after deletion. There are a few other > tables where the daily deletion totals about 3-6 million. It would appear there is something fairly not-sane about the process, then. You delete about 90% of the day's data from the table each day? That's a *lot*, and you are quite likely to have trouble with this table blowing out the Free Space Map as a result. > Would a vacuum full/cluster affect other operations. These tables > have a 24x7 high data insertion rate. Yes, VACUUM FULL and CLUSTER would block other operations while they run. The real Right Answer probably involves having data flow into some sort of "queue" table, created fresh each day, for that day's activities, where, at the end of the day, all of the data either gets purged or moved to the "final destination" table, so that a new table can be created, the next day. -- "cbbrowne","@","cbbrowne.com" http://cbbrowne.com/info/unix.html CBS News report on Fort Worth tornado damage: "Eight major downtown buildings were severely damaged and 1,000 homes were damaged, with 95 uninhabitable. Gov. George W. Bush declared Tarrant County a disaster area. Federal Emergency Management Agency workers are expected to arrive sometime next week after required paperwork is completed."
Looks like partitioning may seem the way to go. I have been hesitant about using partitioning as the feature is very new. How do I manage the free space map if I do not use partitioning? Thanks for your inputs Sriram -----Original Message----- From: pgsql-admin-owner@postgresql.org [mailto:pgsql-admin-owner@postgresql.org] On Behalf Of Chris Browne Sent: Wednesday, May 17, 2006 2:48 PM To: pgsql-admin@postgresql.org Subject: Re: [ADMIN] autovacuum for large periodic deletes sdandapani@counterpane.com ("Sriram Dandapani") writes: > About 5-10 million rows stay after deletion. There are a few other > tables where the daily deletion totals about 3-6 million. It would appear there is something fairly not-sane about the process, then. You delete about 90% of the day's data from the table each day? That's a *lot*, and you are quite likely to have trouble with this table blowing out the Free Space Map as a result. > Would a vacuum full/cluster affect other operations. These tables > have a 24x7 high data insertion rate. Yes, VACUUM FULL and CLUSTER would block other operations while they run. The real Right Answer probably involves having data flow into some sort of "queue" table, created fresh each day, for that day's activities, where, at the end of the day, all of the data either gets purged or moved to the "final destination" table, so that a new table can be created, the next day. -- "cbbrowne","@","cbbrowne.com" http://cbbrowne.com/info/unix.html CBS News report on Fort Worth tornado damage: "Eight major downtown buildings were severely damaged and 1,000 homes were damaged, with 95 uninhabitable. Gov. George W. Bush declared Tarrant County a disaster area. Federal Emergency Management Agency workers are expected to arrive sometime next week after required paperwork is completed." ---------------------------(end of broadcast)--------------------------- TIP 6: explain analyze is your friend
On Wed, May 17, 2006 at 03:18:45PM -0700, Sriram Dandapani wrote: > Looks like partitioning may seem the way to go. I have been hesitant > about using partitioning as the feature is very new. How do I manage the > free space map if I do not use partitioning? Make it very large. Or read the detailed description at http://lnk.nu/pervasivepostgres.com/9hg.aspx -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461