Обсуждение: vacuum and table locking
Hi all, I have a question regarding vacuum and locking, but since I currently don't have access to my servers (maintenance activities),I can't verify on my own. I know that VACUUM has at least four variants. OK, actually two with ability to add ANALYZE after the operation. Here we go: VACUUM (VACUUM ANALYZE) - shouldn't need to lock the table(?). I'm looking at the Douglas' book "PostgreSQL" on page 804where it claims the operation does not require exclusive access. VACUUM FULL (VACUUM FULL ANALYZE) - will need to lock the table, but I'm thinking this may only take a couple of millisecondsas I think this builds a temporary table from the one being vacuumed and then when complete, drops the old table and renames the temporary. So the lock should only be during thismove, or does it lock the table for the entirety? Is this correct? And how does it then resolve say another deletion in the table that occurs while the VACUUM is in operation?I hope that is clearly expressed. -- Jay
this may only take a couple of milliseconds as I think this builds a temporary table from the one
being vacuumed and then when complete, drops the old table and renames the temporary.
I think pg_repack does that, and hence can remove bloat without locking the whole table. https://github.com/reorg/pg_repack
On Fri, Apr 3, 2015 at 11:59 AM, John Scalia <jayknowsunix@gmail.com> wrote:
Hi all,
I have a question regarding vacuum and locking, but since I currently don't have access to my servers (maintenance activities), I can't verify on my own. I know that VACUUM has at
least four variants. OK, actually two with ability to add ANALYZE after the operation. Here we go:
VACUUM (VACUUM ANALYZE) - shouldn't need to lock the table(?). I'm looking at the Douglas' book "PostgreSQL" on page 804 where it claims the operation does not require exclusive
access.
VACUUM FULL (VACUUM FULL ANALYZE) - will need to lock the table, but I'm thinking this may only take a couple of milliseconds as I think this builds a temporary table from the one
being vacuumed and then when complete, drops the old table and renames the temporary. So the lock should only be during this move, or does it lock the table for the entirety?
Is this correct? And how does it then resolve say another deletion in the table that occurs while the VACUUM is in operation? I hope that is clearly expressed.
--
Jay
--
Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin
On 04/03/2015 08:59 AM, John Scalia wrote: > Hi all, > > I have a question regarding vacuum and locking, but since I currently don't have access to my servers (maintenance activities),I can't verify on my own. I know that VACUUM has at > least four variants. OK, actually two with ability to add ANALYZE after the operation. Here we go: Start here: http://www.postgresql.org/docs/9.4/static/sql-vacuum.html > > VACUUM (VACUUM ANALYZE) - shouldn't need to lock the table(?). I'm looking at the Douglas' book "PostgreSQL" on page 804where it claims the operation does not require exclusive > access. True, but the increase in I/O *can* impact overall server operation. Note that the typical default install includes the autovacuum which will automatically run the vacuum process when necessary. Vacuuming is a critical part of maintenance for a number of reasons and not running vacuum is definitely discouraged. Vacuum essentially marks released space within the files containing the data as reusable. Vacuum does not free up disk space. > > VACUUM FULL (VACUUM FULL ANALYZE) - will need to lock the table, but I'm thinking this may only take a couple of millisecondsas I think this builds a temporary table from the one > being vacuumed and then when complete, drops the old table and renames the temporary. So the lock should only be duringthis move, or does it lock the table for the entirety? No, it locks for the duration which can potentially be a very long time as it is physically reshuffling the data to shrink the on-disk space usage. This is typically reserved for situations that cause excessive table bloat such as full-table updates or deletes of large portions of the table *and* when there is insufficient disk-space to use an alternative method of compacting. Look also at CLUSTER (http://www.postgresql.org/docs/9.4/static/sql-cluster.html). Unlike VACUUM FULL, CLUSTER *does* write the data to a new disk files but is far faster than VACUUM and can reorder the table data to match an index which can improve the performance of certain queries. CLUSTER requires sufficient available disk-space to write the new copy of the table and associate indexes which can be anywhere from the size of your original table/indexes if it is already packed to far less if the table is heavily bloated. N.B. The operation of CLUSTER and VACUUM have changed from version to version so read the docs for your version. For example, CLUSTER used to require an index and would reorder using that index. This requirement has been dropped in the latest version. CLUSTER also requires and exclusive lock so... Check out the pg_reorg extension. This extension does create a new table on the fly while tracking updates. It then applies any updates that have occurred while creating the new copy and finally swaps the new version in place of the old. It creates I/O but doesn't require the long-duration locks of the other methods. > > Is this correct? And how does it then resolve say another deletion in the table that occurs while the VACUUM is in operation?I hope that is clearly expressed. > Vacuum is just a maintenance operation. Changes to the table will be dealt with in the next VACUUM. Note, while you can run VACUUM ANALYZE together, VACUUM and ANALYZE do different things and can be run independently. While VACUUM deals with space usage and a few other issues, ANALYZE updates the table statistics used by the planner. Cheers, Steve
On 04/03/2015 09:34 AM, Steve Crawford wrote: > > Check out the pg_reorg extension. This extension does create a new > table on the fly while tracking updates. It then applies any updates > that have occurred while creating the new copy and finally swaps the > new version in place of the old. It creates I/O but doesn't require > the long-duration locks of the other methods. Actually I should have also said check out pg_repack, the fork of pg_reorg. In either case also do some reading on potential pitfalls: http://adpgtech.blogspot.com/2014/10/pgrepack-pitfalls.html Cheers, Steve
John Scalia <jayknowsunix@gmail.com> wrote: > VACUUM (VACUUM ANALYZE) - shouldn't need to lock the table(?). > I'm looking at the Douglas' book "PostgreSQL" on page 804 where > it claims the operation does not require exclusive access. Without the FULL keyword, it only requires a SHARE UPDATE EXCLUSIVE lock, which does not conflict with ordinary DML. http://www.postgresql.org/docs/9.2/interactive/explicit-locking.html#LOCKING-TABLES > VACUUM FULL (VACUUM FULL ANALYZE) - will need to lock the table, Yes, in this case with an ACCESS EXCLUSIVE lock, which conflicts with all other locks. > but I'm thinking this may only take a couple of milliseconds as I > think this builds a temporary table from the one > being vacuumed and then when complete, drops the old table and > renames the temporary. So the lock should only be during this > move, or does it lock the table for the entirety? If it only locked while after the data was scanned, how could it deal with DML affecting portions it had already scanned? If that were the only problem it could take a SHARE lock, which would allow concurrent reads, but then we would have the issue of escalating the lock to swap in the new heap and indexes -- and lock escalation carries a risk of deadlock. > And how does it then resolve say another deletion in the table > that occurs while the VACUUM is in operation? A VACUUM FULL could not cope with that, which is why it needs the ACCESS EXCLUSIVE lock. A non-FULL VACUUM is OK because of visibility checking and/or (depending on relation type) a special form of page locking used just for cleanups. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Apr 3, 2015, at 9:59 AM, John Scalia <jayknowsunix@gmail.com> wrote: > > So the lock should only be during this move, or does it lock the table for the entirety? The entirety. It doesn't know how to track changes that occur during the copying to the new table and apply them after, soit locks it the whole time. (Just went through this myself...) -- Scott Ribe scott_ribe@elevated-dev.com http://www.elevated-dev.com/ https://www.linkedin.com/in/scottribe/ (303) 722-0567 voice
<p dir="ltr">If you use the --no-table-lock flag from the contrib stuff, you can alleviate that...<br /><div class="cm_quote"style=" color: #787878">On Fri, Apr 03, 2015 at 3:27 PM, Scott Ribe <<a href="mailto:scott_ribe@elevated-dev.com">scott_ribe@elevated-dev.com</a>>wrote:</div><br /><div id="oldcontent" style="background-color:rgb(255, 255, 255); background-position: initial initial; background-repeat: initial initial;"><blockquotestyle=""><p dir="ltr">On Apr 3, 2015, at 9:59 AM, John Scalia <jayknowsunix@gmail.com> wrote:<br /> > <br /> > So the lock should only be during this move, or does it lock the table for the entirety? <br/><br /> The entirety. It doesn't know how to track changes that occur during the copying to the new table and apply themafter, so it locks it the whole time. (Just went through this myself...) <br /><br /> -- <br /> Scott Ribe <br /> scott_ribe@elevated-dev.com<br /> http://www.elevated-dev.com/ <br /> https://www.linkedin.com/in/scottribe/ <br /> (303)722-0567 voice <br /><br /><br /><br /><br /><br /><br /><br /> -- <br /> Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org)<br /> To make changes to your subscription: <br /> http://www.postgresql.org/mailpref/pgsql-admin<br /></blockquote></div>
On Fri, Apr 3, 2015 at 1:34 PM, Steve Crawford <scrawford@pinpointresearch.com> wrote:
Look also at CLUSTER (http://www.postgresql.org/docs/9.4/static/sql-cluster.html). Unlike VACUUM FULL, CLUSTER *does* write the data to a new disk files but is far faster than VACUUM and can reorder the table data to match an index which can improve the performance of certain queries. CLUSTER requires sufficient available disk-space to write the new copy of the table and associate indexes which can be anywhere from the size of your original table/indexes if it is already packed to far less if the table is heavily bloated.
N.B. The operation of CLUSTER and VACUUM have changed from version to version so read the docs for your version.
This information seems a little off. It was true that VACUUM FULL did not rewrite the table, but that has been changed since version 9.0 [1]:
"New implementation of VACUUM FULL. This command now rewrites the entire table and indexes, rather than moving individual rows to compact space. It is substantially faster in most cases, and no longer results in index bloat."
"New implementation of VACUUM FULL. This command now rewrites the entire table and indexes, rather than moving individual rows to compact space. It is substantially faster in most cases, and no longer results in index bloat."
So, it was common to use CLUSTER instead of VACUUM FULL before 9.0, but since then CLUSTER is needed only if you really want to enforce the the order.
For example, CLUSTER used to require an index and would reorder using that index. This requirement has been dropped in the latest version.
No, at least up to 9.4 (the latest version today), CLUSTER still requires an index, the command itself does not, but only in case that there is a previously clustered index for the table.
Regards,
--
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres