Обсуждение: vacuumlo
Hi All,
I need to run the vacuumlo command against our production database.
Being a PostgresQL database utility, it should be 100% safe to run and should not delete/drop active data.
I have run it in our QA environment with success, but now they are having a few application issues and I have told them that the issues cannot be related to the vacuumlo utility as it is a PostgresQL utility.
Has anyone had issues before running this utility?
Regards
Ian
Disclaimer
The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.
This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast, a leader in email security and cyber resilience. Mimecast integrates email defenses with brand protection, security awareness training, web security, compliance and other essential capabilities. Mimecast helps protect large and small organizations from malicious activity, human error and technology failure; and to lead the movement toward building a more resilient world. To find out more, visit our website.
Hi, On Tue, Aug 17, 2021 at 7:52 PM Ian Dauncey <Ian.Dauncey@bankzero.co.za> wrote: > > I need to run the vacuumlo command against our production database. > > Being a PostgresQL database utility, it should be 100% safe to run and should not delete/drop active data. It's safe as long as you're aware of what this tool is doing. As mentioned in https://www.postgresql.org/docs/current/vacuumlo.html: > vacuumlo is a simple utility program that will remove any “orphaned” large objects from a PostgreSQL database. An orphanedlarge object (LO) is considered to be any LO whose OID does not appear in any oid or lo data column of the database. So: > I have run it in our QA environment with success, but now they are having a few application issues and I have told themthat the issues cannot be related to the vacuumlo utility as it is a PostgresQL utility. The most likely explanation is that your database somehow has large object that are not referenced in an "oid" or "lo" column. If that's the case, vacuumlo will delete some of your data, as you didn't you your part of the contract required to use that tool, which is to properly reference large objects reference.
Hi,
Just an update on my vacuumlo issue.
I did run the vacuumlo against the pg_largeobject table without any issues but afterwards I ran a vacuum full against this table which caused lots of issues.
Because the vacuum full takes an exclusive lock (which was my first mistake as I did not stop the applications accessing the database) on the table I had all the applications hanging. The next issue was it started writing out WAL logs and in the end the file system which housed the Wal logs filled up causing the vacuum to fail.
Now the issue I have here is that the vacuum full created a temporary table , and when it crashed this temporary table did not get deleted. I did rerun the vacuum full against the pg_largeobject table (and yes, I did stop all the applications first). It did complete successfully but it did not drop the previous temporary table. This table is taking close to 100 Gig of disk space.
If I backup and restore the database onto a different server this temporary table does not get restored.
My question here is.
- How do I get rid of this temporary table without a backup and restore as this is our Prod system?
- Is there a way of finding out the name of this temp table and matching it up to files on disk?
Any help will be appreciated
Regards
Ian.
From: Julien Rouhaud <rjuju123@gmail.com>
Sent: Tuesday, 17 August 2021 14:18
To: Ian Dauncey <Ian.Dauncey@bankzero.co.za>
Cc: pgsql-admin@lists.postgresql.org
Subject: Re: vacuumlo
External email - treat with caution |
Hi,
On Tue, Aug 17, 2021 at 7:52 PM Ian Dauncey <Ian.Dauncey@bankzero.co.za> wrote:
>
> I need to run the vacuumlo command against our production database.
>
> Being a PostgresQL database utility, it should be 100% safe to run and should not delete/drop active data.
It's safe as long as you're aware of what this tool is doing. As
mentioned in https://www.postgresql.org/docs/current/vacuumlo.html
> vacuumlo is a simple utility program that will remove any “orphaned” large objects from a PostgreSQL database. An orphaned large object (LO) is considered to be any LO whose OID does not appear in any oid or lo data column of the database.
So:
> I have run it in our QA environment with success, but now they are having a few application issues and I have told them that the issues cannot be related to the vacuumlo utility as it is a PostgresQL utility.
The most likely explanation is that your database somehow has large
object that are not referenced in an "oid" or "lo" column. If that's
the case, vacuumlo will delete some of your data, as you didn't you
your part of the contract required to use that tool, which is to
properly reference large objects reference.
Disclaimer
The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.
This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast, a leader in email security and cyber resilience. Mimecast integrates email defenses with brand protection, security awareness training, web security, compliance and other essential capabilities. Mimecast helps protect large and small organizations from malicious activity, human error and technology failure; and to lead the movement toward building a more resilient world. To find out more, visit our website.
On Mon, 2021-08-30 at 15:08 +0000, Ian Dauncey wrote: > I did run the vacuumlo against the pg_largeobject table without any issues but afterwards > I ran a vacuum full against this table which caused lots of issues. > Because the vacuum full takes an exclusive lock (which was my first mistake as I > did not stop the applications accessing the database) on the table I had all the > applications hanging. Right. > The next issue was it started writing out WAL logs and in the end the file system > which housed the Wal logs filled up causing the vacuum to fail. > Now the issue I have here is that the vacuum full created a temporary table , and when > it crashed this temporary table did not get deleted. I did rerun the vacuum full against > the pg_largeobject table (and yes, I did stop all the applications first). > It did complete successfully but it did not drop the previous temporary table. > This table is taking close to 100 Gig of disk space. VACUUM (FULL) will write a compact copy of the table, then delete the old one, so it temporarily needs more space. It may be a good idea to put "pg_wal" into a different file system, so that WAL space cannot run out even if the data files fill the file system. Then PostgreSQL won't crash, and the files will be removed. > If I backup and restore the database onto a different server this temporary table does not get restored. > My question here is. > 1. How do I get rid of this temporary table without a backup and restore as this is our Prod system? You cannot :^( If you know PostgreSQL well, and you know what you are doing, you might be able to identify those files and manually delete them. But that's definitely not something for the casual user: delete the wrong file, and you have to restore the backup. > 2. Is there a way of finding out the name of this temp table and matching it up to files on disk? No, unfortunately not. You'd have to identify the files that do *not* belong to any table, and that is tricky. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
That can be verified by checking RELFILENODE of the table. VACUUM FULL changes RELFILENODE:
mgogala=# select relfilenode from pg_class where relname='emp';
relfilenode
-------------
52178
(1 row)
mgogala=# vacuum full emp;
VACUUM
mgogala=# select relfilenode from pg_class where relname='emp';
relfilenode
-------------
52182
(1 row)
According to unsubstantiated rumors from https://www.postgresql.org/docs/13/catalog-pg-class.html, the definition of RELFILENODE column is as follows:
relfilenode oid
Name of the on-disk file of this relation; zero means this is a “mapped” relation whose disk file name is determined by low-level state
Change of the RELFILENODE means that the relation has got a new file. And that means that vacuum full rewrites the table, just as Laurenz has said. Note that OID itself is NOT changed.
VACUUM (FULL) will write a compact copy of the table, then delete the old one, so it temporarily needs more space. It may be a good idea to put "pg_wal" into a different file system, so that WAL space cannot run out even if the data files fill the file system. Then PostgreSQL won't crash, and the files will be removed.
-- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
On Mon, 2021-08-30 at 13:38 -0400, Mladen Gogala wrote: > According to unsubstantiated rumors from https://www.postgresql.org/docs/13/catalog-pg-class.html, > the definition of RELFILENODE column is as follows: > > relfilenode oid > > Name of the on-disk file of this relation; zero means this is a “mapped” relation whose disk > file name is determined by low-level state > Change of the RELFILENODE means that the relation has got a new file. And that means that vacuum full > rewrites the table, just as Laurenz has said. Note that OID itself is NOT changed. Please don't top-post on these lists. I am not sure what you mean with "unsubstantiated". Note that this will be 0 for certain crucial system tables (essentially those that have to be accessed *before* regular metadata queries can be executed). So it is better to use the "pg_relation_filenode()" function to find the file that belongs to a certain table. You can use that to find which files do *not* belong to any table, index, sequence or materialized view and then delete those. But I would think twice before manually deleting files in the data directory. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
On Monday, August 30, 2021, Ian Dauncey <Ian.Dauncey@bankzero.co.za> wrote:
Hi,
Just an update on my vacuumlo issue.
I did run the vacuumlo against the pg_largeobject table without any issues but afterwards I ran a vacuum full against this table which caused lots of issues.
Because the vacuum full takes an exclusive lock (which was my first mistake as I did not stop the applications accessing the database) on the table I had all the applications hanging. The next issue was it started writing out WAL logs and in the end the file system which housed the Wal logs filled up causing the vacuum to fail.
Now the issue I have here is that the vacuum full created a temporary table , and when it crashed this temporary table did not get deleted. I did rerun the vacuum full against the pg_largeobject table (and yes, I did stop all the applications first). It did complete successfully but it did not drop the previous temporary table. This table is taking close to 100 Gig of disk space.
If I backup and restore the database onto a different server this temporary table does not get restored.
My question here is.
- How do I get rid of this temporary table without a backup and restore as this is our Prod system?
- Is there a way of finding out the name of this temp table and matching it up to files on disk?
Any help will be appreciated
Regards
Ian.
From: Julien Rouhaud <rjuju123@gmail.com>
Sent: Tuesday, 17 August 2021 14:18
To: Ian Dauncey <Ian.Dauncey@bankzero.co.za>
Cc: pgsql-admin@lists.postgresql.org
Subject: Re: vacuumlo
External email - treat with caution
Hi,
On Tue, Aug 17, 2021 at 7:52 PM Ian Dauncey <Ian.Dauncey@bankzero.co.za> wrote:
>
> I need to run the vacuumlo command against our production database.
>
> Being a PostgresQL database utility, it should be 100% safe to run and should not delete/drop active data.
It's safe as long as you're aware of what this tool is doing. As
mentioned in https://www.postgresql.org/docs/current/vacuumlo.html
> vacuumlo is a simple utility program that will remove any “orphaned” large objects from a PostgreSQL database. An orphaned large object (LO) is considered to be any LO whose OID does not appear in any oid or lo data column of the database.
So:
> I have run it in our QA environment with success, but now they are having a few application issues and I have told them that the issues cannot be related to the vacuumlo utility as it is a PostgresQL utility.
The most likely explanation is that your database somehow has large
object that are not referenced in an "oid" or "lo" column. If that's
the case, vacuumlo will delete some of your data, as you didn't you
your part of the contract required to use that tool, which is to
properly reference large objects reference.Disclaimer
The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.
This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast, a leader in email security and cyber resilience. Mimecast integrates email defenses with brand protection, security awareness training, web security, compliance and other essential capabilities. Mimecast helps protect large and small organizations from malicious activity, human error and technology failure; and to lead the movement toward building a more resilient world. To find out more, visit our website.
On Tue, 2021-08-31 at 01:11 +0530, Atul Kumar wrote: > You may restart the postgres services. That temp file will be deleted automatically then. No - orphaned files left behind from a crash will not be removed during a restart. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
On Tuesday, August 31, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Tue, 2021-08-31 at 01:11 +0530, Atul Kumar wrote:
> You may restart the postgres services. That temp file will be deleted automatically then.
No - orphaned files left behind from a crash will not be removed during a restart.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
On Tue, 2021-08-31 at 11:15 +0530, Atul Kumar wrote: > On Tuesday, August 31, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > > On Tue, 2021-08-31 at 01:11 +0530, Atul Kumar wrote: > > You may restart the postgres services. That temp file will be deleted automatically then. > > > > No - orphaned files left behind from a crash will not be removed during a restart. > > Ok, but after restart it is safe to remove such files. It is always save to remove those files. The problem is to identify them. Remove the wrong file, and your cluster is toast. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
On Tuesday, August 31, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Tue, 2021-08-31 at 11:15 +0530, Atul Kumar wrote:
> On Tuesday, August 31, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> > On Tue, 2021-08-31 at 01:11 +0530, Atul Kumar wrote:
> > You may restart the postgres services. That temp file will be deleted automatically then.
> >
> > No - orphaned files left behind from a crash will not be removed during a restart.
>
> Ok, but after restart it is safe to remove such files.
It is always save to remove those files.
The problem is to identify them.
Remove the wrong file, and your cluster is toast.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
Hi,Don’t you think it as a drawback of postgres that such temporary files don’t get deleted automatically once the transaction gets committed or rolled back .Could you suggest a good permanent solution for such case.RegardsAtul
On Tuesday, August 31, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:On Tue, 2021-08-31 at 11:15 +0530, Atul Kumar wrote:
> On Tuesday, August 31, 2021, Laurenz Albe <laurenz.albe@cybertec.at> wrote:
>
> > On Tue, 2021-08-31 at 01:11 +0530, Atul Kumar wrote:
> > You may restart the postgres services. That temp file will be deleted automatically then.
> >
> > No - orphaned files left behind from a crash will not be removed during a restart.
>
> Ok, but after restart it is safe to remove such files.
It is always save to remove those files.
The problem is to identify them.
Remove the wrong file, and your cluster is toast.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
Hi,Don’t you think it as a drawback of postgres that such temporary files don’t get deleted automatically once the transaction gets committed or rolled back .
Could you suggest a good permanent solution for such case.
--
Morning.
Thanks for all the replies.
What I did to remove these files was to backup of the DB, drop the DB and then I restored the DB.
Regards
Ian
From: Guillaume Lelarge <guillaume@lelarge.info>
Sent: Thursday, 02 September 2021 09:33
To: Atul Kumar <akumar14871@gmail.com>
Cc: Laurenz Albe <laurenz.albe@cybertec.at>; Ian Dauncey <Ian.Dauncey@bankzero.co.za>; pgsql-admin@lists.postgresql.org
Subject: Re: vacuumlo
External email - treat with caution |
Le jeu. 2 sept. 2021 à 07:11, Atul Kumar <akumar14871@gmail.com> a écrit :
Hi,
Don’t you think it as a drawback of postgres that such temporary files don’t get deleted automatically once the transaction gets committed or rolled back .
They are automatically removed in case of a proper commit or rollback. They aren't in case of a crash.
Could you suggest a good permanent solution for such case.
You should try https://github.com/bdrouvot/pg_orphaned to find the list of orphaned files. It does a pretty good job at it.
--
Guillaume.
Disclaimer
The information contained in this communication from the sender is confidential. It is intended solely for use by the recipient and others authorized to receive it. If you are not the recipient, you are hereby notified that any disclosure, copying, distribution or taking action in relation of the contents of this information is strictly prohibited and may be unlawful.
This email has been scanned for viruses and malware, and may have been automatically archived by Mimecast, a leader in email security and cyber resilience. Mimecast integrates email defenses with brand protection, security awareness training, web security, compliance and other essential capabilities. Mimecast helps protect large and small organizations from malicious activity, human error and technology failure; and to lead the movement toward building a more resilient world. To find out more, visit our website.