Re: Does the work made by vaccum in the current pass is lost when interrupted?

Поиск
Список
Период
Сортировка
От Allan Kamau
Тема Re: Does the work made by vaccum in the current pass is lost when interrupted?
Дата
Msg-id CAF3N6oSuBs58xme8pjGpr=+TqNEB_QTUHW2pNv4n9KX+AbXwMQ@mail.gmail.com
обсуждение исходный текст
Ответ на Does the work made by vaccum in the current pass is lost when interrupted?  (Jean-Marc Lessard <Jean-Marc.Lessard@ultra-ft.com>)
Список pgsql-general


On Thu, Oct 8, 2020 at 10:16 PM Jean-Marc Lessard <Jean-Marc.Lessard@ultra-ft.com> wrote:

I have a large table (billions of records) which has not been vacuum and bloated.

Vacuum scale factor was left at the default.

 

I ran a vacuum on a DEV system and it makes several passes (scanning heap/vacuuming indexes/vacumming heap) which take more than an hour each.

On a PROD system, I may have to kill the job midway.

Should I reduce the autovacuum_work_mem of my session? Currently 1GB

Does the work made in the current pass is lost when interrupted?

Correct me if I am not right, but vacuumed indexes and heap made by the current pass will go to disk, but the heap should be rescan on the next vacuum.

I guess that the visibility map is updated only at the end of each pass.

 

My best regards

 

Thank you

 


Jean-Marc Lessard
Administrateur de base de données / Database Administrator



Ultra Electronics Forensic Technology Inc.
T +1 514 489 4247 x4164
www.ultra-forensictechnology.com

 





A suggestion, considering that you have a table with billions of tuples, you could look into table partitioning, see "https://www.postgresql.org/docs/current/ddl-partitioning.html".

First you may need to identify a criteria by which to partition your data, maybe by record population timestamp binned into yyyymm resolution.


One a test environment, you may construct a table similar to the one currently having the many records but with a different name.

Then write code which will construct the child tables, run the sql to construct the child tables.
If your child tables are based on the record date, you may construct child tables for the timestamps of the record you have as well as several months or timepoints in the future. Maybe have a cron job to construct new tables of future timepoints.

Then write code to populate the new tables directly in piecemeal by the use of WHERE clause with data from the current production table.

You may choose to write the above code to use "COPY" or "INSERT" to populate the specific partition table.


Clone your current application and modify the code such that it inserts directly to the specific child table or leave the writing to the specific child table to be done by the on insert trigger. 

Test the data population via the application to see if the child tables are being populated accordingly.

If all is well. Schedule downtime where you can implement these changes to your production environment.



Allan.

 
Вложения

В списке pgsql-general по дате отправления:

Предыдущее
От: Tony Shelver
Дата:
Сообщение: Re: What's your experience with using Postgres in IoT-contexts?
Следующее
От: Michael Lewis
Дата:
Сообщение: Re: Does the work made by vaccum in the current pass is lost when interrupted?