Re: Does the work made by vaccum in the current pass is lost when interrupted?

Поиск

Список

Период

Сортировка

От	Allan Kamau
Тема	Re: Does the work made by vaccum in the current pass is lost when interrupted?
Дата	9 октября 2020 г. 11:31:43
Msg-id	CAF3N6oSuBs58xme8pjGpr=+TqNEB_QTUHW2pNv4n9KX+AbXwMQ@mail.gmail.com обсуждение исходный текст
Ответ на	Does the work made by vaccum in the current pass is lost when interrupted? (Jean-Marc Lessard <Jean-Marc.Lessard@ultra-ft.com>)
Список	pgsql-general

Дерево обсуждения

On Thu, Oct 8, 2020 at 10:16 PM Jean-Marc Lessard <Jean-Marc.Lessard@ultra-ft.com> wrote:

I have a large table (billions of records) which has not been vacuum and bloated.
Vacuum scale factor was left at the default.

I ran a vacuum on a DEV system and it makes several passes (scanning heap/vacuuming indexes/vacumming heap) which take more than an hour each.
On a PROD system, I may have to kill the job midway.
Should I reduce the autovacuum_work_mem of my session? Currently 1GB
Does the work made in the current pass is lost when interrupted?
Correct me if I am not right, but vacuumed indexes and heap made by the current pass will go to disk, but the heap should be rescan on the next vacuum.
I guess that the visibility map is updated only at the end of each pass.

My best regards

Thank you

Jean-Marc Lessard
Administrateur de base de données / Database Administrator

Ultra Electronics Forensic Technology Inc.
T +1 514 489 4247 x4164
www.ultra-forensictechnology.com

A suggestion, considering that you have a table with billions of tuples, you could look into table partitioning, see "https://www.postgresql.org/docs/current/ddl-partitioning.html".

First you may need to identify a criteria by which to partition your data, maybe by record population timestamp binned into yyyymm resolution.

One a test environment, you may construct a table similar to the one currently having the many records but with a different name.

Then write code which will construct the child tables, run the sql to construct the child tables.

If your child tables are based on the record date, you may construct child tables for the timestamps of the record you have as well as several months or timepoints in the future. Maybe have a cron job to construct new tables of future timepoints.

Then write code to populate the new tables directly in piecemeal by the use of WHERE clause with data from the current production table.

You may choose to write the above code to use "COPY" or "INSERT" to populate the specific partition table.

Clone your current application and modify the code such that it inserts directly to the specific child table or leave the writing to the specific child table to be done by the on insert trigger.

Test the data population via the application to see if the child tables are being populated accordingly.

If all is well. Schedule downtime where you can implement these changes to your production environment.

Allan.

Вложения

image001.jpg

В списке pgsql-general по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Does the work made by vaccum in the current pass is lost when interrupted?

Вложения