Re: Long-running and non-finishing VACUUM ANALYZE on large table
От | Jan |
---|---|
Тема | Re: Long-running and non-finishing VACUUM ANALYZE on large table |
Дата | |
Msg-id | 5614649D.6050906@j.mk-contact.de обсуждение исходный текст |
Ответ на | Re: Long-running and non-finishing VACUUM ANALYZE on large table (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-admin |
On 10/07/2015 01:53 AM, Tom Lane wrote: > Jan <pgsql.admin@j.mk-contact.de> writes: >> I'm still not getting the math behind it. The below tuple stats show a >> dead tuple count of 63,187,655 whereas the PGadmin output (see my >> initial e-mail) reported the message "scanned index protein_hsps_pkey to >> remove 178956753 row versions" nine times before I cancelled it. That >> is, if one multiplies 178,956,753 by 9 it yields 1,610,610,777 (dead) >> rows. But the latter number is much higher than the above 63m rows? Do I >> compare the wrong numbers? > There's something awfully wacky about that. I suspect that pgstattuple > is somehow giving wrong answers, but I don't see any plausible theory > as to why. > >> Some more background: the whole database was recently migrated to a new >> database server and thus restored from a dump file. That is, the table >> 'protein_hsps' and its index were build from scratch. Since then, the >> only operations on that table were some column type conversions (e.g., >> integer to smallint, double to real). Data-wise, the only operations >> were UPDATES on a single boolean column by adding precalculated values >> (true/false) to EACH row in the database (~ 16bn rows). These UPDATEs >> were most likely the cause for the (huge) number of dead tuples (0.32%, >> see above), weren't they? > Such an UPDATE should have left the table 50% dead tuples, since every > row would leave behind a dead version. On the other hand, an ALTER > COLUMN TYPE operation should rewrite the whole table and leave no dead > tuples behind. No matter which one you did last, it doesn't square with > 0.32% dead tuples. > > My best guess at this point is that what you did last is an UPDATE, > so you have 50% dead tuples, and for some reason pgstattuple is not > telling you the truth about that. But the VACUUM is showing reality. Exactly, I altered the column types first. (That had been on my list for a long time, and I used the planned database server downtime for these optimizations.) After that, back in production mode, the aforementioned UPDATEs were applied (not at once, but in batches of rows). That is, I also assume 50% dead tuples right now. > How long did those UPDATEs and ALTER TABLEs take? If an ALTER seemed > tolerable then maybe what you want to do is VACUUM FULL, which would > be roughly the same cost. I protocolled the exact execution times of each ALTER statement (unfortunately not available right now because I'm at home) and these took roundabout 1-2 days each. Now I will go for the VACUUM FULL and will report back here once it's done. *fingers crossed* Many thanks for your time! Kind regards, Jan
В списке pgsql-admin по дате отправления: