Commit b07642dbcd ("Trigger autovacuum based on number of INSERTs")
taught autovacuum to run in response to INSERTs, which is now
typically the dominant factor that drives vacuuming for an append-only
table -- a very useful feature, certainly.
This is driven by the following logic from autovacuum.c:
vacinsthresh = (float4) vac_ins_base_thresh +
vac_ins_scale_factor * reltuples;
...
/* Determine if this table needs vacuum or analyze. */
*dovacuum = force_vacuum || (vactuples > vacthresh) ||
(vac_ins_base_thresh >= 0 && instuples > vacinsthresh);
I can see why the nearby, similar vacthresh and anlthresh variables
(not shown here) are scaled based on pg_class.reltuples -- that makes
sense. But why should we follow that example here, with vacinsthresh?
Both VACUUM and ANALYZE update pg_class.reltuples. But this code seems
to assume that it's only something that VACUUM can ever do. Why
wouldn't we expect a plain ANALYZE to have actually been the last
thing to update pg_class.reltuples for an append-only table? Wouldn't
that lead to less frequent (perhaps infinitely less frequent)
vacuuming for an append-only table, relative to the documented
behavior of autovacuum_vacuum_insert_scale_factor?
--
Peter Geoghegan