Обсуждение: pgsql: Add VACUUM instrumentation for scanned pages, relfrozenxid.
Add VACUUM instrumentation for scanned pages, relfrozenxid. Report on scanned pages within VACUUM VERBOSE and autovacuum logging. These are pages that were physically examined during the VACUUM operation. Note that this can include a small number of pages that were marked all-visible in the visibility map by some earlier VACUUM operation. VACUUM won't skip all-visible pages that aren't part of a range of all-visible pages that's at least 32 blocks in length (partly to avoid missing out on opportunities to advance relfrozenxid during non-aggressive VACUUMs). Commit 44fa8488 simplified the definition of scanned pages. It became the complement of the pages (of those pages from rel_pages) that were skipped using the visibility map. And so scanned pages precisely indicates how effective the visibility map was at saving work. (Before now we displayed the number of pages skipped via the visibility map when happened to be frozen pages, but not when they were merely all-visible, which was less useful to users.) Rename the user-visible OldestXmin output field to "removal cutoff", and show some supplementary information: how far behind the cutoff is (number of XIDs behind) by the time the VACUUM operation finished. This will help users to figure out what's _not_ working in extreme cases where VACUUM is fundamentally unable to remove dead tuples or freeze older tuples (e.g., due to a leaked replication slot). Also report when relfrozenxid is advanced by VACUUM in output that immediately follows "removal cutoff". This structure is intended to highlight the relationship between the new relfrozenxid value for the table, and the VACUUM operation's removal cutoff. Finally, add instrumentation of "missed dead tuples", and the number of pages that had at least one such tuple. These are fully DEAD (not just RECENTLY_DEAD) tuples with storage that could not be pruned due to failure to acquire a cleanup lock on a heap page. This is a replacement for the "skipped due to pin" instrumentation removed by commit 44fa8488. It shows more details than before for pages where failing to get a cleanup lock actually resulted in VACUUM missing out on useful work, but usually shows nothing at all instead (the mere fact that we couldn't get a cleanup lock is usually of no consequence whatsoever now). Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAH2-Wznp=c=Opj8Z7RMR3G=ec3_JfGYMN_YvmCEjoPCHzWbx0g@mail.gmail.com Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/872770fd6ccf12596b9e26234df9a37cae083af2 Modified Files -------------- src/backend/access/heap/vacuumlazy.c | 97 +++++++++++++++++++++++++----------- src/backend/commands/analyze.c | 3 ++ src/backend/commands/vacuum.c | 9 ++++ src/include/commands/vacuum.h | 2 + 4 files changed, 82 insertions(+), 29 deletions(-)
On 12.02.22 01:49, Peter Geoghegan wrote: > Add VACUUM instrumentation for scanned pages, relfrozenxid. This patch introduced the log messages "new relfrozenxid: %u, which is %d XIDs ahead of previous value\n" "new relminmxid: %u, which is %d MXIDs ahead of previous value\n" According to the surrounding math in each case, this means that the new %u is larger by %d than the previous value. The use of "ahead of" had confused me here, since I needed to think three times about which direction is meant. I suppose it's technically correct, but perhaps there is a clearer term?
On Wed, Sep 14, 2022 at 1:20 PM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > According to the surrounding math in each case, this means that the new > %u is larger by %d than the previous value. > > The use of "ahead of" had confused me here, since I needed to think > three times about which direction is meant. I suppose it's technically > correct, but perhaps there is a clearer term? I don't know. There are several things to consider here. Or at least several things that I've considered, that seem important. These are: * We kinda need to "hide" the original relfrozenxid from the user (just show the new one) by just showing a delta. After all, we might needlessly confuse users if we ended up showing an original relfrozenxid that has been advanced to a more recent value that is ostensibly (though not actually) far older than the original. Most users won't quite understand that XIDs use modulo-2^32 arithmetic. It isn't actually important that they understand the details, so I really would rather not go there. * The delta is very useful for comparing successive VACUUM operations for the same table (usually a very large table) over time. The dimension of time is very important -- how things change for the table (this is true for everything, not just relfrozenxid stuff). The XIDs themselves aren't terribly meaningful over time (they get recycled), but the XID-age-wise deltas really can be compared to each other easily and sensibly. The variance of the "advanced by this many XIDs" delta over time might be very interesting. I expect that this will become much more important in Postgres 16. It's much easier for VACUUM to find a way to keep up over time if it is given breathing room to freeze when it's easiest/cheapest. We'll still have to insist on keeping relfrozenxid recent at some point, but doing so prematurely seems very counterproductive to me. * The delta thing also highlights the relationship between "removable cutoff" and its own XID age at the end of the VACUUM, which is itself a delta. The maximum final relfrozenxid that can ever be set by VACUUM is also "removable cutoff". ISTM that there is value in hinting at that relationship through how we present the information. It's something that advanced users have a good chance of picking up on if they see lots of examples of. -- Peter Geoghegan