Обсуждение: vacuum_defer_cleanup_age inconsistently applied on replicas
Hi, GetOldestXmin() applies vacuum_defer_cleanup_age only when !RecoveryInProgress(). In contrast to that GetSnapshotData() applies it unconditionally. I'm not actually clear whether including vacuum_defer_cleanup_age on a replica is meaningful. But it strikes me as odd to have that behavioural difference between GetOldestXmin() and GetSnapshotData() - without any need, as far as I can tell? The difference seems to have been introduced in commit bca8b7f16a3e720794cb0afbdb3733be4f8d9c2c Author: Simon Riggs <simon@2ndQuadrant.com> Date: 2011-02-16 19:29:37 +0000 Hot Standby feedback for avoidance of cleanup conflicts on standby. Standby optionally sends back information about oldestXmin of queries which is then checked and applied to the WALSender's proc->xmin. GetOldestXmin() is modified slightly to agree with GetSnapshotData(), so that all backends on primary include WALSender within their snapshots. Note this does nothing to change the snapshot xmin on either master or standby. Feedback piggybacks on the standby reply message. vacuum_defer_cleanup_age is no longer used on standby, though parameter still exists on primary, since some use cases still exist. Simon Riggs, review comments from Fujii Masao, Heikki Linnakangas, Robert Haas without, as far as I can tell, explaining why "vacuum_defer_cleanup_age is no longer used on standby" shouldn't also apply to GetSnapshotData(). I suspect it doesn't hurt all that much to unnecessarily apply vacuum_defer_cleanup_age on a replica. The only thing I see where it matters is that it makes get_actual_variable_endpoint() less accurate, which we probably would like to avoid... Greetings, Andres Freund
On Fri, Apr 3, 2020 at 3:53 PM Andres Freund <andres@anarazel.de> wrote: > GetOldestXmin() applies vacuum_defer_cleanup_age only when > !RecoveryInProgress(). In contrast to that GetSnapshotData() applies it > unconditionally. > > I'm not actually clear whether including vacuum_defer_cleanup_age on a > replica is meaningful. But it strikes me as odd to have that behavioural > difference between GetOldestXmin() and GetSnapshotData() - without any > need, as far as I can tell? Did you notice the comments added by Tom in b4a0223d008, which repeat the claim that it isn't used on standbys? I think that this is probably just an oversight in bca8b7f1, as you suggested. It's not that hard to imagine how this oversight might have happened: Hot standby feedback was introduced, and nobody cared about vacuum_defer_cleanup_age anymore. It was always very difficult to tune. OTOH, I wonder if it's possible that vacuum_defer_cleanup_age was deliberately intended to affect the behavior of XLogWalRcvSendHSFeedback(), which is probably one of the most common reasons why GetOldestXmin() is called on standbys. -- Peter Geoghegan
On Fri, Apr 3, 2020 at 4:18 PM Peter Geoghegan <pg@bowt.ie> wrote: > OTOH, I wonder if it's possible that vacuum_defer_cleanup_age was > deliberately intended to affect the behavior of > XLogWalRcvSendHSFeedback(), which is probably one of the most common > reasons why GetOldestXmin() is called on standbys. Pressed "send" too soon. vacuum_defer_cleanup_age *doesn't* get applied when recovery is in progress, so that definitely can't be true. Another hint that vacuum_defer_cleanup_age is only really supposed to be used on the primary is the fact that it appears under "18.6.1. Master Server" in the 9.1 docs. -- Peter Geoghegan