Discussion: Introduce XID age based replication slot invalidation
Hi folks,

I'd like to restart the discussion about providing an XID-based slot invalidation mechanism. The previous effort [1] proposed both XID-based and time-based invalidation, and the inactive time-based approach was implemented first. The latest XID-based patch from Bharath Rupireddy can be found here [2].

When thinking about availability of the database, inactive replication slots cause two main pain points:
1) WAL accumulation
2) Replication slots with xmin/catalog_xmin can hold back vacuuming, leading to wraparound

The first issue can be mitigated by 'max_slot_wal_keep_size'. However, in the second case there are no good mechanisms to prioritize write availability of the database and avoid wraparound. The new GUC 'idle_replication_slot_timeout' partially addresses the concern if you have similar workloads. However, it's hard to use the same setting across a fleet of different applications.

It's easy to imagine a high-XID churning workload in one cluster while another has large batch jobs where changes get synced out periodically. There isn't a one-size-fits-all setting for 'idle_replication_slot_timeout' in these two cases.

The attached patch addresses this by introducing 'max_slot_xid_age' in a similar fashion. Replication slots whose transaction ID age exceeds the configured value will get invalidated, allowing vacuum to proceed and biasing towards database availability.

Invalidation happens in CHECKPOINT, similar to 'idle_replication_slot_timeout', and when VACUUM occurs.

The patch currently attempts to invalidate once per autovacuum worker. We're wondering if it should attempt invalidation on a per-relation basis within the vacuum call itself. That would account for scenarios where the cost_delay or naptime is high between autovac executions.

Thanks,
John H

[1] https://www.postgresql.org/message-id/flat/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe%2Baw%40mail.gmail.com
[2] https://www.postgresql.org/message-id/flat/CALj2ACXe8%2BxSNdMXTMaSRWUwX7v61Ad4iddUwnn%3DdjSwx3GLLg%40mail.gmail.com

--
John Hsu - Amazon Web Services
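
Editorial sketch: the age check that 'max_slot_xid_age' implies could look roughly like the C below. This is a minimal illustration under stated assumptions, not the code from the attached patch. The names toy_xid_age and toy_slot_xid_is_aged are hypothetical, treating 0 as "disabled" is an assumption, and the plain modular subtraction ignores details such as the reserved special XIDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Assumption: 0 marks "no XID held", as with an unset xmin. */
#define TOY_INVALID_XID ((TransactionId) 0)

/*
 * Number of XIDs between xid and next_xid. Unsigned 32-bit subtraction
 * wraps modulo 2^32, so the result stays correct when next_xid has
 * wrapped past zero while xid has not.
 */
uint32_t
toy_xid_age(TransactionId next_xid, TransactionId xid)
{
    return next_xid - xid;
}

/*
 * Hypothetical check: a slot counts as aged when either of the XIDs it
 * holds back is older than max_slot_xid_age. A value of 0 is assumed
 * to disable the check.
 */
bool
toy_slot_xid_is_aged(TransactionId next_xid, TransactionId xmin,
                     TransactionId catalog_xmin, int max_slot_xid_age)
{
    assert(max_slot_xid_age >= 0);

    if (max_slot_xid_age == 0)
        return false;

    if (xmin != TOY_INVALID_XID &&
        toy_xid_age(next_xid, xmin) > (uint32_t) max_slot_xid_age)
        return true;

    if (catalog_xmin != TOY_INVALID_XID &&
        toy_xid_age(next_xid, catalog_xmin) > (uint32_t) max_slot_xid_age)
        return true;

    return false;
}
```

Because the threshold is expressed in XIDs rather than seconds, the same value behaves consistently on a high-XID-churn cluster and on a batch-oriented one, which is the motivation given above.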

Dear John,

> The first issue can be mitigated by 'max_slot_wal_keep_size'. However, in the second case there are no good mechanisms to prioritize write availability of the database and avoid wraparound. The new GUC 'idle_replication_slot_timeout' partially addresses the concern if you have similar workloads. However, it's hard to use the same setting across a fleet of different applications.

IIUC, this feature can avoid the wraparound issue more directly than the other invalidation mechanisms. The motivation seems sufficient to me.

> The patch currently attempts to invalidate once per autovacuum worker. We're wondering if it should attempt invalidation on a per-relation basis within the vacuum call itself. That would account for scenarios where the cost_delay or naptime is high between autovac executions.

I have a concern that the age calculation acquires XidGenLock, so performance could be affected. Do you have any insights on this?

> Invalidation happens in CHECKPOINT, similar to 'idle_replication_slot_timeout', and when VACUUM occurs.

Let me confirm, because I'm new to this area. VACUUM can also trigger the invalidation because an old XID makes VACUUM fail, right? The timeout is aimed at WAL, so it is not closely related to VACUUM, which does not recycle segments. In contrast, is it possible for the XID-age check to be done only at VACUUM?

Regarding the patch, try_replication_slot_invalidation() and ReplicationSlotIsXIDAged() do the same task. Can we reduce the duplicated part?

Best regards,
Hayato Kuroda
FUJITSU LIMITED
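
Editorial sketch: one way to bound the locking cost raised above is to read the next XID once per invalidation pass and reuse it for every slot, so the number of lock acquisitions does not grow with the number of slots. The C below is a toy model of that idea only. ToySlot, toy_read_next_xid and toy_count_aged_slots are hypothetical names, the counter merely stands in for taking a lock such as XidGenLock, and nothing here claims to match what the patch does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/* Hypothetical, simplified slot: only the two XIDs that hold back vacuum. */
typedef struct ToySlot
{
    TransactionId xmin;          /* 0 means "not set" in this sketch */
    TransactionId catalog_xmin;  /* 0 means "not set" in this sketch */
} ToySlot;

/* Toy state: counts how often the "locked" read of the next XID happens. */
int           toy_lock_acquisitions = 0;
TransactionId toy_next_xid = 1000;

/* Stand-in for reading the next XID under a lock. */
TransactionId
toy_read_next_xid(void)
{
    toy_lock_acquisitions++;
    return toy_next_xid;
}

/*
 * Count slots whose xmin or catalog_xmin is older than max_age.
 * The next XID is fetched once, before the loop, and reused for all slots.
 */
size_t
toy_count_aged_slots(const ToySlot *slots, size_t nslots, uint32_t max_age)
{
    TransactionId next_xid;
    size_t        aged = 0;

    assert(slots != NULL || nslots == 0);

    next_xid = toy_read_next_xid();   /* one "lock acquisition" per pass */

    for (size_t i = 0; i < nslots; i++)
    {
        const ToySlot *s = &slots[i];

        if ((s->xmin != 0 && (uint32_t) (next_xid - s->xmin) > max_age) ||
            (s->catalog_xmin != 0 &&
             (uint32_t) (next_xid - s->catalog_xmin) > max_age))
            aged++;
    }
    return aged;
}
```

In this model the cost of the locked read scales with how often a pass runs (per worker or per relation), not with the slot count, which is the trade-off the per-relation question turns on.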

Hi Hayato,

Thank you for taking a look.

> > The patch currently attempts to invalidate once per autovacuum worker. We're wondering if it should attempt invalidation on a per-relation basis within the vacuum call itself. That would account for scenarios where the cost_delay or naptime is high between autovac executions.
>
> I have a concern that the age calculation acquires XidGenLock, so performance could be affected. Do you have any insights on this?

Are you concerned about the case where we do the check per table, or about the current behavior where it's done only once per worker?

> > Invalidation happens in CHECKPOINT, similar to 'idle_replication_slot_timeout', and when VACUUM occurs.
>
> Let me confirm, because I'm new to this area. VACUUM can also trigger the invalidation because an old XID makes VACUUM fail, right? The timeout is aimed at WAL, so it is not closely related to VACUUM, which does not recycle segments.

I feel that the timeout is used as a way to roughly address storage accumulation or VACUUM not progressing due to slots.

> In contrast, is it possible for the XID-age check to be done only at VACUUM?

It's also done in CHECKPOINT because there can be stale replication slots on a standby that aren't present on the writer. We would still want them to be invalidated.

> Regarding the patch, try_replication_slot_invalidation() and ReplicationSlotIsXIDAged() do the same task. Can we reduce the duplicated part?

Thanks for catching that; I thought I had done this, but apparently not. Updated in the latest attachment.

--
John Hsu - Amazon Web Services

Hi,

On Thu, Sep 18, 2025 at 10:20 AM John H <johnhyvr@gmail.com> wrote:
>
> I'd like to restart the discussion about providing an XID-based slot invalidation mechanism. The previous effort [1] proposed both XID-based and time-based invalidation, and the inactive time-based approach was implemented first. The latest XID-based patch from Bharath Rupireddy can be found here [2].
>
> When thinking about availability of the database, inactive replication slots cause two main pain points:
> 1) WAL accumulation
> 2) Replication slots with xmin/catalog_xmin can hold back vacuuming, leading to wraparound
>
> It's easy to imagine a high-XID churning workload in one cluster while another has large batch jobs where changes get synced out periodically. There isn't a one-size-fits-all setting for 'idle_replication_slot_timeout' in these two cases.

+1.

> The attached patch addresses this by introducing 'max_slot_xid_age' in a similar fashion. Replication slots whose transaction ID age exceeds the configured value will get invalidated, allowing vacuum to proceed and biasing towards database availability.
>
> Invalidation happens in CHECKPOINT, similar to 'idle_replication_slot_timeout', and when VACUUM occurs.
>
> The patch currently attempts to invalidate once per autovacuum worker. We're wondering if it should attempt invalidation on a per-relation basis within the vacuum call itself. That would account for scenarios where the cost_delay or naptime is high between autovac executions.

IMO, computing XID horizons per-relation during vacuum is good. The main reason we try to invalidate replication slots based on the XID age in the vacuum path is to help the database when it needs it most - when vacuum is computing the XID horizons. That said, it would be good to have performance analysis with a large number of replication slots, comparing once-per-relation vs. once-per-autovacuum worker vs. once-per-autovacuum launcher wake-up cycle.

I haven't looked at the patch in depth, but it would be good to have a TAP test with more realistic production workloads. We could set this value to less than 1.5 billion and use the xid_wraparound test to quickly reach the wraparound limits, then verify whether this setting can help prevent the database from reaching wraparound errors. This approach would also validate the age calculations in try_replication_slot_invalidation with higher limits.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
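
Editorial sketch: the property such a test would verify can be modelled in a few lines of C. This is a toy simulation, not a TAP test and not PostgreSQL code. The function name toy_reaches_stop_limit is hypothetical, XIDs are modelled as 64-bit counters to sidestep wraparound arithmetic, and the stop threshold (2^31 - 1 minus a 3 million XID safety margin) is an assumed illustrative value. The model also assumes that vacuum fully advances the horizon as soon as the slot is invalidated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed age at which the server would stop assigning new XIDs.
 * Illustrative only: 2^31 - 1 minus an assumed 3 million XID margin.
 */
#define TOY_STOP_LIMIT_AGE ((uint64_t) 2147483647 - 3000000)

/*
 * Simulate a stale slot pinning the horizon at XID 0 while the workload
 * consumes XIDs in batches of xids_per_check. After each batch, the slot
 * is invalidated if its age exceeds max_slot_xid_age (0 = disabled, an
 * assumption). Returns true if the horizon's age ever reaches the stop
 * limit before total_xids have been consumed.
 */
bool
toy_reaches_stop_limit(uint64_t max_slot_xid_age, uint64_t xids_per_check,
                       uint64_t total_xids)
{
    const uint64_t slot_xmin = 0;   /* horizon held by the stale slot */
    uint64_t       oldest = 0;      /* oldest XID vacuum must preserve */
    bool           slot_valid = true;

    assert(xids_per_check > 0);

    for (uint64_t next = xids_per_check; next <= total_xids;
         next += xids_per_check)
    {
        /* Invalidation check, as it might run at checkpoint or vacuum. */
        if (slot_valid && max_slot_xid_age > 0 &&
            next - slot_xmin > max_slot_xid_age)
            slot_valid = false;

        /* Once the slot is gone, vacuum can advance the horizon. */
        if (!slot_valid)
            oldest = next;

        if (next - oldest >= TOY_STOP_LIMIT_AGE)
            return true;
    }
    return false;
}
```

With a threshold of 1.5 billion the model invalidates the slot well before the assumed stop limit, while a disabled or overly large threshold lets the horizon age run into it. A real test would need to confirm the same outcome against an actual server.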