Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue
| От | Heikki Linnakangas | 
|---|---|
| Тема | Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue | 
| Дата | |
| Msg-id | 10f4af2c-611b-4640-9aba-097b07aa3c2b@iki.fi обсуждение исходный текст  | 
		
| Ответ на | Re: LISTEN/NOTIFY bug: VACUUM sets frozenxid past a xid in async queue ("Joel Jacobson" <joel@compiler.org>) | 
| Список | pgsql-hackers | 
On 29/10/2025 14:30, Joel Jacobson wrote: > On Wed, Oct 29, 2025, at 12:40, Heikki Linnakangas wrote: >> On 25/10/2025 21:01, Joel Jacobson wrote: >>> I also want to share a new idea that Heikki Linnakangas came up with >>> during PgConf25 in Riga. >>> >>> The idea is basically to scan the notification queue from tail to head, >>> and update xids that precedes newFrozenXid, to either >>> FrozenTransactionId if committed, or InvalidTransactionId if not. >>> >>> I've implemented this by adding a new extern function >>> AsyncNotifyFreezeXids to async.h, called from vac_update_datfrozenxid. >>> >>> This way, we don't need to hold back the advancement of newFrozenXid in >>> vac_update_datfrozenxid. >>> >>> Attempt of implementing Heikki's idea: >>> - async_notify_freeze_xids.txt >> >> Thanks! > > Thanks for the idea! For the record, Yura suggested the same approach earlier in this thread [1]. That line of discussion led to more complicated patches but I think the complications came from trying to set the flag as part of commit/abort. The changes are more straightforward and backpatchable if we only do it at vacuum. >>> This idea has the benefit of never holding back newFrozenXid, >>> however, I see some problems with it: >>> >>> - If there is a bug in the code, we could accidentally overwrite >>> xid values we didn't intend to overwrite, and maybe we would never >>> find out that we did. >>> >>> - We wouldn't know for sure that we've understood the cause of >>> this bug. >>> >>> With v12-vacuum_notify_queue_cleanup, we instead have the downside of >>> possibly holding back newFrozenXid, but with the advantage of giving us >>> higher confidence in that we've correctly identified the cause of the >>> bug. >> >> Meh. Robustness is good and all, and in heap tuples we don't overwrite >> the xmin/xmax but just set a FROZEN flag, precisely so that we preserve >> evidence if something goes wrong. But I can't get too worked up about >> that for the async notification queue. > > Having slept on this for some days, I'm less worried about this > approach. I like the simplicity of it, and that we don't bolt on complexity > to another subsystem, just for the sake of improved debugging > capabilities of a different subsystem. > > I think a different way of looking at this, is that we wouldn't conceal > a bug in async, but rather we would let vacuum do part of its job, of > checking the commit status of the xids, when needed, and be okay with > that split responsibility. Ok, at quick glance I think this patch is in pretty good shape. I'll review it more thoroughly, and see if I can incorporate the test from Matheus Alcantara's or Arseniy Mukhin's latest patches, and commit and backpatch this. - Heikki
В списке pgsql-hackers по дате отправления: