On Wed, Nov 9, 2022 at 5:46 PM Andres Freund <andres@anarazel.de> wrote:
> > Putting all 3 together: doesn't it seem quite likely that the way that
> > we compute OldestXmin is the factor that prevents "skewering" of an
> > update chain? What else could possibly be preventing corruption here?
> > (Theoretically it might never have been discovered, but that seems
> > pretty hard to believe.)
>
> I don't see how that follows. The existing code is just ok with that.

My remarks about "3 facts we agree on" were not intended to be a
watertight argument. More like: what else could it possibly be that
prevents problems in practice, if not *something* to do with how we
compute OldestXmin?

Leaving aside the specifics of how OldestXmin is computed for a
moment: what alternative explanation is even remotely plausible? There
just aren't that many moving parts involved here. The idea that we can
ever freeze the xmin of a successor tuple/version from an update chain
without also pruning away earlier versions of the same chain is wildly
implausible. It sounds totally contradictory.
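
To make that intuition concrete, here is a toy model. It is not
PostgreSQL code: every name in it (ToyTuple, toy_is_dead, and so on) is
made up for illustration, XIDs are plain integers with no wraparound,
and every updater is assumed to have committed. It only sketches why a
freeze cutoff that is never newer than the pruning horizon cannot leave
a frozen successor behind a surviving predecessor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy XID: a plain integer, no wraparound handling (assumption). */
typedef uint32_t Xid;

/* Hypothetical stand-in for a heap tuple version in an update chain. */
typedef struct ToyTuple
{
    Xid xmin;   /* inserting transaction, assumed committed */
    Xid xmax;   /* updating transaction, assumed committed; 0 = none */
} ToyTuple;

/* A version is prunable once its updater precedes the pruning horizon. */
static bool
toy_is_dead(const ToyTuple *tup, Xid prune_horizon)
{
    return tup->xmax != 0 && tup->xmax < prune_horizon;
}

/* A version's xmin may be frozen once it precedes the freeze cutoff. */
static bool
toy_can_freeze_xmin(const ToyTuple *tup, Xid freeze_cutoff)
{
    return tup->xmin < freeze_cutoff;
}

/*
 * Returns true if some successor could have its xmin frozen while its
 * predecessor in the chain would still survive pruning.
 */
static bool
toy_chain_is_skewered(const ToyTuple *chain, int nchain,
                      Xid prune_horizon, Xid freeze_cutoff)
{
    for (int i = 1; i < nchain; i++)
    {
        /* In an update chain, successor xmin == predecessor xmax. */
        assert(chain[i].xmin == chain[i - 1].xmax);

        if (toy_can_freeze_xmin(&chain[i], freeze_cutoff) &&
            !toy_is_dead(&chain[i - 1], prune_horizon))
            return true;
    }
    return false;
}
```

In this model, whenever freeze_cutoff <= prune_horizon the function
cannot return true, because the successor's xmin is the predecessor's
xmax. It can only return true when the two cutoffs are allowed to
disagree in the other direction.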

> In fact
> we have explicit code trying to exploit this:
>
> /*
>  * If the DEAD tuple is at the end of the chain, the entire chain is
>  * dead and the root line pointer can be marked dead. Otherwise just
>  * redirect the root to the correct chain member.
>  */
> if (i >= nchain)
>     heap_prune_record_dead(prstate, rootoffnum);
> else
>     heap_prune_record_redirect(prstate, rootoffnum, chainitems[i]);

I don't see why this code is relevant.

--
Peter Geoghegan