Re: amcheck's verify_heapam(), and HOT chain verification
От | Mark Dilger |
---|---|
Тема | Re: amcheck's verify_heapam(), and HOT chain verification |
Дата | |
Msg-id | 4A85CA3D-DCDD-494C-9D79-99883610AF50@enterprisedb.com обсуждение исходный текст |
Ответ на | amcheck's verify_heapam(), and HOT chain verification (Peter Geoghegan <pg@bowt.ie>) |
Ответы |
Re: amcheck's verify_heapam(), and HOT chain verification
(Peter Geoghegan <pg@bowt.ie>)
|
Список | pgsql-hackers |
> On Nov 5, 2021, at 7:51 PM, Peter Geoghegan <pg@bowt.ie> wrote: > > I recently pushed a commit which formalized what we expect of HOT > chains during pruning -- see assertions and comments added by commit > 5cd7eb1f, "Add various assertions to heap pruning code". I notice that > verify_heapam() doesn't really do anything with HOT chains, and I'd > say that that's a gap. Fortunately it should not be terribly difficult > to fill in this gap. > > The short version is this: I think that verify_heapam() should > validate whole HOT chains, not just individual heap-only tuples. This > can be accomplished by taking a similar approach to the approach taken > in heap_prune_chain() already, for pruning. > > For a given heap page, pruning fills in its PruneState variable to > build a coherent picture of the state of all HOT chains on the entire > page, before modifying anything (barring hint bit setting). We only > reach heap_page_prune_execute() when we have the whole picture about > HOT chains. This high level approach is essential when pruning. > Consider LP_REDIRECT items. We might need to set an existing > LP_REDIRECT item to LP_DEAD, rendering an entire HOT chain dead. > Whether or not that's the appropriate course of action for a given > LP_REDIRECT item will depend on the state of the HOT chain as a whole. > > The HOT chain could of course have several heap-only tuples, and only > one visible tuple means that the LP_REDIRECT root item must not be set > LP_DEAD -- it should be made to link to the remaining live heap-only > tuple instead (while the dead intermediate heap-only tuples become > LP_UNUSED, never LP_DEAD). This is made even more tricky by the fact > that we have to consider HOT updaters with aborted transactions during > pruning. > > We even have to consider the case where a HOT updater aborts, but > another HOT updater tries again and commits. Now we have a heap-only > tuple that is "disconnected from its original HOT chain" (or will be > until the next prune). Having a stray/disconnected heap-only tuple > like this ought to be fine -- the next pruning operation will get to > it whenever (see the special handling for these aborted heap-only > tuples at the top of heap_prune_chain()). But any disconnected > heap-only tuple had better be from an aborted xact -- otherwise we > have corruption. This is corruption that looks like index corruption, > because sequential scans still give correct answers (I've seen this > several times before, ask me how). But it's not index corruption, > contrary to appearances -- it's heap corruption. > > Here are some specific checks I have in mind: > > * Right now, we throw an error when an LP_REDIRECT points to an > LP_UNUSED item. But that existing test should go further, like the > assertions I added recently. They should make sure that the > LP_REDIRECT links directly to an LP_NORMAL item, which has tuple > storage. And, the stored tuple must be a heap-only tuple. > > Note, in particular, that it's not okay if the LP_REDIRECT points to > an LP_DEAD item (which amcheck won't complain about today). That is at > least my position on it, and so far it looks like the right position > (no assertion failures reported yet). > > * HOT chain validation: All visible/committed heap-only tuples should > be reachable through a "recognizable root item for their HOT chain, or > an earlier member of the HOT chain via its t_ctid link". The root item > can either be an LP_REDIRECT, or a regular not-heap-only tuple -- but > it can't be anything else. > > * All other heap-only tuples (all heap-only tuples that don't seem to > be part of a valid HOT chain) must be from aborted transactions -- as > I said, this has to be corruption (heap corruption disguised as index > corruption, even). > > Note that we're very permissive about broken HOT chains -- in general > a heap-only tuple's t_ctid link can have pretty wild values, and > that's okay. For example, it can point past the current bounds of the > page's line pointer array. Everything is still robust because of the > way that we validate HOT chains during chain traversal -- we match > tuple N's xmax to tuple N+1's xmin, all the time, inside code like > heap_hot_search_buffer() that is routinely used during index scans. > However, the root item of a HOT chain (usually an LP_REDIRECT) > absolutely must point to a valid heap-only tuple at all times -- > that's very different indeed. That's the starting point for an index > scan -- it is the TID that is stored in indexes that needs to be > stable over time. And so pruning needs to leave behind coherent > looking whole entire HOT chains, just as we ought to verify that > pruning consistently gets it right. Thanks, Peter, for this analysis. It's getting a bit late in the day, but I'll try this out tomorrow, I expect. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Peter GeogheganДата:
Сообщение: amcheck's verify_heapam(), and HOT chain verification
Следующее
От: Pavel StehuleДата:
Сообщение: Re: Schema variables - new implementation for Postgres 15