Re: post-recovery amcheck expectations

Поиск
Список
Период
Сортировка
От Peter Geoghegan
Тема Re: post-recovery amcheck expectations
Дата
Msg-id CAH2-WzmQJoLRSN6xTtK6ROyhDEX-m0+z8WJ8hSDtgrc4moVsRg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: post-recovery amcheck expectations  (Noah Misch <noah@leadboat.com>)
Ответы Re: post-recovery amcheck expectations  (Noah Misch <noah@leadboat.com>)
Список pgsql-hackers
On Mon, Oct 23, 2023 at 7:28 PM Noah Misch <noah@leadboat.com> wrote:
> > That makes sense to me. I believe that it's not possible to have a
> > string of consecutive sibling pages that are all half-dead (regardless
> > of the BlockNumber order of sibling pages, even). But I'd probably
> > have written the fix in roughly the same way. Although...maybe you
> > should try to detect a string of half-dead pages? Hard to say if it's
> > worth the trouble.
>
> I imagined a string of half-dead siblings could arise in structure like this:
>
>  *               1
>  *          /    |    \
>  *         4 <-> 2 <-> 3
>
> With events like this:
>
> - DELETE renders blk 4 deletable.
> - Crash with concurrent VACUUM, leaving 4 half-dead after having visited 1-4.
> - DELETE renders blk 2 deletable.
> - Crash with concurrent VACUUM, leaving 2 half-dead after having visited 1-2.
>
> I didn't try to reproduce that, and something may well prevent it.

FWIW a couple of factors prevent it (in the absence of corruption). These are:

1. Only VACUUM can delete pages, and in general the only possible
source of half-dead pages is an unfortunately timed crash/error within
VACUUM. Each interrupted VACUUM can leave behind at most one half-dead
page.

2. One thing that makes VACUUM back out of deleting an empty page is
the presence of a half-dead right sibling leaf page left behind by
some VACUUM that was interrupted at some point in the past -- see
_bt_rightsib_halfdeadflag() for details.

Obviously, factors 1 and 2 together make three consecutive half-dead
sibling pages impossible. I'm not quite prepared to say that even two
neighboring half-dead sibling pages are an impossibility right now,
but I think that it might well be. Possibly for reasons that are more
accidental than anything else (is the _bt_rightsib_halfdeadflag thing
a "real invariant" or just something we do because we don't want to
add additional complicated handling to nbtpage.c?), so I'll avoid
going into further detail for now.

I'm pointing this out because it argues for softening the wording
about "accept[ing] an arbitrarily-long chain of half-dead,
sibling-linked pages to the left" from your patch.

I was also wondering (mostly to myself) about the relationship (if
any) between the _bt_rightsib_halfdeadflag/_bt_leftsib_splitflag
"invariants" and the bt_child_highkey_check() check. But I don't think
that it makes sense to put that in scope -- your fix seems like a
strict improvement. This relationship is perhaps in scope here to the
limited extent that talking about strings of consecutive half-dead
pages might make it even harder to understand the design of the
bt_child_highkey_check() check. On the other hand...I'm not sure that
I understand every nuance of it myself.

> > Suggest adding a CHECK_FOR_INTERRUPTS() call to the loop, too, just
> > for good luck.
>
> Added.  That gave me the idea to check for circular links, like other parts of
> amcheck do.

Good idea.

--
Peter Geoghegan



В списке pgsql-hackers по дате отправления:

Предыдущее
От: shveta malik
Дата:
Сообщение: Re: Synchronizing slots from primary to standby
Следующее
От: "David E. Wheeler"
Дата:
Сообщение: Re: Patch: Improve Boolean Predicate JSON Path Docs