Re: Failures in constraints regression test, "read only 0 of 8192 bytes"

From Thomas Munro
Subject Re: Failures in constraints regression test, "read only 0 of 8192 bytes"
Date
Msg-id CA+hUKG+XOrCi3UwiK5dNL_B8Eav6hMk334L4Qpctfw4MPDUYaw@mail.gmail.com
In reply to Re: Failures in constraints regression test, "read only 0 of 8192 bytes"  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Failures in constraints regression test, "read only 0 of 8192 bytes"  (Thomas Munro <thomas.munro@gmail.com>)
List pgsql-hackers
On Sun, Mar 10, 2024 at 5:02 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> Thanks, reproduced here (painfully slowly).  Looking...

I changed that ERROR to a PANIC and now I can see that
_bt_metaversion() is failing to read a meta page (block 0), and the
file is indeed of size 0 in my filesystem.  Which is not cool, for a
btree.  Looking at btbuildempty(), we have this sequence:

    bulkstate = smgr_bulk_start_rel(index, INIT_FORKNUM);

    /* Construct metapage. */
    metabuf = smgr_bulk_get_buf(bulkstate);
    _bt_initmetapage((Page) metabuf, P_NONE, 0, allequalimage);
    smgr_bulk_write(bulkstate, BTREE_METAPAGE, metabuf, true);

    smgr_bulk_finish(bulkstate);

Ooh.  One idea would be that the smgr lifetime stuff is b0rked,
introducing corruption.  Bulk write itself isn't pinning the smgr
relation, it's relying purely on the object not being invalidated,
which the theory of 21d9c3ee's commit message allowed for but ... here
it's destroyed (HASH_REMOVE'd) sooner under CLOBBER_CACHE_ALWAYS,
which I think we failed to grok.  If that's it, I'm surprised that
things don't implode more spectacularly.  Perhaps HASH_REMOVE should
clobber objects in debug builds, similar to pfree?
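
To make the "clobber on remove" idea concrete, here is a minimal sketch on a toy fixed-size table. It is not dynahash and not a proposed patch: ToyTable, ToyEntry, toy_insert, toy_find and toy_remove are all hypothetical names made up for illustration. The 0x7F fill byte is chosen to mirror what pfree writes in builds with CLOBBER_FREED_MEMORY, as far as I know. The point is only that a stale pointer held across a removal then reads an obviously bogus pattern instead of plausible old data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy table for illustration; NOT PostgreSQL's dynahash. */
#define TOY_NSLOTS       8
#define TOY_CLOBBER_BYTE 0x7F   /* assumed to match pfree's debug fill byte */

typedef struct ToyEntry
{
    int32_t key;
    int32_t in_use;
    int32_t payload;
} ToyEntry;

typedef struct ToyTable
{
    ToyEntry slots[TOY_NSLOTS];
} ToyTable;

/* Linear scan; returns the live entry for key, or NULL. */
ToyEntry *
toy_find(ToyTable *t, int32_t key)
{
    for (size_t i = 0; i < TOY_NSLOTS; i++)
    {
        if (t->slots[i].in_use && t->slots[i].key == key)
            return &t->slots[i];
    }
    return NULL;
}

/* Stores (key, payload) in the first free slot; NULL if the table is full. */
ToyEntry *
toy_insert(ToyTable *t, int32_t key, int32_t payload)
{
    for (size_t i = 0; i < TOY_NSLOTS; i++)
    {
        if (!t->slots[i].in_use)
        {
            t->slots[i].key = key;
            t->slots[i].payload = payload;
            t->slots[i].in_use = 1;
            return &t->slots[i];
        }
    }
    return NULL;
}

/* Removes key; returns 1 if it was present, 0 otherwise. */
int
toy_remove(ToyTable *t, int32_t key)
{
    ToyEntry *e = toy_find(t, key);

    if (e == NULL)
        return 0;
#ifndef NDEBUG
    /* Debug builds only: overwrite the dead entry so stale readers notice. */
    memset(e, TOY_CLOBBER_BYTE, sizeof(*e));
#endif
    e->in_use = 0;
    return 1;
}
```

In a debug build, a caller that kept the pointer returned by toy_insert and reads e->payload after toy_remove sees 0x7F7F7F7F rather than the old value, which is the kind of loud failure the question above is asking for.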

For that hypothesis, the corruption might not be happening in the
above-quoted code itself, because it doesn't seem to have an
invalidation acceptance point (unless I'm missing it).  Some other
bulk write got mixed up?  Not sure yet.

I won't be surprised if the answer is: if you're holding a reference,
you have to get a pin (referring to bulk_write.c).
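
The "holding a reference means holding a pin" rule can be sketched with a toy reference count. This is an illustration under stated assumptions, not the smgr or bulk_write.c code: ToyRel, toy_pin, toy_unpin and toy_invalidate are hypothetical names, and "destroyed" is just a flag standing in for the object being removed from its hash table.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cached relation object; NOT the smgr API. */
typedef struct ToyRel
{
    int  pincount;   /* holders that need the object to stay alive */
    bool destroyed;  /* set once an invalidation has torn the object down */
} ToyRel;

/* Take a pin before stashing a pointer that must survive invalidations. */
void
toy_pin(ToyRel *rel)
{
    assert(!rel->destroyed);
    rel->pincount++;
}

/* Drop the pin once the stashed pointer is no longer needed. */
void
toy_unpin(ToyRel *rel)
{
    assert(rel->pincount > 0);
    rel->pincount--;
}

/*
 * Simulated invalidation: only unpinned objects are destroyed.
 * Returns true if the object was destroyed, false if a pin protected it.
 */
bool
toy_invalidate(ToyRel *rel)
{
    if (rel->pincount > 0)
        return false;
    rel->destroyed = true;
    return true;
}
```

A long-lived holder, such as a bulk write state in this analogy, would call toy_pin when it starts and toy_unpin when it finishes. Any invalidation arriving in between leaves the object intact, while an unpinned object is fair game.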


