Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()

From	Andres Freund
Subject	Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()
Date	April 15, 2024, 17:39:13
Msg-id	20240415173913.4zyyrwaftujxthf2@awork3.anarazel.de
In reply to	Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune() (Noah Misch <noah@leadboat.com>)
Responses	Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()
List	pgsql-bugs

Hi,

I've tried a couple of times to catch up with this thread, but always kinda
felt I must be missing something. This might be one part of the confusion:

On 2024-01-06 12:24:13 -0800, Noah Misch wrote:
> Fair enough.  While I agree there's a decent chance back-patching would be
> okay, I think there's also a decent chance that 1ccc1e05ae creates the problem
> Matthias theorized.  Something like: we update relfrozenxid based on
> OldestXmin, even though GlobalVisState caused us to retain a tuple older than
> OldestXmin.  Then relfrozenxid disagrees with table contents.

Looking at the state as of 1ccc1e05ae, I don't see how - in lazy_scan_prune(),
if heap_page_prune() spuriously didn't prune a tuple, because the horizon went
backwards, we'd encounter the tuple in the loop below and call
heap_prepare_freeze_tuple(), which would error out with one of

    /*
     * Process xmin, while keeping track of whether it's already frozen, or
     * will become frozen iff our freeze plan is executed by caller (could be
     * neither).
     */
    xid = HeapTupleHeaderGetXmin(tuple);
    if (!TransactionIdIsNormal(xid))
        xmin_already_frozen = true;
    else
    {
        if (TransactionIdPrecedes(xid, cutoffs->relfrozenxid))
            ereport(ERROR,
                    (errcode(ERRCODE_DATA_CORRUPTED),
                     errmsg_internal("found xmin %u from before relfrozenxid %u",
                                     xid, cutoffs->relfrozenxid)));

or
        if (TransactionIdPrecedes(update_xact, cutoffs->relfrozenxid))
            ereport(ERROR,
                    (errcode(ERRCODE_DATA_CORRUPTED),
                     errmsg_internal("multixact %u contains update XID %u from before relfrozenxid %u",
                                     multi, update_xact,
                                     cutoffs->relfrozenxid)));
or
        /* Raw xmax is normal XID */
        if (TransactionIdPrecedes(xid, cutoffs->relfrozenxid))
            ereport(ERROR,
                    (errcode(ERRCODE_DATA_CORRUPTED),
                     errmsg_internal("found xmax %u from before relfrozenxid %u",
                                     xid, cutoffs->relfrozenxid)));


I'm not saying that spuriously erroring out would be ok. But I guess I just
don't understand the data corruption theory in this subthread, because we'd
error out if we encountered a tuple that should have been frozen but wasn't?
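
To make the shape of the xmin check quoted above concrete, here is a minimal
standalone sketch. It is not PostgreSQL code: xid_is_normal(), xid_precedes(),
check_xmin() and the XminCheck enum are hypothetical stand-ins written for this
illustration, and the sketch returns a status code where the real code calls
ereport(ERROR). The only things assumed from PostgreSQL are that XIDs are
32-bit, that values below 3 are special (non-normal), and that normal XIDs are
compared modulo 2^32.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for PostgreSQL's 32-bit TransactionId. */
typedef uint32_t TransactionId;

/* XIDs below this value are special (invalid, bootstrap, frozen). */
#define FIRST_NORMAL_XID ((TransactionId) 3)

/* Hypothetical result codes; the real code raises an ERROR instead. */
typedef enum
{
    XMIN_ALREADY_FROZEN,        /* xmin is not a normal XID */
    XMIN_OK,                    /* xmin is at or after relfrozenxid */
    XMIN_BEFORE_RELFROZENXID    /* would be reported as data corruption */
} XminCheck;

static bool
xid_is_normal(TransactionId xid)
{
    return xid >= FIRST_NORMAL_XID;
}

/*
 * Wraparound-aware "a is older than b" for normal XIDs: the signed
 * difference modulo 2^32 decides the order. Special XIDs compare as
 * plain unsigned integers.
 */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    if (!xid_is_normal(a) || !xid_is_normal(b))
        return a < b;
    return (int32_t) (a - b) < 0;
}

/* Simplified version of the xmin branch quoted above. */
static XminCheck
check_xmin(TransactionId xmin, TransactionId relfrozenxid)
{
    if (!xid_is_normal(xmin))
        return XMIN_ALREADY_FROZEN;
    if (xid_precedes(xmin, relfrozenxid))
        return XMIN_BEFORE_RELFROZENXID;
    return XMIN_OK;
}
```

With relfrozenxid = 100, check_xmin(90, 100) reports the "before relfrozenxid"
case, while check_xmin(100, 100) and check_xmin(150, 100) pass. The xmax and
multixact checks quoted above follow the same pattern.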

Greetings,

Andres Freund
