Re: Eliminate redundant tuple visibility check in vacuum

Поиск
Список
Период
Сортировка
От David Geier
Тема Re: Eliminate redundant tuple visibility check in vacuum
Дата
Msg-id ed6011e9-551f-6156-25c7-7e5ff068fcf4@gmail.com
обсуждение исходный текст
Ответ на Re: Eliminate redundant tuple visibility check in vacuum  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-hackers
Hi,

On 9/1/23 03:25, Peter Geoghegan wrote:
> On Thu, Aug 31, 2023 at 3:35 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
>> Any inserting transaction which aborts after heap_page_prune()'s
>> visibility check will now be of no concern to lazy_scan_prune(). Since
>> we don't do the visibility check again, we won't find the tuple
>> HEAPTUPLE_DEAD and thus won't have the problem of adding the tuple to
>> the array of tuples for vacuum to reap. This does mean that we won't
>> reap and remove tuples whose inserting transactions abort right after
>> heap_page_prune()'s visibility check. But, this doesn't seem like an
>> issue.
> It's definitely not an issue.
>
> The loop is essentially a hacky way of getting a consistent picture of
> which tuples should be treated as HEAPTUPLE_DEAD, and which tuples
> need to be left behind (consistent at the level of each page and each
> HOT chain, at least). Making that explicit seems strictly better.
>
>> They may not be removed until the next vacuum, but ISTM it is
>> actually worse to pay the cost of re-pruning the whole page just to
>> clean up that one tuple. Maybe if that renders the page all visible
>> and we can mark it as such in the visibility map -- but that seems
>> like a relatively narrow use case.
> The chances of actually hitting the retry are microscopic anyway. It
> has nothing to do with making sure that dead tuples from aborted
> tuples get removed for its own sake, or anything. Rather, the retry is
> all about making sure that all TIDs that get removed from indexes can
> only point to LP_DEAD stubs. Prior to Postgres 14, HEAPTUPLE_DEAD
> tuples with storage would very occasionally be left behind, which made
> life difficult in a bunch of other places -- for no good reason.
>
That makes sense and seems like a much better compromise. Thanks for the 
explanations. Please update the comment to document the corner case and 
how we handle it.

-- 
David Geier
(ServiceNow)




В списке pgsql-hackers по дате отправления:

Предыдущее
От: Peter Smith
Дата:
Сообщение: Re: Synchronizing slots from primary to standby
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: persist logical slots to disk during shutdown checkpoint