Re: documentation on HOT

From Peter Geoghegan
Subject Re: documentation on HOT
Msg-id CAH2-WznrTwM7iZ2_sHotqunBfLdCUD6y=uHJOzmVJ+NT46S0rg@mail.gmail.com
In reply to Re: documentation on HOT  (Bruce Momjian <bruce@momjian.us>)
Responses Re: documentation on HOT  (Bruce Momjian <bruce@momjian.us>)
List pgsql-docs
On Fri, Jul 22, 2022 at 2:11 PM Bruce Momjian <bruce@momjian.us> wrote:
> I have improved the wording of the last paragraph in this patch.

I think that it would be worth prominently explaining where heap-only
tuples get their name from: it comes from the fact that there are (by
definition) no entries for a heap-only tuple in any index, ever.
Index scans are nevertheless capable of locating heap-only tuples, by
dealing with a little additional indirection: they must traverse
groups of related tuple versions, all for the same logical row that
was HOT updated one or more times -- this group of related tuples is
called a HOT chain.
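
To make the indirection concrete, here is a toy sketch in Python. It is
not how PostgreSQL implements anything: the HeapTuple class, the
"visible" flag (a stand-in for a real MVCC visibility check), and the
fetch_via_index function are all hypothetical names invented for
illustration. The only point is that the index holds a single entry for
the root of the chain, and the scan walks the chain from there.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class HeapTuple:
    value: str
    visible: bool                 # stand-in for a real MVCC visibility check
    next_in_chain: Optional[int]  # item number of the next version, if any
    heap_only: bool               # True: no index entry points at this tuple


def fetch_via_index(page: Dict[int, HeapTuple], root_item: int) -> Optional[str]:
    """Walk the HOT chain starting at the item the index entry points to."""
    item: Optional[int] = root_item
    while item is not None:
        tup = page[item]
        if tup.visible:
            return tup.value
        item = tup.next_in_chain
    return None


# One logical row, HOT updated twice. Only item 1 has an index entry.
page = {
    1: HeapTuple("v1", visible=False, next_in_chain=2, heap_only=False),
    2: HeapTuple("v2", visible=False, next_in_chain=3, heap_only=True),
    3: HeapTuple("v3", visible=True, next_in_chain=None, heap_only=True),
}
index_entries = {"key": 1}  # a single index entry for the whole chain

result = fetch_via_index(page, index_entries["key"])
print(result)
```

Items 2 and 3 are reachable only through item 1, which is what makes
them "heap-only" in this toy model.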

This seems worth emphasizing because it puts the focus on what
*doesn't* happen, and mostly on what doesn't happen in indexes.

New item identifiers actually *are* needed for heap-only tuples
(perhaps we could get away without them, but we don't). However, that
doesn't really matter too much in practice. Heap-only tuples can still
have their line pointers set to LP_UNUSED directly during pruning,
without having to be set to LP_DEAD for a time first (a situation
which VACUUM alone can correct by setting the LP_DEAD items to
LP_UNUSED during its second heap pass).

So heap-only tuples "skip the step" where they have to become LP_DEAD
stubs/tombstones, which is possible precisely because indexes don't
need to be considered (they're "heap-only").
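
The difference in line pointer states can be sketched as a toy state
machine in Python. This is a simplification under stated assumptions,
not PostgreSQL's pruning code: the LP enum, prune, and
vacuum_second_heap_pass are hypothetical names, each item is treated
independently, and the sketch leaves out the root item of a HOT chain
and everything about visibility.

```python
from enum import Enum
from typing import Dict, Set


class LP(Enum):
    NORMAL = "LP_NORMAL"
    DEAD = "LP_DEAD"
    UNUSED = "LP_UNUSED"


def prune(items: Dict[int, LP], dead: Set[int], heap_only: Set[int]) -> None:
    """Toy opportunistic pruning of the dead tuples on one page."""
    for item in dead:
        if item in heap_only:
            # No index entry can point here, so the slot is freed at once.
            items[item] = LP.UNUSED
        else:
            # An index entry may still point here: leave a tombstone.
            items[item] = LP.DEAD


def vacuum_second_heap_pass(items: Dict[int, LP]) -> None:
    """Toy VACUUM step: runs after index entries are gone, frees tombstones."""
    for item, state in list(items.items()):
        if state is LP.DEAD:
            items[item] = LP.UNUSED


# Item 1: dead heap-only tuple. Item 2: dead tuple with index entries.
# Item 3: live tuple, untouched throughout.
items = {1: LP.NORMAL, 2: LP.NORMAL, 3: LP.NORMAL}
prune(items, dead={1, 2}, heap_only={1})
after_prune = dict(items)

vacuum_second_heap_pass(items)
after_vacuum = dict(items)

print(after_prune)
print(after_vacuum)
```

After pruning alone, item 1 is already reusable while item 2 is a
tombstone that only the VACUUM step clears.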

I agree that pruning should be discussed here, though -- I wouldn't go
as far as treating pruning as 100% unrelated to HOT. Perhaps something
along the lines of this works:

"It is possible for opportunistic pruning to completely remove all
bloat caused by HOT updates (bloat from HOT chains), without leaving
any residual garbage that only VACUUM is capable of cleaning up.
Pruning a page affected by non-HOT updates or deletes is somewhat less
effective, though, because small tombstone items (dead item
identifiers) must remain until such time as VACUUM can verify that no
remaining index tuples reference the items."

Again, the emphasis is on what *doesn't* have to happen because
indexes aren't making life hard for us. From the point of view of
indexes, ignorance is bliss. The really important point about
pruning and HOT is that it becomes possible (with care from the DBA
and application) to practically eliminate the role of VACUUM. We may
not even require a little help from VACUUM, under ideal conditions.

-- 
Peter Geoghegan


