Re: New WAL record to detect the checkpoint redo location

From	Robert Haas
Subject	Re: New WAL record to detect the checkpoint redo location
Date	October 9, 2023, 19:58:36
Msg-id	CA+Tgmob90XtBJ+WdSHA+y8ZikHPSxQvZwYDefEBLC7Bkwx=3-g@mail.gmail.com
In response to	Re: New WAL record to detect the checkpoint redo location (Andres Freund <andres@anarazel.de>)
Responses	Re: New WAL record to detect the checkpoint redo location
List	pgsql-hackers

On Thu, Oct 5, 2023 at 2:34 PM Andres Freund <andres@anarazel.de> wrote:
> One thing that's notable, but not related to the patch, is that we waste a
> fair bit of cpu time below XLogInsertRecord() with divisions. I think they're
> all due to the use of UsableBytesInSegment in
> XLogBytePosToRecPtr/XLogBytePosToEndRecPtr.  The multiplication of
> XLogSegNoOffsetToRecPtr() also shows.

Despite what I said in my earlier email, and with a feeling like unto
that created by the proximity of the sword of Damocles or some ghostly
albatross, I spent some time reflecting on this. Some observations:

1. The reason why we're doing this multiplication and division is to
make sure that the code in ReserveXLogInsertLocation which executes
while holding insertpos_lck remains as simple and brief as possible.
We could eliminate the conversion between usable byte positions and
LSNs if we replaced Insert->{Curr,Prev}BytePos with LSNs and had
ReserveXLogInsertLocation work out by how much to advance the LSN, but
it would have to be worked out while holding insertpos_lck (or some
replacement lwlock, perhaps) and that cure seems worse than the
disease. Given that, I think we're stuck with converting between
usable byte positions and LSNs, and that intrinsically needs some
multiplication and division.
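
To make point 1 concrete, here is a minimal sketch of why the section
under the lock can stay so short when the shared state holds usable byte
positions. It is a toy model, not the PostgreSQL source: InsertModel,
insert_model_init and reserve_model are hypothetical names, a C11
atomic_flag stands in for the insertpos_lck spinlock, and details such as
aligning the record size are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the shared insert state. */
typedef struct InsertModel
{
    atomic_flag insertpos_lck;  /* toy spinlock standing in for insertpos_lck */
    uint64_t    CurrBytePos;    /* usable-byte position where the next record starts */
    uint64_t    PrevBytePos;    /* usable-byte position where the previous record started */
} InsertModel;

static void
insert_model_init(InsertModel *ins)
{
    atomic_flag_clear(&ins->insertpos_lck);
    ins->CurrBytePos = 0;
    ins->PrevBytePos = 0;
}

/*
 * Reserve "size" usable bytes.  While the lock is held, the only work is a
 * couple of loads, one addition and two stores, because usable byte
 * positions ignore page headers entirely.
 */
static void
reserve_model(InsertModel *ins, uint64_t size,
              uint64_t *startbytepos, uint64_t *endbytepos,
              uint64_t *prevbytepos)
{
    assert(size > 0);

    while (atomic_flag_test_and_set_explicit(&ins->insertpos_lck,
                                             memory_order_acquire))
        ;                       /* spin until the lock is free */

    *startbytepos = ins->CurrBytePos;
    *endbytepos = *startbytepos + size;
    *prevbytepos = ins->PrevBytePos;
    ins->CurrBytePos = *endbytepos;
    ins->PrevBytePos = *startbytepos;

    atomic_flag_clear_explicit(&ins->insertpos_lck, memory_order_release);

    /*
     * The conversion of the three byte positions into LSNs, which is where
     * the multiplication and division live, would happen here, after the
     * lock has been released.
     */
}
```

If the state held LSNs instead, the page-header accounting would have to
move inside the locked region, which is the trade-off described above.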

2. It seems possible to remove one branch in each of
XLogBytePosToRecPtr and XLogBytePosToEndRecPtr. Rather than testing
whether bytesleft < XLOG_BLCKSZ - SizeOfXLogLongPHD, we could simply
increment bytesleft by SizeOfXLogLongPHD - SizeOfXLogShortPHD. Then
the rest of the calculations can be performed as if every page in the
segment had a header of length SizeOfXLogShortPHD, with no need to
special-case the first page. However, that doesn't get rid of any
multiplication or division, just a branch.
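
As a sketch of point 2, the two functions below compute the offset within
a segment from the usable bytes left in that segment, once with the first
page special-cased and once with the bias applied up front. The constants
are stand-in values chosen for illustration (8 kB pages, 24- and 40-byte
page headers, 1 MB segments), and the function names are hypothetical.
Only the start-of-record form is modeled; the extra boundary handling in
the end-pointer variant is not.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in values for illustration, not taken from the PostgreSQL headers. */
#define XLOG_BLCKSZ          8192u
#define SIZE_OF_SHORT_PHD    24u
#define SIZE_OF_LONG_PHD     40u
#define WAL_SEGMENT_SIZE     (1024u * 1024u)

#define USABLE_BYTES_IN_PAGE (XLOG_BLCKSZ - SIZE_OF_SHORT_PHD)
#define USABLE_BYTES_IN_SEGMENT \
    ((WAL_SEGMENT_SIZE / XLOG_BLCKSZ) * USABLE_BYTES_IN_PAGE - \
     (SIZE_OF_LONG_PHD - SIZE_OF_SHORT_PHD))

/* Version with a branch: the first page has the long header. */
static uint32_t
seg_offset_branchy(uint64_t bytesleft)
{
    assert(bytesleft < USABLE_BYTES_IN_SEGMENT);

    if (bytesleft < XLOG_BLCKSZ - SIZE_OF_LONG_PHD)
        return (uint32_t) bytesleft + SIZE_OF_LONG_PHD;

    /* skip the first page, then count short-header pages */
    bytesleft -= XLOG_BLCKSZ - SIZE_OF_LONG_PHD;
    return XLOG_BLCKSZ
        + (uint32_t) (bytesleft / USABLE_BYTES_IN_PAGE) * XLOG_BLCKSZ
        + (uint32_t) (bytesleft % USABLE_BYTES_IN_PAGE)
        + SIZE_OF_SHORT_PHD;
}

/*
 * Version without the branch: pretend every page has a short header and
 * charge the extra bytes of the long header to bytesleft up front.
 */
static uint32_t
seg_offset_branchless(uint64_t bytesleft)
{
    assert(bytesleft < USABLE_BYTES_IN_SEGMENT);

    bytesleft += SIZE_OF_LONG_PHD - SIZE_OF_SHORT_PHD;
    return (uint32_t) (bytesleft / USABLE_BYTES_IN_PAGE) * XLOG_BLCKSZ
        + (uint32_t) (bytesleft % USABLE_BYTES_IN_PAGE)
        + SIZE_OF_SHORT_PHD;
}
```

Both versions still perform one division and one modulo by the usable
page size, so this only trades the comparison for an addition.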

3. Aside from that, there seems to be no simple way to reduce the
complexity of an individual calculation, but ReserveXLogInsertLocation
does perform 3 rather similar computations, and I believe that we know
that it will always be the case that *PrevPtr < *StartPos < *EndPos.
Maybe we could have a fast-path for the case where they are all in the
same segment. We could take prevbytepos modulo UsableBytesInSegment;
call the result prevsegoff. If UsableBytesInSegment - prevsegoff >
endbytepos - prevbytepos, then all three pointers are in the same
segment, and maybe we could take advantage of that to avoid performing
the segment calculations more than once, although we would still need to
repeat the page calculations. Or, instead of or in addition to that, I
think we could by a similar technique check whether all three pointers
are on the same
page; if so, then *StartPos and *EndPos can be computed from *PrevPtr
by just adding the difference between the corresponding byte
positions.
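
One possible shape for the same-page fast path in point 3 is sketched
below. It is an assumption about how the idea could be realized, not
tested against the real code: the constants are stand-in values, and
bytepos_to_lsn and reserve_lsns are hypothetical names. Instead of a
modulo on prevbytepos, this variant reads the page offset off the
already-converted previous pointer, which is a cheap power-of-two
remainder.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in values for illustration, not taken from the PostgreSQL headers. */
#define XLOG_BLCKSZ          8192u
#define SIZE_OF_SHORT_PHD    24u
#define SIZE_OF_LONG_PHD     40u
#define WAL_SEGMENT_SIZE     (1024u * 1024u)

#define USABLE_BYTES_IN_PAGE (XLOG_BLCKSZ - SIZE_OF_SHORT_PHD)
#define USABLE_BYTES_IN_SEGMENT \
    ((WAL_SEGMENT_SIZE / XLOG_BLCKSZ) * USABLE_BYTES_IN_PAGE - \
     (SIZE_OF_LONG_PHD - SIZE_OF_SHORT_PHD))

/*
 * Slow-path conversion.  With is_end set, a position that falls exactly on
 * a page boundary maps to the end of the earlier page rather than to the
 * first data byte of the later one.
 */
static uint64_t
bytepos_to_lsn(uint64_t bytepos, int is_end)
{
    uint64_t fullsegs = bytepos / USABLE_BYTES_IN_SEGMENT;
    uint64_t bytesleft = bytepos % USABLE_BYTES_IN_SEGMENT;
    uint64_t seg_offset;

    if (bytesleft < XLOG_BLCKSZ - SIZE_OF_LONG_PHD)
        seg_offset = (is_end && bytesleft == 0) ? 0 : bytesleft + SIZE_OF_LONG_PHD;
    else
    {
        bytesleft -= XLOG_BLCKSZ - SIZE_OF_LONG_PHD;
        seg_offset = XLOG_BLCKSZ + (bytesleft / USABLE_BYTES_IN_PAGE) * XLOG_BLCKSZ;
        bytesleft %= USABLE_BYTES_IN_PAGE;
        if (!(is_end && bytesleft == 0))
            seg_offset += bytesleft + SIZE_OF_SHORT_PHD;
    }
    return fullsegs * WAL_SEGMENT_SIZE + seg_offset;
}

/* Convert all three positions, using plain additions when one page holds them. */
static void
reserve_lsns(uint64_t prevbytepos, uint64_t startbytepos, uint64_t endbytepos,
             uint64_t *prev_lsn, uint64_t *start_lsn, uint64_t *end_lsn)
{
    uint64_t room;

    assert(prevbytepos <= startbytepos && startbytepos < endbytepos);

    *prev_lsn = bytepos_to_lsn(prevbytepos, 0);

    /* bytes from the previous pointer to the end of its page */
    room = XLOG_BLCKSZ - (*prev_lsn % XLOG_BLCKSZ);

    if (endbytepos - prevbytepos <= room)
    {
        /* fast path: no page header lies between the three pointers */
        *start_lsn = *prev_lsn + (startbytepos - prevbytepos);
        *end_lsn = *prev_lsn + (endbytepos - prevbytepos);
    }
    else
    {
        *start_lsn = bytepos_to_lsn(startbytepos, 0);
        *end_lsn = bytepos_to_lsn(endbytepos, 1);
    }
}
```

Whether this wins in practice depends on how often the previous record,
the new record's start and its end share a page, and on whether the added
branch costs less than the two conversions it skips.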

I'm not really sure whether that would come out cheaper. It's just the
only idea that I have. It did also occur to me to wonder whether the
apparent delays performing multiplication and division here were
really the result of the arithmetic itself being slow or whether they
were synchronization-related, SpinLockRelease(&Insert->insertpos_lck)
being a memory barrier just before. But I assume you thought about
that and concluded that wasn't the issue here.

--
Robert Haas
EDB: http://www.enterprisedb.com
