walsender performance regression due to logical decoding on standby changes

From	Andres Freund
Subject	walsender performance regression due to logical decoding on standby changes
Date	May 9, 2023, 19:02:47
Msg-id	20230509190247.3rrplhdgem6su6cg@awork3.anarazel.de
Responses	Re: walsender performance regression due to logical decoding on standby changes; Re: walsender performance regression due to logical decoding on standby changes; RE: walsender performance regression due to logical decoding on standby changes
List	pgsql-hackers

Hi,

Unfortunately I have found the following commit to have caused a performance
regression:

commit e101dfac3a53c20bfbf1ca85d30a368c2954facf
Author: Andres Freund <andres@anarazel.de>
Date:   2023-04-08 00:24:24 -0700

    For cascading replication, wake physical and logical walsenders separately

    Physical walsenders can't send data until it's been flushed; logical
    walsenders can't decode and send data until it's been applied. On the
    standby, the WAL is flushed first, which will only wake up physical
    walsenders; and then applied, which will only wake up logical
    walsenders.

    Previously, all walsenders were awakened when the WAL was flushed. That
    was fine for logical walsenders on the primary; but on the standby the
    flushed WAL would have been not applied yet, so logical walsenders were
    awakened too early.

    Per idea from Jeff Davis and Amit Kapila.

    Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
    Reviewed-By: Jeff Davis <pgsql@j-davis.com>
    Reviewed-By: Robert Haas <robertmhaas@gmail.com>
    Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
    Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
    Discussion: https://postgr.es/m/CAA4eK1+zO5LUeisabX10c81LU-fWMKO4M9Wyg1cdkbW7Hqh6vQ@mail.gmail.com

The problem is that, on a standby, after the change - as needed for the
approach to work - the call to WalSndWakeup() in ApplyWalRecord() happens for
every record, instead of only happening when the timeline is changed (or WAL
is flushed or ...).

WalSndWakeup() iterates over all walsender slots, regardless of whether they
are in use. For each of the walsender slots it acquires a spinlock.
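
To make the cost concrete, here is a minimal, single-process model of that
scan. It is a sketch, not the PostgreSQL source: SenderSlot, slots_init(),
wake_senders() and replay_records() are hypothetical names, the spinlock is
modeled with a C11 atomic_flag, and MAX_SENDERS is assumed to mirror the
default max_wal_senders of 10.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_SENDERS 10          /* assumed to mirror the default max_wal_senders */

/* Hypothetical stand-in for one walsender slot in shared memory. */
typedef struct SenderSlot
{
    atomic_flag lock;           /* models the per-slot spinlock */
    bool        in_use;         /* false: no walsender attached */
    bool        is_logical;     /* logical vs. physical walsender */
    long        wakeups;        /* how often this slot's "latch" was set */
} SenderSlot;

static SenderSlot slots[MAX_SENDERS];

/* Reset every slot to the unused, unlocked state. */
void
slots_init(void)
{
    for (int i = 0; i < MAX_SENDERS; i++)
    {
        atomic_flag_clear(&slots[i].lock);
        slots[i].in_use = false;
        slots[i].is_logical = false;
        slots[i].wakeups = 0;
    }
}

/*
 * Model of the wakeup scan: visit every slot, lock it to read its state,
 * and wake it only if it is in use and of a requested kind.
 * Returns the number of lock acquisitions performed.
 */
long
wake_senders(bool physical, bool logical)
{
    long        locks_taken = 0;

    for (int i = 0; i < MAX_SENDERS; i++)
    {
        bool        in_use;
        bool        is_logical;

        while (atomic_flag_test_and_set(&slots[i].lock))
            ;                   /* spin until the lock is free */
        locks_taken++;
        in_use = slots[i].in_use;
        is_logical = slots[i].is_logical;
        atomic_flag_clear(&slots[i].lock);

        if (!in_use)
            continue;           /* the lock was still paid for */
        if ((physical && !is_logical) || (logical && is_logical))
            slots[i].wakeups++;
    }
    return locks_taken;
}

/* Replay n records, waking logical senders after each one. */
long
replay_records(long n)
{
    long        total = 0;

    assert(n >= 0);
    for (long i = 0; i < n; i++)
        total += wake_senders(false, true);
    return total;
}
```

In this model, with a single logical sender attached, replaying 1000 records
takes 10 lock acquisitions per record, so 10000 in total, to deliver 1000
useful wakeups. The work per record scales with the number of slots, not with
the number of senders actually waiting.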

When replaying a lot of small-ish WAL records I found the startup process to
spend the majority of the time in WalSndWakeup(). I've not measured it very
precisely yet, but the overhead is significant (~35% slowdown), even with the
default max_wal_senders. If that's increased substantially, it obviously gets
worse.

The only saving grace is that this is not an issue on the primary.


I unfortunately spent less time on this commit of the
logical-decoding-on-standby series than on the others. There were so many
other senior contributors discussing it, that I "relaxed" a bit too much.


I don't think the approach of not having any sort of "registry" of whether
anybody is waiting for the replay position to be updated is
feasible. Iterating over all walsender slots is just too expensive -
WalSndWakeup() shows up even if I remove the spinlock (which we likely could,
especially when just checking if the walsender is connected).

My current guess is that mis-using a condition variable is the best bet. I
think it should work to use ConditionVariablePrepareToSleep() before a
WalSndWait(), and then ConditionVariableCancelSleep(). I.e. to never use
ConditionVariableSleep(). The latch set from ConditionVariableBroadcast()
would still cause the necessary wakeup.
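
The registration pattern described above can be sketched as a small
single-threaded model. This is an illustration under stated assumptions, not
PostgreSQL code: Waiter, WaitList and the cv_* functions are hypothetical and
only loosely stand in for ConditionVariablePrepareToSleep(),
ConditionVariableCancelSleep() and ConditionVariableBroadcast(). Locking is
omitted, and a boolean stands in for the process latch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical waiter: one per walsender in this model. */
typedef struct Waiter
{
    bool           latch_set;   /* stands in for the process latch */
    bool           registered;  /* currently on the wait list? */
    struct Waiter *next;
} Waiter;

/* Hypothetical wait list: the "registry" of interested walsenders. */
typedef struct WaitList
{
    Waiter     *head;
} WaitList;

/* Register before waiting on the latch; a no-op if already registered. */
void
cv_prepare_to_sleep(WaitList *cv, Waiter *w)
{
    assert(cv != NULL && w != NULL);
    if (w->registered)
        return;
    w->next = cv->head;
    cv->head = w;
    w->registered = true;
}

/* Deregister after the wait; a no-op if a broadcast already removed us. */
void
cv_cancel_sleep(WaitList *cv, Waiter *w)
{
    if (!w->registered)
        return;
    for (Waiter **p = &cv->head; *p != NULL; p = &(*p)->next)
    {
        if (*p == w)
        {
            *p = w->next;
            break;
        }
    }
    w->next = NULL;
    w->registered = false;
}

/*
 * Wake every registered waiter by setting its latch and removing it from
 * the list. Returns the number of waiters woken; with nobody registered,
 * the loop body never runs.
 */
int
cv_broadcast(WaitList *cv)
{
    int         woken = 0;

    while (cv->head != NULL)
    {
        Waiter     *w = cv->head;

        cv->head = w->next;
        w->next = NULL;
        w->registered = false;
        w->latch_set = true;
        woken++;
    }
    return woken;
}
```

In this model the per-record cost on the replay side depends on the number of
registered waiters rather than on the number of walsender slots, and a
broadcast with no waiters does no per-slot work at all.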

Greetings,

Andres Freund
