Re: [BUGS] Re: BUG #14680: startup process on standby encounter a deadlock of TwoPhaseStateLock when redo 2PC xlog

From	Alvaro Herrera
Subject	Re: [BUGS] Re: BUG #14680: startup process on standby encounter a deadlock of TwoPhaseStateLock when redo 2PC xlog
Date	14 June 2017, 02:08:12
Msg-id	20170613230812.uvlhjlvfbad7njb7@alvherre.pgsql
In response to	Re: [BUGS] Re: BUG #14680: startup process on standby encounter a deadlock of TwoPhaseStateLock when redo 2PC xlog (Michael Paquier <michael.paquier@gmail.com>)
Responses	Re: [BUGS] Re: BUG #14680: startup process on standby encounter a deadlock of TwoPhaseStateLock when redo 2PC xlog
List	pgsql-bugs

Michael Paquier wrote:

> The current coding is actually safe because the checkpointer does not
> remove or add any 2PC entry in the array while holding
> TwoPhaseStateLock, it just updates some values that need to be read
> and/or written while holding the lock. Well, to be honest, HEAD is
> wrong because it can read a flag value while the checkpointer updates
> it, and the patch is careful to change that to be correct. The wrong
> part is when calling ProcessTwoPhaseBuffer() in
> RecoverPreparedTransactions() which accesses gxact->ondisk and
> prepare_start_lsn without locking things.

Honestly I don't like this rationale very much.  Even if doing the
unlocked access is safe today, it looks like installing a landmine for
the future, and for what?  I don't think there's a lot to be gained:
RecoverPreparedTransactions only runs once in the life of a server, and
CheckPointTwoPhase is supposed to have a very short runtime (per
explanation in comments therein).  It seems better to me to continue our
tradition of using the appropriate locks instead of playing a dangerous
game.

So I propose that RecoverPreparedTransactions grabs exclusive lock at
the top, and only the bottom part of the loop is done unlocked, which
AFAICS should be safe.  (MarkAsPrepared gained a boolean argument
indicating that caller already holds lock).

Here's a patch along those lines.  The full testsuite is running now,
but the recovery tests pass fine.
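
For readers without the attachment at hand, the locking order proposed above can be sketched roughly as follows. This is not the attached patch and not PostgreSQL code: it uses a plain pthread mutex in place of an LWLock, and every name in it (DemoGxact, demo_state_lock, demo_mark_as_prepared, demo_recover_prepared) is hypothetical. It only illustrates the shape of the idea: take the lock at the top, do the shared-state reads and the mark-as-prepared step while holding it, and drop it just for the bottom part of each loop iteration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a 2PC entry; not PostgreSQL's real struct. */
typedef struct DemoGxact
{
    bool ondisk;    /* shared flag, may be updated by another process */
    bool valid;     /* set once the entry is marked as prepared */
} DemoGxact;

#define DEMO_MAX_GXACTS 4

/* Hypothetical stand-in for TwoPhaseStateLock. */
static pthread_mutex_t demo_state_lock = PTHREAD_MUTEX_INITIALIZER;
static DemoGxact demo_gxacts[DEMO_MAX_GXACTS];
static int demo_num_gxacts = 0;

/*
 * Mark an entry as prepared. lock_held tells the function whether the
 * caller already owns demo_state_lock, so it never locks twice.
 */
static void
demo_mark_as_prepared(DemoGxact *gxact, bool lock_held)
{
    if (!lock_held)
        pthread_mutex_lock(&demo_state_lock);
    gxact->valid = true;
    if (!lock_held)
        pthread_mutex_unlock(&demo_state_lock);
}

/*
 * Walk all entries. Returns how many entries had ondisk set, computed
 * from copies taken while the lock was held.
 */
static int
demo_recover_prepared(void)
{
    int n_ondisk = 0;

    pthread_mutex_lock(&demo_state_lock);       /* lock at the top */
    for (int i = 0; i < demo_num_gxacts; i++)
    {
        DemoGxact *gxact = &demo_gxacts[i];
        bool ondisk = gxact->ondisk;            /* read under the lock */

        demo_mark_as_prepared(gxact, true);     /* caller holds the lock */

        /* Bottom part of the loop: runs unlocked, touches only locals. */
        pthread_mutex_unlock(&demo_state_lock);
        if (ondisk)
            n_ondisk++;
        pthread_mutex_lock(&demo_state_lock);   /* re-take for next entry */
    }
    pthread_mutex_unlock(&demo_state_lock);

    return n_ondisk;
}
```

The boolean argument is what lets the same helper serve both callers that already hold the lock and callers that do not, at the cost of making the caller responsible for passing the right value.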

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments

2pc-redo-lwlock-fix-v6.patch
