Re: [BUGS] Re: BUG #14680: startup process on standby encounter adeadlock of TwoPhaseStateLock when redo 2PC xlog

From Alvaro Herrera
Subject Re: [BUGS] Re: BUG #14680: startup process on standby encounter adeadlock of TwoPhaseStateLock when redo 2PC xlog
Date
Msg-id 20170613230812.uvlhjlvfbad7njb7@alvherre.pgsql
In reply to Re: [BUGS] Re: BUG #14680: startup process on standby encounter adeadlock of TwoPhaseStateLock when redo 2PC xlog  (Michael Paquier <michael.paquier@gmail.com>)
Responses Re: [BUGS] Re: BUG #14680: startup process on standby encounter adeadlock of TwoPhaseStateLock when redo 2PC xlog  (Michael Paquier <michael.paquier@gmail.com>)
List pgsql-bugs
Michael Paquier wrote:

> The current coding is actually safe because the checkpointer does not
> remove or add any 2PC entry in the array while holding
> TwoPhaseStateLock, it just updates some values that need to be read
> and/or written while holding the lock. Well, to be honest, HEAD is
> wrong because it can read a flag value while the checkpointer updates
> it, and the patch is careful to change that to be correct. The wrong
> part is when calling ProcessTwoPhaseBuffer() in
> RecoverPreparedTransactions() which accesses gxact->ondisk and
> prepare_start_lsn without locking things.

Honestly I don't like this rationale very much.  Even if doing the
unlocked access is safe today, it looks like installing a landmine for
the future, and for what?  I don't think there's a lot to be gained:
RecoverPreparedTransactions only runs once in the life of a server, and
CheckPointTwoPhase is supposed to have a very short runtime (per
explanation in comments therein).  It seems better to me to continue our
tradition of using the appropriate locks instead of playing a dangerous
game.

So I propose that RecoverPreparedTransactions grab TwoPhaseStateLock in
exclusive mode at the top, with only the bottom part of the loop done
unlocked, which AFAICS should be safe.  (MarkAsPrepared gained a boolean
argument indicating that the caller already holds the lock.)
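
To illustrate the shape of that proposal, here is a minimal standalone
sketch.  It is not the attached patch and not PostgreSQL code: a pthread
mutex stands in for the LWLock, and every name in it (DemoGxact,
demo_state_lock, demo_mark_as_prepared, demo_recover_prepared) is
hypothetical.  It only shows the pattern of taking the lock at the top,
passing a "lock already held" flag down to the helper, and dropping the
lock around the bottom part of each iteration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define DEMO_MAX_GXACTS 8

/* Hypothetical stand-in for a 2PC state entry; not PostgreSQL's struct. */
typedef struct DemoGxact
{
    bool valid;     /* entry has been marked as prepared */
    bool ondisk;    /* flag another process may update under the lock */
} DemoGxact;

/* A plain mutex stands in for TwoPhaseStateLock (assumption for the demo). */
pthread_mutex_t demo_state_lock = PTHREAD_MUTEX_INITIALIZER;
DemoGxact demo_gxacts[DEMO_MAX_GXACTS];

/*
 * Mark an entry as prepared.  lock_held tells the helper whether the
 * caller already holds demo_state_lock, so it does not try to take it
 * a second time.
 */
void
demo_mark_as_prepared(DemoGxact *gxact, bool lock_held)
{
    if (!lock_held)
        pthread_mutex_lock(&demo_state_lock);

    gxact->valid = true;

    if (!lock_held)
        pthread_mutex_unlock(&demo_state_lock);
}

/*
 * Walk the first nentries slots.  The lock is taken once at the top and
 * held while shared flags are read and the entry is marked; it is
 * released only around the per-entry work at the bottom of the loop.
 * Returns the number of entries processed.
 */
int
demo_recover_prepared(int nentries)
{
    int processed = 0;

    assert(nentries >= 0 && nentries <= DEMO_MAX_GXACTS);

    pthread_mutex_lock(&demo_state_lock);
    for (int i = 0; i < nentries; i++)
    {
        DemoGxact *gxact = &demo_gxacts[i];

        /* Top part: shared state is read and updated under the lock. */
        bool was_ondisk = gxact->ondisk;

        demo_mark_as_prepared(gxact, true);

        /* Bottom part: work that does not touch the shared array. */
        pthread_mutex_unlock(&demo_state_lock);
        (void) was_ondisk;  /* placeholder for per-transaction work */
        processed++;
        pthread_mutex_lock(&demo_state_lock);
    }
    pthread_mutex_unlock(&demo_state_lock);

    return processed;
}
```

Calling demo_recover_prepared(3) marks the first three slots and leaves
the mutex released on return.  The boolean argument is what lets the same
helper serve both callers that hold the lock and callers that do not.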

Here's a patch along those lines.  The full test suite is still running,
but the recovery tests pass fine.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

