Re: [BUGS] Re: BUG #14680: startup process on standby encounter adeadlock of TwoPhaseStateLock when redo 2PC xlog

Поиск
Список
Период
Сортировка
От Alvaro Herrera
Тема Re: [BUGS] Re: BUG #14680: startup process on standby encounter adeadlock of TwoPhaseStateLock when redo 2PC xlog
Дата
Msg-id 20170611022536.goef4cdbmgicye5g@alvherre.pgsql
обсуждение исходный текст
Ответ на Re: [BUGS] Re: BUG #14680: startup process on standby encounter adeadlock of TwoPhaseStateLock when redo 2PC xlog  (Michael Paquier <michael.paquier@gmail.com>)
Ответы Re: [BUGS] Re: BUG #14680: startup process on standby encounter adeadlock of TwoPhaseStateLock when redo 2PC xlog  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Список pgsql-bugs
Michael Paquier wrote:

> > i look carefully on the patch, because of we removed TwoPhaseStateLock
> > lwlock acquire in `RemoveGXact' and let caller held lwlock, so i think:
> > 1. xact_redo also need held lwlock before call PrepareRedoRemove
> > 2. RecoverPreparedTransactions also need held lwlock before call
> > ProcessTwoPhaseBuffer
> 
> Thanks for the input. I have reviewed all the code paths that have
> been modified, and strengthened the code with assertions using
> LWLockHeldByMeInMode() to make sure that the correct lock is always
> hold in those code paths. There is actually no point in holding the
> lock in restoreTwoPhaseData(), but as this makes the code less
> consistent with the rest I added one. Also, I have replaced the lock
> acquisition in PrepareRedoAdd() with acquisitions at higher levels,
> and added an assertion in this routine. This makes the 2PC state data
> addition and removal more consistent with each other.

Thanks for the patch -- I agree that the new coding looks better.

I also agree that restoreTwoPhaseData doesn't need to hold the lock,
since we don't expect anything running concurrently, but that it's okay
to hold it and makes the whole thing simpler.

We could try to apply an equivalent rationale to
RecoverPreparedTransactions: even though we have already been running in
HOT standby mode for a while, there's no possibility of concurrent
activity either: since we exited recovery, master cannot write any more
2PC xacts, and we haven't yet set the flag that lets open sessions write
WAL.  However, it seems mildly dangerous to run the bottom sections of
the loop with the lock held, since it acquires other lwlocks and
generally does a lot of crap.

Also, let's add an lwlock-is-held assertion to MarkAsPreparingGuts.

BTW: I think that saving one redundant line of code is not worth the
ugliness that it costs in xact_redo, so let's undo that.  The patch is a
bit bulky, but the resulting code is simpler.

In short, I propose the attached.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs

Вложения

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Alvaro Herrera
Дата:
Сообщение: Re: [BUGS] Re: BUG #14680: startup process on standby encounter adeadlock of TwoPhaseStateLock when redo 2PC xlog
Следующее
От: Alvaro Herrera
Дата:
Сообщение: Re: [BUGS] Re: BUG #14680: startup process on standby encounter adeadlock of TwoPhaseStateLock when redo 2PC xlog