Re: BUG #13128: Postgres deadlock on startup failure when max_prepared_transactions is not sufficiently high.
От | Heikki Linnakangas |
---|---|
Тема | Re: BUG #13128: Postgres deadlock on startup failure when max_prepared_transactions is not sufficiently high. |
Дата | |
Msg-id | 5538A018.9030608@iki.fi обсуждение исходный текст |
Ответ на | Re: BUG #13128: Postgres deadlock on startup failure when max_prepared_transactions is not sufficiently high. (Michael Paquier <michael.paquier@gmail.com>) |
Ответы |
Re: BUG #13128: Postgres deadlock on startup failure when
max_prepared_transactions is not sufficiently high.
|
Список | pgsql-bugs |
On 04/23/2015 04:20 AM, Michael Paquier wrote: > On Thu, Apr 23, 2015 at 7:09 AM, <grant@amazon.com> wrote: >> 1. Set max_prepared_transactions to 2 in postgresql.conf. >> 2. start Postgres. >> 3. Create two uncommitted prepared transactions: BEGIN; PREPARE >> TRANSACTION 'test1'; BEGIN; PREPARE TRANSACTION 'test2'; >> 4. Set max_prepared_transactions to 1 in postgresql.conf. >> 5. Restart Postgres again. >> >> At this point the startup will fail with a fatal but the postmaster process >> keeps running. >> >> LOG: database system was interrupted; last known up at 2015-04-16 17:19:56 >> PDT >> LOG: database system was not properly shut down; automatic recovery in >> progress >> LOG: record with zero length at 0/1826C70 >> LOG: redo is not required >> LOG: recovering prepared transaction 1685 >> LOG: recovering prepared transaction 1683 >> FATAL: maximum number of prepared transactions reached >> HINT: Increase max_prepared_transactions (currently 1). >> >> Looks like their may be a LWLock that is not correctly getting released >> which causes the process to hang rather then exit. > > Yep, the startup process remains stuck here, so we should release the > lock before issuing ERROR in twophase.c: As Tom noted in the other thread, that is not generally required. LWLockReleaseAll() is call during process exit. Hmm. What happens here is that when the startup process is about to exit, because of the ERROR, it calls the shmem-exit hooks. AtProcExit_Twophase sees that MyLockedGxact != NULL, and it tries to clear MyLockedGxact->locking_backend to release the entry. That's bogus. MyLockedGxact != NULL indicates that the backend is currently operating on the entry. But that's not the case at that point during RecoverPreparedTransactions(). It has already completed recovering the previous transaction, and has not yet locked a GlobalTransaction entry for the next one. RecoverPreparedTransactions() should clear MyLockedGxact after it has recovered the transaction, after the StandbyReleaseLockTree() call. That's analogous to the normal PREPARE TRANSACTION path, in PrepareTransaction(), where we call PostPrepare_Twophase() after releasing the locks. Attached is a patch for that. I introduced this bug in commit bb38fb0d43c8d7ff54072bfd8bd63154e536b384, which added the proc-exit hook and the call PostPrepare_Twophase(), so unless you see more bugs in this, I'll commit and backpatch this to all versions. - Heikki
Вложения
В списке pgsql-bugs по дате отправления: