Re: pgsql: Avoid duplicate XIDs at recovery when building initial snapshot
From | Andres Freund |
---|---|
Subject | Re: pgsql: Avoid duplicate XIDs at recovery when building initial snapshot |
Date | |
Msg-id | 20181014174240.byktkskdymae7kmy@alap3.anarazel.de |
In response to | pgsql: Avoid duplicate XIDs at recovery when building initial snapshot (Michael Paquier <michael@paquier.xyz>) |
List | pgsql-committers |
On 2018-10-14 13:26:24 +0000, Michael Paquier wrote:
> Avoid duplicate XIDs at recovery when building initial snapshot
>
> On a primary, sets of XLOG_RUNNING_XACTS records are generated on a
> periodic basis to allow recovery to build the initial state of
> transactions for a hot standby. The set of transaction IDs is created
> by scanning all the entries in ProcArray. However it happens that its
> logic never counted on the fact that two-phase transactions finishing to
> prepare can put ProcArray in a state where there are two entries with
> the same transaction ID, one for the initial transaction which gets
> cleared when prepare finishes, and a second, dummy, entry to track that
> the transaction is still running after prepare finishes. This way
> ensures a continuous presence of the transaction so as callers of for
> example TransactionIdIsInProgress() are always able to see it as alive.
>
> So, if a XLOG_RUNNING_XACTS takes a standby snapshot while a two-phase
> transaction finishes to prepare, the record can finish with duplicated
> XIDs, which is a state expected by design. If this record gets applied
> on a standby to initial its recovery state, then it would simply fail,
> so the odds of facing this failure are very low in practice. It would
> be tempting to change the generation of XLOG_RUNNING_XACTS so as
> duplicates are removed on the source, but this requires to hold on
> ProcArrayLock for longer and this would impact all workloads,
> particularly those using heavily two-phase transactions.
>
> XLOG_RUNNING_XACTS is also actually used only to initialize the standby
> state at recovery, so instead the solution is taken to discard
> duplicates when applying the initial snapshot.
>
> Diagnosed-by: Konstantin Knizhnik
> Author: Michael Paquier
> Discussion: https://postgr.es/m/0c96b653-4696-d4b4-6b5d-78143175d113@postgrespro.ru
> Backpatch-through: 9.3

I'm unhappy this approach was taken over objections. Without a real
warning.

Even leaving the crummyness aside, did you check other users of
XLOG_RUNNING_XACTS, e.g. logical decoding?

Greetings,

Andres Freund
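
Editorial illustration, not part of the original message: the quoted commit message describes discarding duplicate XIDs when the initial snapshot is applied on the standby. The general idea can be sketched as "sort the array, then skip any entry equal to its predecessor". This is a minimal standalone sketch and not the PostgreSQL implementation; `xid_t`, `xid_cmp`, and `dedup_xids` are hypothetical names, and the plain integer comparison used here is an assumption that ignores XID wraparound.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a transaction ID type (assumption). */
typedef uint32_t xid_t;

/* Plain numeric ordering; deliberately ignores XID wraparound. */
static int
xid_cmp(const void *a, const void *b)
{
    xid_t x = *(const xid_t *) a;
    xid_t y = *(const xid_t *) b;

    return (x > y) - (x < y);
}

/*
 * Sort xids in place and drop entries equal to their predecessor.
 * Returns the number of distinct XIDs left at the front of the array.
 */
size_t
dedup_xids(xid_t *xids, size_t n)
{
    size_t out = 0;

    assert(xids != NULL || n == 0);
    if (n == 0)
        return 0;

    qsort(xids, n, sizeof(xid_t), xid_cmp);

    for (size_t i = 0; i < n; i++)
    {
        /* After sorting, duplicates are adjacent, so one comparison suffices. */
        if (i > 0 && xids[i] == xids[i - 1])
            continue;
        xids[out++] = xids[i];
    }
    return out;
}
```

Doing the filtering on the consumer side, as in this sketch, costs one extra comparison per entry after a sort, and it leaves the producer free of any additional work while it holds a lock, which is the trade-off the commit message argues for.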