Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request)
От | Alvaro Herrera |
---|---|
Тема | Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request) |
Дата | |
Msg-id | 20151218172413.GS2618@alvherre.pgsql обсуждение исходный текст |
Ответ на | Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request) (Kevin Grittner <kgrittn@gmail.com>) |
Ответы |
Re: BUG #13681: Serialization failures caused by new
multixact code of 9.3 (back-patch request)
|
Список | pgsql-bugs |
Kevin Grittner wrote: > On Thu, Dec 17, 2015 at 12:31 PM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: > > we have a transaction that takes a lock-only multi in table > > users, and then when we do the second update we don't look it up > > because ...?? > > The referencing column value did not change. (We would not have > looked up on the first update either, since it also didn't change > there.) > > > And then this causes the test case not to fail because ..? > > The concurrent update of the referencing table is not seen as a > write conflict (because it didn't actually change). Okay, but I don't understand why that particular patch fixes the problem. That patch was only supposed to skip looking up members of multixacts when they were not interesting; so it's just a performance optimization which shouldn't change the behavior of anything. However in this case, it is changing behavior and I'm concerned that it might have introduced other bugs. Tracing through the test case, what happens is that s1's second update calls SELECT FOR SHARE of the users tuple (via the RI code); this calls ExecLockRows which calls heap_lock_tuple which calls HeapTupleSatisfiesUpdate. The latter returns HeapTupleUpdated, because it sees that the tuple has been modified by a transaction that already committed (s2). But we already hold a for-key-share lock on the old version of the tuple -- that HeapTupleUpdated result value should be examined carefully by heap_lock_tuple to determine that, yes, it can acquire the row lock without raising an error. So, if I patch heap_lock_rows like below, the test passes too: diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c index 559970f..aaf8e8e 100644 --- a/src/backend/access/heap/heapam.c +++ b/src/backend/access/heap/heapam.c @@ -4005,7 +4005,7 @@ l3: UnlockReleaseBuffer(*buffer); elog(ERROR, "attempted to lock invisible tuple"); } - else if (result == HeapTupleBeingUpdated) + else if (result == HeapTupleBeingUpdated || result == HeapTupleUpdated) { TransactionId xwait; uint16 infomask; I think heap_lock_rows had that shape (only consider BeingUpdated as a reason to check/wait) only because it was possible to lock a row that was being locked by someone else, but it wasn't possible to lock a row that had been updated by someone else -- which became possible in 9.3. So this patch is necessary, and not just to fix this one bug. I need to study the implications of this patch more closely, but I think in theory it is correct. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
В списке pgsql-bugs по дате отправления: