Re: BUG #13667: SSI violation...
От | Kevin Grittner |
---|---|
Тема | Re: BUG #13667: SSI violation... |
Дата | |
Msg-id | 1285273220.277372.1446226561203.JavaMail.yahoo@mail.yahoo.com обсуждение исходный текст |
Ответ на | Re: BUG #13667: SSI violation... (Kevin Grittner <kgrittn@ymail.com>) |
Ответы |
Re: BUG #13667: SSI violation...
|
Список | pgsql-bugs |
On Friday, October 9, 2015 4:49 PM, Kevin Grittner <kgrittn@ymail.com> wrote: > I agree this is a bug, and it behave like a race condition with a > very narrow window of time for the second process to hit. It has > proven very hard to find the cause. Thanks to Thomas Munro joining me in a 2.5 day marathon hunt for this bug, we have found it and squashed it with the attached patch. The issue in heap_insert() in heapam.c was the bug that was actually causing the failures in the specific tests we had, but other locations doing insert, update, or delete had similar bugs, and we found a few stray issues in predicate.c that are also fixed here. The basic problem was that in the initial implementation we tried a little too hard to optimize, at the expense of correctness. When we were going to insert, for example, we checked for an existing predicate lock on the relation first, to avoid the effort of inserting the tuple, with WAL-logging, if we were just going to roll back the transaction anyway because we found a rw-conflict that completed a "dangerous structure". The problem is, it allowed this sequence to create an undetected serialization anomaly: T1 ---------- T2 ---------- lock read checklocks lock read insert At this point things are hosed, regardless of timings past this. The window in the test was small because T2 has to acquire its SIReadLock and scan the whole table before T1 manages to insert a tuple, but with enough contention T1 can stall long enough to see this happen. When T2 checks locks prior to its insert it will detect the rw-conflict from T1 to T2, but the rw-conflict in the other direction won't be detected, so no dangerous structure is formed and nothing is rolled back. Moving the check past the insert would fix the problem, but the reason it was put in front is enshrined in a comment at each problem location; for example: * We're about to do the actual insert -- but check for conflict first, to * avoid possibly having to roll back work we've just done. These checks are about as close to free as you can get if the transaction doing the check is not serializable; it doesn't even need to take out a LW lock to determine there is nothing to be done. The reason given in the comment still has merit for serializable transactions; even for them the check is orders of magnitude cheaper than a WAL logged tuple insert. It only requires an occasional serialization failure detection there to come out ahead. So rather than move the existing check, we added a recheck after. Barring objections I will push this tomorrow, including back-patching it to all supported branches. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Вложения
В списке pgsql-bugs по дате отправления: