Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
От | Peter Geoghegan |
---|---|
Тема | Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum |
Дата | |
Msg-id | CAH2-WzmeBU-WgDY_X=EAGkLf4bw8gPRB62HRvt++JHU3TsERgA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum (Dmitry Dolgov <9erthalion6@gmail.com>) |
Список | pgsql-bugs |
On Sat, Nov 13, 2021 at 7:05 AM Dmitry Dolgov <9erthalion6@gmail.com> wrote: > I've got curious if modifying the Alexander's test case could reveal > something interesting, and sprinkled it with savepoints and rollbacks. > Almost immediately a new problem has manifested itself, although the > crash has nothing to do with the disconnected tuples as far as I can > tell -- still probably worth mentioning. In this case vacuum invoked > lazy_scan_prune, and during the first scan one of the chains had a > HEAPTUPLE_DEAD at the third position. The processing flow fell through > to heap_prune_record_prunable and crashed on an assert with an > InvalidTransactionId: Is this just with the bugfix applied? I think that it is. Looks like a minor bug to me. I think that I need to consistently "break" in the DEAD case, to avoid ending up here. In other words, it should not literally be "reinterpreted" as RECENTLY_DEAD by falling through in the switch statement (though the concept of reinterpreting certain DEAD tuples as RECENTLY_DEAD still seems perfectly sound). Here's why the assertion (invalid xmax/update xid cannot be used in heap_prune_record_prunable() call) fails: DEAD means that you might not have a valid update XID -- aborted update is what we expect. But RECENTLY_DEAD means that there must have been a deleter xact, and that the xact must have committed (can't have been that the inserter aborted). This is a consequence of the fact that the tuple is at least still visible to somebody (or could be), unlike in the DEAD case. And so xmax must be a valid XID, and so the existing RECENTLY_DEAD case handling can legitimately always expect that. But I cannot (and should not) allow a call to heap_prune_record_prunable() with a DEAD-to-HTSV tuple, even when I "reinterpret" it as RECENTLY_DEAD in order to make a clean determination of the tuple to delete up until for the entire HOT chain. -- Peter Geoghegan
В списке pgsql-bugs по дате отправления: