Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
| От | Peter Geoghegan |
|---|---|
| Тема | Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum |
| Дата | |
| Msg-id | CAH2-WznUEgHVGuMp5cRC9YnrNKGzfORZOUmmu1ijaNEn-sQyYg@mail.gmail.com обсуждение исходный текст |
| Ответ на | Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum (Peter Geoghegan <pg@bowt.ie>) |
| Ответы |
Re: BUG #17255: Server crashes in index_delete_sort_cmp() due to race condition with vacuum
|
| Список | pgsql-bugs |
On Thu, Nov 11, 2021 at 8:22 PM Peter Geoghegan <pg@bowt.ie> wrote: > On Thu, Nov 11, 2021 at 4:58 PM Peter Geoghegan <pg@bowt.ie> wrote: > > > What prevents the scenario that some other backend e.g. has a snapshot with > > > xmin=xmax=RECENTLY_DEAD-row. If the RECENTLY_DEAD row has an xid that is later > > > than the DEAD row, this afaict would make it perfectly legal to prune the DEAD > > > row, but *not* the RECENTLY_DEAD one. > > > > I'll need to think about this very carefully. I didn't think it was > > worth blocking v3 on, though naturally it's a big concern. > > If we're to traverse HOT chains right to the end in > heap_prune_chain(), reading even LIVE tuples (per the approach > proposed in my bugfix patch), we probably need to be more careful > about concurrently aborted xacts -- relying on the usual > !HeapTupleHeaderIsHotUpdated(htup) test doesn't seem safe. I wonder if we're approaching this business with "RECENTLY_DEAD can be upgraded to DEAD" in entirely the wrong way. Why not just not do that at all anymore, on the off chance that it's unsafe? Why even take a small chance? Our decision has to work at the level of the whole entire HOT chain, and it seems to me that we should make that as simple as possible. I'm pretty sure that we're giving up nothing this way. We can just take the conservative position that it's never okay for heap_prune_chain() to consider a heap-only tuple from a validated HOT chain DEAD, unless A.) HTSV says it is DEAD when asked, and B.) HTSV said the same thing (tuple DEAD) about any and all earlier heap-only tuples in the chain. Moreover, if HTSV says LIVE/RECENTLY_DEAD/whatever about one tuple in the chain, then we refuse to treat *ALL* successor tuples in the same chain as DEAD -- even when HTSV says that they're DEAD directly. As long as these DEAD-to-HTSV heap-only tuples appear to be from the same original HOT chain, we don't need to change our mind about the HOT chain. The original structure of heap_prune_chain() from the HOT commit in 2007 had more or less the same code and structure as it has now, but almost everything was in a critical structure -- the state arrays and so on only came about a year later, in commit 6f10eb2111. The proposed bug fix more or less finishes the work of the second commit, which didn't go far enough. As long as we are starting by building a consistent picture of valid HOT chains on the page, and only later handle disconnected heap-only tuples, everything works out. -- Peter Geoghegan
В списке pgsql-bugs по дате отправления: