Re: relfrozenxid may disagree with row XIDs after 1ccc1e05ae
From | Robert Haas |
---|---|
Subject | Re: relfrozenxid may disagree with row XIDs after 1ccc1e05ae |
Date | |
Msg-id | CA+TgmoYSM234TDJCyjAHch9igHP2tahXXENc8hBT+BHwcMkT8w@mail.gmail.com |
In response to | Re: relfrozenxid may disagree with row XIDs after 1ccc1e05ae (Peter Geoghegan <pg@bowt.ie>) |
List | pgsql-bugs |
On Fri, Mar 29, 2024 at 1:17 PM Peter Geoghegan <pg@bowt.ie> wrote:
> FWIW I never thought that the order that we called
> vacuum_get_cutoffs() relative to when we call GlobalVisTestFor() was
> directly significant (though I did think that about the order that we
> attain VACUUM's rel_pages and the vacuum_get_cutoffs() call). I can't
> have thought that, because clearly GlobalVisTestFor() just returns a
> pointer, and so cannot directly affect backend local state.

Hmm, OK.

> It was clear that this is an important issue, from an early stage.
> Pre-release 14 had 2 or 3 bugs that all had the same symptom:
> lazy_scan_prune would loop forever. This was true even though each of
> the bugs had fairly different underlying causes (all tied to
> dc7420c2c). I figured that there might well be more bugs like that in
> the future.

Looks like you were right.

> I have every reason to believe that the remaining problems in this
> area are extremely rare. I wonder if it would make sense to focus on
> making the infinite loop behavior in lazy_scan_prune just throw an
> error.
>
> I now fear that that'll be harder than one might think. At the time
> that I added the looping behavior (in commit 8523492d), I believed
> that the only "legitimate" reason that it could ever be needed was the
> same reason why we needed the old tupgone behavior (to deal with
> concurrently-inserted tuples from transactions that abort in flight).
> But now I worry that it's actually protective, in some way that isn't
> generally understood. And so it might be that converting the retry
> into a hard error (e.g., erroring-out after MaxHeapTuplesPerPage
> retries) will create new problems.

It also sounds like it would boil down to "ERROR: our code sucks", so
count me as not a fan of that approach. As much as I don't like the
idea of significant changes to the back-branches, I think I like that
idea even less.
On the other hand, I also don't have an idea that I do like right now,
so it's probably too early to decide anything just yet. I'll try to
find more time to study this (and I hope others do the same).

--
Robert Haas
EDB: http://www.enterprisedb.com
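
For readers unfamiliar with the proposal discussed in this message (capping the retry loop in lazy_scan_prune and raising an error instead of looping forever), the following is a minimal standalone C sketch of a bounded retry loop. It is not PostgreSQL code and does not reflect how lazy_scan_prune is written: `scan_page_bounded`, `NeedsRetryFn`, `ScanResult`, and `SKETCH_MAX_RETRIES` are hypothetical names, and the callback merely stands in for "pruning and the later visibility check disagree, so the page must be rescanned". The cap of 291 is an assumed stand-in for MaxHeapTuplesPerPage, whose real value depends on the block size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed stand-in for MaxHeapTuplesPerPage; the real macro depends on BLCKSZ. */
#define SKETCH_MAX_RETRIES 291

typedef enum
{
    SCAN_OK,            /* the page was processed consistently */
    SCAN_RETRY_LIMIT    /* the retry cap was hit; the caller would report an error */
} ScanResult;

/*
 * Hypothetical callback: returns true when one pass over the page found a
 * disagreement and the whole page has to be processed again.
 */
typedef bool (*NeedsRetryFn) (void *state);

/*
 * Run up to max_retries passes. *attempts is set to the number of passes
 * actually made, so the caller can include it in an error message.
 */
static ScanResult
scan_page_bounded(NeedsRetryFn needs_retry, void *state,
                  int max_retries, int *attempts)
{
    assert(max_retries > 0);

    for (*attempts = 1; *attempts <= max_retries; (*attempts)++)
    {
        if (!needs_retry(state))
            return SCAN_OK;
    }

    /* The loop overshoots by one; report the number of passes made. */
    *attempts = max_retries;
    return SCAN_RETRY_LIMIT;
}

/* Example callback: disagrees a fixed number of times, then settles. */
static bool
transient_disagreement(void *state)
{
    int *remaining = (int *) state;

    if (*remaining > 0)
    {
        (*remaining)--;
        return true;
    }
    return false;
}

/* Example callback: never settles, modelling the infinite-loop bugs. */
static bool
always_disagree(void *state)
{
    (void) state;
    return true;
}
```

With `transient_disagreement` the loop ends normally after a few passes; with `always_disagree` it stops at the cap and returns `SCAN_RETRY_LIMIT` rather than spinning. The concern raised in the quoted text still applies: if the retry is protective in some way that is not well understood, turning the cap into a hard error could introduce new failures of its own.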