Re: BUG #17245: Index corruption involving deduplicated entries
От | Peter Geoghegan |
---|---|
Тема | Re: BUG #17245: Index corruption involving deduplicated entries |
Дата | |
Msg-id | CAH2-WzkN5aESSLfK7-yrYgsXxYUi__VzG4XpZFwXm98LUtoWuQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: BUG #17245: Index corruption involving deduplicated entries (Andres Freund <andres@anarazel.de>) |
Ответы |
Re: BUG #17245: Index corruption involving deduplicated entries
Re: BUG #17245: Index corruption involving deduplicated entries |
Список | pgsql-bugs |
On Fri, Oct 29, 2021 at 9:12 PM Andres Freund <andres@anarazel.de> wrote: > It looks like a v14 issue. I can't reproduce it in 13, and I think a > change in > > commit b4af70cb210393c9c8f41643acf6b213e21178e7 > Author: Peter Geoghegan <pg@bowt.ie> > Date: 2021-04-05 13:21:44 -0700 > > Simplify state managed by VACUUM. > > which changed the logic for which relations are done in the leader: > without realizing that the "shared_indstats == NULL || " piece is > important to handle parallel-safe but too-small indexes correctly. This was a simple blunder on my part. I didn't intend to change anything about the behavior of parallel VACUUM, and so this commit was just about the only one of mine from Postgres 14 that I didn't consider as a possibility, even once. Mea culpa. Attached is a draft patch that fixes the problem. Also attached is a second patch. This adds assertions to heap_index_delete_tuples() to catch cases where a heap TID in an index points to an LP_UNUSED item in the heap (which is what this bug looked like, mostly). It also checks for certain more or less equivalent inconsistencies: the case where a heap TID in an index points to a line pointer that's past the end of the heap page's line pointer array, and the case where a heap TID in an index points directly to a heap-only tuple. Putting these new assertions in heap_index_delete_tuples() makes a lot of sense. The interlocking rules between an index and a heap relation are very simple here. They're also uniform across all supported index access methods (of those that support index tuple deletion, currently just nbtree, hash, and GiST). This is one of the very few code paths where there is a buffer lock (not just a pin) on the index page held throughout checking the heap; there is clearly no question of spurious assertion failures due to a concurrent vacuum. Note that the index deletion stuff I worked on in Postgres 14 batches-up TIDs in various ways, and so in practice heap_index_delete_tuples() will tend to pass over many of the TIDs from the index page, in many important cases (not so much with GiST and hash right now, but I expect that to change in the future). I want to once again thank Kamigishi Rei for her tireless work on this. If she didn't go to the trouble of providing us with so much help, then it would have probably taken significantly longer to figure out what was going on here. Also want to thank Andres. -- Peter Geoghegan
Вложения
В списке pgsql-bugs по дате отправления: