Re: BUG #16582: Logical index corruption leading to apparent index scan infinite loop
From | James Lucas |
---|---|
Subject | Re: BUG #16582: Logical index corruption leading to apparent index scan infinite loop |
Date | |
Msg-id | CAAFmbbOnCtds-Q5vOAmTMBm5sAvBpQhc474zq+LMCidSjgt11A@mail.gmail.com |
In reply to | Re: BUG #16582: Logical index corruption leading to apparent index scan infinite loop (James Lucas <jlucasdba@gmail.com>) |
List | pgsql-bugs |
Forgot to say, I don't think I can run bt_index_parent_check() right
now due to the broader locks required. I will try to get a run in if
I get an opportunity.

Thanks,
James

On Mon, Aug 17, 2020 at 10:51 AM James Lucas <jlucasdba@gmail.com> wrote:
>
> Hi Peter,
>
> I re-ran with DEBUG2 messages enabled. Got a bunch of output, but the
> last few lines are like this for each index:
>
> DEBUG: level 965868789 leftmost page of index "xxxxx" was found
> deleted or half dead
> DETAIL: Deleted page found when building scankey from right sibling.
> DEBUG: level 966240004 leftmost page of index "xxxxx" was found
> deleted or half dead
> DETAIL: Deleted page found when building scankey from right sibling.
> ERROR: cross page item order invariant violated for index "xxxxx"
> DETAIL: Last item on page tid=(xx,xx) page lsn=xxxxxxxxxx
>
> DEBUG: level 967745369 leftmost page of index "xxxxx" was found
> deleted or half dead
> DETAIL: Deleted page found when building scankey from right sibling.
> DEBUG: level 967746918 leftmost page of index "xxxxx" was found
> deleted or half dead
> DETAIL: Deleted page found when building scankey from right sibling.
> ERROR: cross page item order invariant violated for index "xxxxx"
> DETAIL: Last item on page tid=(xx,xx) page lsn=xxxxxxxxxx
>
>
> Not sure if pageinspect might be able to tell anything else useful?
> I'd like to find the root cause of the corruption if possible, so this
> doesn't happen in other databases.
>
> Also wanted to see if it might be a good idea to add a
> CHECK_FOR_INTERRUPTS call to _bt_moveright() so if this does happen
> again, at least the session would be killable. I don't have enough
> background in the code to know where it's safe to add, or I'd submit a
> patch.
>
> Thanks,
> James
>
> On Fri, Aug 14, 2020 at 4:33 PM Peter Geoghegan <pg@bowt.ie> wrote:
> >
> > On Fri, Aug 14, 2020 at 2:03 PM PG Bug reporting form
> > <noreply@postgresql.org> wrote:
> > > The table has two indexes, so I decided to scan both indexes on all
> > > partitions with the bt_index_check function from the amcheck extension. I
> > > identified one partition where both indexes throw the following result:
> > > ERROR: cross page item order invariant violated for index "xxxxx"
> > > DETAIL: Last item on page tid(xx,xx) page lsn=xxxxxxxxxx
> >
> > This sounds very much like an index with sibling pages that are in the
> > wrong order relative to each other. That's totally consistent with
> > what you describe with _bt_moveright() -- circular sibling links can
> > cause it to just keep going.
> >
> > It's possible that you'll get a better error with
> > bt_index_parent_check(), which might be worth trying. But it probably
> > won't give you any additional information.
> >
> > Note that there is DEBUG1 and DEBUG2 output from amcheck, which might
> > give you a few more details. You can "set client_min_messages =
> > 'debug2'" in an interactive session that runs bt_index_check() to see
> > some additional context. Again, this is unlikely to make all that much
> > difference.
> >
> > --
> > Peter Geoghegan
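
To illustrate the point made in the thread that circular sibling links can make a move-right loop "just keep going", here is a minimal toy model in Python. It is not PostgreSQL's `_bt_moveright()` and does not reflect its real data structures: `Page`, `move_right`, and the block numbers are hypothetical names invented for this sketch, and the visited-set cycle check is an illustrative substitute for, not an implementation of, the `CHECK_FOR_INTERRUPTS` idea raised above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Page:
    """Hypothetical stand-in for a B-tree leaf page."""
    high_key: int          # upper bound of keys stored on this page
    right: Optional[int]   # block number of the right sibling, None if rightmost


def move_right(pages: Dict[int, Page], start: int, key: int) -> int:
    """Follow right-sibling links until a page whose high key covers `key`.

    A healthy sibling chain never revisits a page, so seeing one twice
    means the links form a cycle; without this check the loop would spin
    forever on such a chain.
    """
    seen = set()
    blkno = start
    while True:
        if blkno in seen:
            raise RuntimeError(f"sibling links form a cycle at block {blkno}")
        seen.add(blkno)
        page = pages[blkno]
        # Stop when the key fits on this page or there is nowhere left to go.
        if key <= page.high_key or page.right is None:
            return blkno
        blkno = page.right


# Healthy chain: 1 -> 2 -> 3 -> end
healthy = {1: Page(10, 2), 2: Page(20, 3), 3: Page(30, None)}
# Corrupt chain: block 3 points back to block 1, so no page is ever rightmost
corrupt = {1: Page(10, 2), 2: Page(20, 3), 3: Page(30, 1)}
```

On the healthy chain a search for any key terminates at some page. On the corrupt chain, a key larger than every high key never finds a stopping page, which is the kind of non-terminating scan described in the bug report; the toy version raises an error instead of looping.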