Re: BUG #17386: btree index corruption after reindex concurrently on write heavy table
От | Maxim Boguk |
---|---|
Тема | Re: BUG #17386: btree index corruption after reindex concurrently on write heavy table |
Дата | |
Msg-id | CAK-MWwQ7fboHz_s8aTn08i0W2aNH6QoELeWwmH-y80tOfsdDBQ@mail.gmail.com обсуждение исходный текст |
Ответы |
Re: BUG #17386: btree index corruption after reindex concurrently on write heavy table
|
Список | pgsql-bugs |
On Fri, Jan 28, 2022 at 3:42 PM PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:
Bug reference: 17386
Logged by: Maxim Boguk
Email address: maxim.boguk@gmail.com
PostgreSQL version: 14.1
Operating system: Ubuntu Linux
Description:
Hi,
I found multiple cases of Btree index corruption after REINDEX someindex
CONCURRENTLY;.
What I found so far:
1)corruption exists in master, on all replicas and test server recovered
from base backup + wal archive (so it isn't some local hardware error)
2)it doesn't happen every time but at least two cases of corrupted indexes
found after database verification with amcheck.
3)it happen on huge and write heavy table
Attempt fix situation with REINDEX INDEX jobs_pkey CONCURRENTLY;
lead to corrupted index again in different tuple:
ERROR: heap tuple (69306318,15) from table "jobs" lacks matching index tuple within index "jobs_pkey"
SELECT * FROM heap_page_item_attrs(get_raw_page('jobs', 69306318), 'jobs'::regclass) where lp=15;
-[ RECORD 1 ]---
lp | 15
lp_off | 24
lp_flags | 2
lp_len | 0
-[ RECORD 1 ]---
lp | 15
lp_off | 24
lp_flags | 2
lp_len | 0
select ctid,id,updated_at,created_at from jobs where ctid='(69306318,24)'::tid;
-[ RECORD 1 ]--------------------------
ctid | (69306318,24)
id | 26192320674
updated_at | 2022-01-28 13:08:27.224275
created_at | 2021-11-17 14:35:59.441979
-[ RECORD 1 ]--------------------------
ctid | (69306318,24)
id | 26192320674
updated_at | 2022-01-28 13:08:27.224275
created_at | 2021-11-17 14:35:59.441979
And again updated_at happens during the REINDEX run.
select ctid,id,updated_at,created_at from jobs where id=26192320674;
(0 rows)
(0 rows)
SELECT t_ctid, raw_flags, combined_flags FROM heap_page_item_attrs(get_raw_page('jobs', 69306318), 'jobs'::regclass), LATERAL heap_tuple_infomask_flags(t_infomask, t_infomask2) where lp=24;
-[ RECORD 1 ]--+---------------------------------------------------------------------------------------------------------------------------------------------
t_ctid | (69306318,24)
raw_flags | {HEAP_HASNULL,HEAP_HASVARWIDTH,HEAP_XMAX_KEYSHR_LOCK,HEAP_XMAX_LOCK_ONLY,HEAP_XMIN_COMMITTED,HEAP_XMAX_INVALID,HEAP_UPDATED,HEAP_ONLY_TUPLE}
combined_flags | {}
-[ RECORD 1 ]--+---------------------------------------------------------------------------------------------------------------------------------------------
t_ctid | (69306318,24)
raw_flags | {HEAP_HASNULL,HEAP_HASVARWIDTH,HEAP_XMAX_KEYSHR_LOCK,HEAP_XMAX_LOCK_ONLY,HEAP_XMIN_COMMITTED,HEAP_XMAX_INVALID,HEAP_UPDATED,HEAP_ONLY_TUPLE}
combined_flags | {}
Seems something broken in combination of HOT update and REINDEX CONCURRENTLY.
--
Maxim Boguk
Senior Postgresql DBA
https://dataegret.com/
Phone RU: +7 985 433 0000
Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678
LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно когда я так делаю ещё раз?"
Senior Postgresql DBA
https://dataegret.com/
Phone RU: +7 985 433 0000
Phone UA: +380 99 143 0000
Phone AU: +61 45 218 5678
LinkedIn: http://www.linkedin.com/pub/maksym-boguk/80/b99/b1b
Skype: maxim.boguk
"Доктор, вы мне советовали так не делать, но почему мне по-прежнему больно когда я так делаю ещё раз?"
В списке pgsql-bugs по дате отправления: