Re: Truncation failure in autovacuum results in data corruption (duplicate keys)
From | Tom Lane |
---|---|
Subject | Re: Truncation failure in autovacuum results in data corruption (duplicate keys) |
Date | |
Msg-id | 15690.1524084557@sss.pgh.pa.us |
In response to | Re: Truncation failure in autovacuum results in data corruption (duplicate keys) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Truncation failure in autovacuum results in data corruption(duplicate keys)
Re: Truncation failure in autovacuum results in data corruption(duplicate keys) |
List | pgsql-hackers |
I wrote:
> Relation truncation throws away the page image in memory without ever
> writing it to disk. Then, if the subsequent file truncate step fails,
> we have a problem, because anyone who goes looking for that page will
> fetch it afresh from disk and see the tuples as live.

> There are WAL entries recording the row deletions, but that doesn't
> help unless we crash and replay the WAL.

> It's hard to see a way around this that isn't fairly catastrophic for
> performance :-(.

Just to throw out a possibly-crazy idea: maybe we could fix this by PANIC'ing if truncation fails, so that we replay the row deletions from WAL. Obviously this would be intolerable if the case were frequent, but we've had only two such complaints in the last nine years, so maybe it's tolerable. It seems more attractive than taking a large performance hit on truncation speed in normal cases, anyway.

A gotcha to be concerned about is what happens if we replay from WAL, come to the XLOG_SMGR_TRUNCATE WAL record, and get the same truncation failure again, which is surely not unlikely. PANIC'ing again will not do. I think we could probably handle that by having the replay code path zero out all the pages it was unable to delete; as long as that succeeds, we can call it good and move on.

Or maybe just do that in the mainline case too? That is, if ftruncate fails, handle it by zeroing the undeletable pages and pressing on?

> But in any case it's wrapped up in order-of-operations
> issues. I've long since forgotten the details, but I seem to have thought
> that there were additional order-of-operations hazards besides this one.

It'd be a good idea to redo that investigation before concluding this issue is fixed, too. I was not thinking at the time that it'd be years before anybody did anything, or I'd have made more notes.

regards, tom lane
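
For readers following the thread, the "zero the undeletable pages and press on" fallback floated above could be sketched roughly as follows. This is only an illustration using plain POSIX calls, not PostgreSQL's storage-manager code: `truncate_or_zero`, `failing_truncate`, the `truncate_fn` hook, and `SKETCH_BLCKSZ` are all hypothetical names invented for the sketch, and it assumes that an all-zeros page is read back as empty rather than as holding live tuples.

```c
#define _XOPEN_SOURCE 700   /* for ftruncate, pwrite, fsync declarations */

#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_BLCKSZ 8192  /* assumed page size for this sketch */

/* Hypothetical hook so a caller can substitute a truncate that fails. */
typedef int (*truncate_fn)(int fd, off_t length);

/* Stand-in for a truncate step that fails, for experimenting with the fallback. */
int
failing_truncate(int fd, off_t length)
{
    (void) fd;
    (void) length;
    errno = EIO;
    return -1;
}

/*
 * Try to shorten the file to new_nblocks pages.  If the truncate call fails,
 * overwrite every byte past the new end with zeros instead, so that a later
 * read from disk cannot resurrect the old page contents.
 *
 * Returns 0 if the file was truncated, 1 if truncation failed but the
 * trailing pages were zeroed, and -1 if the fallback failed as well.
 */
int
truncate_or_zero(int fd, off_t new_nblocks, truncate_fn do_truncate)
{
    static const char zero_page[SKETCH_BLCKSZ] = {0};
    struct stat st;
    off_t       new_len = new_nblocks * SKETCH_BLCKSZ;
    off_t       off;

    if (do_truncate(fd, new_len) == 0)
        return 0;               /* normal case: the pages are gone */

    if (fstat(fd, &st) != 0)
        return -1;

    /* Zero each page (or partial page) that should have been removed. */
    for (off = new_len; off < st.st_size; off += SKETCH_BLCKSZ)
    {
        size_t      len = SKETCH_BLCKSZ;

        if (st.st_size - off < (off_t) SKETCH_BLCKSZ)
            len = (size_t) (st.st_size - off);
        if (pwrite(fd, zero_page, len, off) != (ssize_t) len)
            return -1;
    }

    /* Make sure the zeroed pages reach disk before reporting success. */
    if (fsync(fd) != 0)
        return -1;
    return 1;
}
```

Calling `truncate_or_zero(fd, 2, failing_truncate)` on a four-page file leaves the file length unchanged but zeroes pages 2 and 3, while passing `ftruncate` as the hook takes the ordinary path. The sketch says nothing about the in-memory buffer handling or the other order-of-operations hazards mentioned in the message.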