Re: [GENERAL] PANIC: heap_update_redo: no block
From | Tom Lane |
---|---|
Subject | Re: [GENERAL] PANIC: heap_update_redo: no block |
Date | |
Msg-id | 4523.1143575061@sss.pgh.pa.us |
In response to | Re: [GENERAL] PANIC: heap_update_redo: no block (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
I wrote:
> * log_heap_update decides that it can set XLOG_HEAP_INIT_PAGE instead
> of storing the full destination page, if the destination contains only
> the single tuple being moved. This is fine, except it also resets the
> buffer indicator for the *source* page, which is wrong --- that page
> may still need to be re-generated from the xlog record. This is the
> proximate cause of the bug report that started this thread.

I have to retract that particular bit of analysis: I had misread the
log_heap_update code. It seems to be doing the right thing, and in any
case, given Alex's output

LOG: REDO @ D/19176644; LSN D/191766A4: prev D/19176610; xid 81148979: Heap - move: rel 1663/16386/16559898; tid 1/1; new 0/10

we can safely conclude that log_heap_update did not set the INIT_PAGE
bit, because the "new" tid doesn't have offset=1. (The fact that the
WAL_DEBUG printout doesn't report the bit's state is an oversight I plan
to fix, but anyway we can be pretty sure it's not set here.)

What we should be seeing, and don't see, is an indication of a backup
block attached to this WAL record. Furthermore, I don't see any
indication of a backup block attached to *any* of the WAL records in
Alex's printout. The only conclusion I can draw is that he had
full_page_writes turned OFF, and as we have just realized that that
setting is completely unsafe, that is the explanation for his failure.

> Clearly, we need to go through the xlog code with a fine tooth comb
> and convince ourselves that all pages touched by any xlog record will
> be properly reconstituted if they've later been truncated off. I have
> not yet examined any of the code except the above.

I've finished going through the xlog code looking for related problems,
and AFAICS this is the score:

* full_page_writes = OFF doesn't work.

* btree_xlog_split and btree_xlog_delete_page should pass TRUE not FALSE
  to XLogReadBuffer for all pages that they are going to re-initialize.

* the recently-added gist xlog code is badly broken --- it pays no
  attention whatever to preventing torn pages :-(. It's not going to be
  easy to fix, either, because the page split code assumes that a single
  WAL record can describe changes to any number of pages, which is not
  the case.

Everything else seems to be getting it right.

regards, tom lane
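
The offset argument in the message (INIT_PAGE is only possible when the moved tuple is the sole tuple on the destination page, so the "new" tid would have offset 1) can be checked mechanically against a WAL_DEBUG line. A minimal sketch in Python, assuming the line layout shown in Alex's output; the regular expression and the function names `new_tid` and `init_page_possible` are hypothetical helpers for illustration, not part of PostgreSQL:

```python
import re

# Hypothetical pattern for the "tid B/O; new B/O" tail of a WAL_DEBUG
# heap update/move line. Whitespace after "new" is treated as optional.
TID_RE = re.compile(r"tid (\d+)/(\d+); new\s*(\d+)/(\d+)")

def new_tid(line):
    """Return (block, offset) of the destination tuple in a WAL_DEBUG line."""
    m = TID_RE.search(line)
    if m is None:
        raise ValueError("no tid fields found in line")
    _old_block, _old_off, new_block, new_off = map(int, m.groups())
    return new_block, new_off

def init_page_possible(line):
    """INIT_PAGE requires the new tuple to be the only one on its page,
    so (per the reasoning in the message) its offset would have to be 1."""
    _block, offset = new_tid(line)
    return offset == 1

LINE = ("LOG: REDO @ D/19176644; LSN D/191766A4: prev D/19176610; "
        "xid 81148979: Heap - move: rel 1663/16386/16559898; "
        "tid 1/1; new 0/10")

if __name__ == "__main__":
    print(new_tid(LINE), init_page_possible(LINE))
```

This only rules the bit out; an offset of 1 would not by itself prove that INIT_PAGE was set, since the printout does not report the bit.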