Re: corrupt pages detected by enabling checksums
From | Jim Nasby |
---|---|
Subject | Re: corrupt pages detected by enabling checksums |
Date | |
Msg-id | 518AD80D.1060904@nasby.net |
In reply to | Re: corrupt pages detected by enabling checksums (Jeff Davis <pgsql@j-davis.com>) |
Responses | Re: corrupt pages detected by enabling checksums |
List | pgsql-hackers |
On 4/5/13 6:39 PM, Jeff Davis wrote:
> On Fri, 2013-04-05 at 10:34 +0200, Florian Pflug wrote:
>> Maybe we could scan forward to check whether a corrupted WAL record is
>> followed by one or more valid ones with sensible LSNs. If it is,
>> chances are high that we haven't actually hit the end of the WAL. In
>> that case, we could either log a warning, or (better, probably) abort
>> crash recovery.
>
> +1.
>
>> Corruption of fields which we require to scan past the record would
>> cause false negatives, i.e. no trigger an error even though we do
>> abort recovery mid-way through. There's a risk of false positives too,
>> but they require quite specific orderings of writes and thus seem
>> rather unlikely. (AFAICS, the OS would have to write some parts of
>> record N followed by the whole of record N+1 and then crash to cause a
>> false positive).
>
> Does the xlp_pageaddr help solve this?
>
> Also, we'd need to be a little careful when written-but-not-flushed WAL
> data makes it to disk, which could cause a false positive and may be a
> fairly common case.

Apologies if this is a stupid question, but is this mostly an issue due to torn pages? IOW, if we had a way to ensure we never see torn pages, would that mean an invalid CRC on a WAL page indicated there really was corruption on that page?

Maybe it's worth putting (yet more) thought into the torn page issue... :/

--
Jim C. Nasby, Data Architect
jim@nasby.net
512.569.9461 (cell)
http://jim.nasby.net
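
To make the scan-forward idea quoted above concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the record layout (`HEADER`), the helpers `make_record` and `read_record`, and the function `corruption_suspected` are all hypothetical and have nothing to do with PostgreSQL's actual `XLogRecord` format or recovery code. It also probes byte by byte, whereas a real implementation would presumably rely on record alignment and page headers such as `xlp_pageaddr`.

```python
import struct
import zlib

# Hypothetical toy layout (NOT PostgreSQL's XLogRecord):
# crc32 (4 bytes) | payload length (4 bytes) | LSN (8 bytes) | payload
HEADER = struct.Struct("<IIQ")


def make_record(lsn: int, payload: bytes) -> bytes:
    """Serialize one toy record; the CRC covers length, LSN and payload."""
    body = struct.pack("<IQ", len(payload), lsn) + payload
    return struct.pack("<I", zlib.crc32(body)) + body


def read_record(buf: bytes, pos: int):
    """Return (lsn, next_pos) if a valid record starts at pos, else None."""
    if pos + HEADER.size > len(buf):
        return None
    crc, length, lsn = HEADER.unpack_from(buf, pos)
    end = pos + HEADER.size + length
    if length == 0 or end > len(buf):
        return None
    if zlib.crc32(buf[pos + 4:end]) != crc:
        return None
    return lsn, end


def corruption_suspected(buf: bytes) -> bool:
    """True if a valid record with a newer LSN exists past the first bad one."""
    pos, last_lsn = 0, 0
    # Replay phase: stop at the first record that fails validation or whose
    # LSN does not advance (e.g. stale data left in a recycled segment).
    while True:
        rec = read_record(buf, pos)
        if rec is None or rec[0] <= last_lsn:
            break
        last_lsn, pos = rec
    # Scan-forward phase: look for any later valid record with a sensible
    # (strictly newer) LSN. Finding one suggests we stopped on corruption
    # rather than at the true end of the log.
    for probe in range(pos + 1, len(buf) - HEADER.size + 1):
        rec = read_record(buf, probe)
        if rec is not None and rec[0] > last_lsn:
            return True
    return False
```

With three toy records where a payload byte of the second is flipped, `corruption_suspected` returns `True` because the third record is still valid and has a newer LSN. A log that simply ends, or that is followed by zeros or by a stale record with an older LSN, returns `False`. The false negatives and false positives discussed in the quoted text apply here too: if the length field of the bad record is what got damaged, or a newer record reached disk before an older one, this check gives the wrong answer.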