Re: Behavior for crash recovery when it detects a corrupt WAL record
From | Heikki Linnakangas |
---|---|
Subject | Re: Behavior for crash recovery when it detects a corrupt WAL record |
Date | |
Msg-id | 50758B45.9000901@vmware.com |
In reply to | Re: Behavior for crash recovery when it detects a corrupt WAL record (Amit Kapila <amit.kapila@huawei.com>) |
List | pgsql-hackers |
On 10.10.2012 17:37, Amit Kapila wrote:
> On Tuesday, October 09, 2012 7:38 PM Heikki Linnakangas wrote:
>> We rely on the CRC to detect end of WAL during recovery. If the
>> system crashes while the WAL is being flushed to disk, it's normal that
>> there's a corrupt (ie. partially written) record at the end of the WAL.
>> This is a common technique used by pretty much every system with a
>> transaction log / journal.
>
> Yeah, Can't we check if there is a next valid page, then it can be
> derived that current page has some corruption and not a partial page
> write problem.

No. The OS or disk controller can flush the pages out-of-order, so on
recovery, it's entirely possible that the next page is valid even if the
previous one is not.

BTW, this means that the CRC on WAL records can *not* be used to detect
random corruption of the WAL, because it will be confused with
end-of-WAL. I don't think many people realize that. You will have to use
a filesystem with checksums if you want to detect random bit errors etc.
in the WAL. In crash recovery, anyway; in archive recovery or
replication you can make more assumptions.

- Heikki
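
To illustrate the point about CRC and end-of-WAL, here is a minimal sketch in Python. It is not PostgreSQL's WAL format or recovery code: the record layout (length, CRC-32, payload) and the names `append_record` and `replay` are made up for the example. It only shows that a replay loop which treats the first bad CRC as end of log reacts the same way to a torn final write and to a bit flip in the middle of the log.

```python
import struct
import zlib

# Toy record header (an assumption for this sketch, not the real WAL layout):
# 4-byte payload length followed by 4-byte CRC-32 of the payload.
HEADER = struct.Struct("<II")


def append_record(log: bytearray, payload: bytes) -> None:
    """Append one record (header + payload) to the in-memory log."""
    log += HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def replay(log: bytes) -> list[bytes]:
    """Return payloads up to the first record that is short or fails its CRC.

    The first invalid record is treated as end of log, which is the
    convention described in the message above.
    """
    records = []
    pos = 0
    while pos + HEADER.size <= len(log):
        length, crc = HEADER.unpack_from(log, pos)
        start = pos + HEADER.size
        payload = bytes(log[start:start + length])
        if len(payload) < length or zlib.crc32(payload) != crc:
            break  # cannot tell a torn write from corruption here
        records.append(payload)
        pos = start + length
    return records


log = bytearray()
for p in (b"insert 1", b"insert 2", b"insert 3"):
    append_record(log, p)

# Case 1: crash while flushing, the last record is only partially on disk.
torn = bytes(log[:-3])

# Case 2: a single bit flips in the payload of the second record.
flipped = bytearray(log)
second_payload_offset = HEADER.size + len(b"insert 1") + HEADER.size
flipped[second_payload_offset] ^= 0x01

print(replay(torn))            # stops before the torn third record
print(replay(bytes(flipped)))  # stops before the damaged second record
```

In the second case the third record is intact on disk, yet replay never reaches it, and nothing reports an error. Catching that kind of damage needs a check outside the log's own CRC, such as a checksumming filesystem.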