Re: Hot standby 9.2.6 -> 9.2.6 PANIC: WAL contains references to invalid pages

From Heikki Linnakangas
Subject Re: Hot standby 9.2.6 -> 9.2.6 PANIC: WAL contains references to invalid pages
Date
Msg-id 52D45AD7.9080406@vmware.com
In reply to Re: Hot standby 9.2.6 -> 9.2.6 PANIC: WAL contains references to invalid pages  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Hot standby 9.2.6 -> 9.2.6 PANIC: WAL contains references to invalid pages  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Hot standby 9.2.6 -> 9.2.6 PANIC: WAL contains references to invalid pages  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-bugs
On 01/13/2014 11:02 PM, Andres Freund wrote:
> On 2014-01-13 22:40:32 +0200, Heikki Linnakangas wrote:
>> With RBM_NORMAL_ZERO_OK, AFAICS we're talking about a tiny patch to
>> XLogReadBufferExtended. bufmgr.c doesn't need to do anything about the new
>> mode, as it's XLogReadBuffer that does the check for zero pages. Per
>> attached patch (for demonstration purposes only, you also need to add the
>> new mode to the header file and adjust comments).
>
> I thought about that approach at first as well, but I am not so sure
> it's sufficient. Isn't it quite possible that we'd end up reading a page
> that was *partially* written during a crash and due to that has a
> corrupted checksum? There won't be any protection due to WAL replay/full
> page writes against that case here.

Good point. Normally, we expect the checksum to match on all pages that
we read during WAL replay, because full page writes will initialize any
page that is modified to an untorn state, before it's ever read. But we
can't rely on that in the extra read that btree_xlog_vacuum() does. It's
possible that there's a torn page on disk on block X, and we're
vacuuming page X + 1. The page will be fixed by a later record in the
WAL, before we reach consistency, but the ReadBuffer call from
btree_xlog_vacuum() will throw an error.
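
To make the failure mode concrete, here is a toy model of it. This is not PostgreSQL code: the page layout, the byte-sum checksum and every toy_* name are assumptions made up for illustration. It only shows that a checksum-verified read of a torn block fails until a full-page image for that block has been replayed, which is the ordering problem described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 16
#define TOY_NPAGES 4

/* Hypothetical on-disk page: payload plus a stored checksum. */
typedef struct {
    uint8_t  data[TOY_PAGE_SIZE];
    uint32_t checksum;
} ToyPage;

/* Toy checksum (plain byte sum); real page checksums are much stronger. */
static uint32_t toy_checksum(const uint8_t *data)
{
    uint32_t sum = 0;
    for (int i = 0; i < TOY_PAGE_SIZE; i++)
        sum += data[i];
    return sum;
}

/* A complete write: contents and checksum agree. */
static void toy_write_page(ToyPage *p, uint8_t fill)
{
    memset(p->data, fill, TOY_PAGE_SIZE);
    p->checksum = toy_checksum(p->data);
}

/* A crash mid-write: only the first half of the new contents hit the disk,
 * and the stored checksum still describes the old contents. */
static void toy_tear_page(ToyPage *p, uint8_t new_fill)
{
    memset(p->data, new_fill, TOY_PAGE_SIZE / 2);
}

/* A read that verifies the checksum; false stands in for the error. */
static bool toy_read_verified(const ToyPage *disk, int blk)
{
    assert(blk >= 0 && blk < TOY_NPAGES);
    return toy_checksum(disk[blk].data) == disk[blk].checksum;
}

/* Replaying a full-page image overwrites the block without reading it. */
static void toy_replay_fpi(ToyPage *disk, int blk, uint8_t fill)
{
    toy_write_page(&disk[blk], fill);
}

/* Vacuum replay for block `target`: it also reads every block after
 * `last_vacuumed` and before `target`. Those are the extra reads. */
static bool toy_replay_vacuum(const ToyPage *disk, int last_vacuumed, int target)
{
    for (int blk = last_vacuumed + 1; blk < target; blk++)
        if (!toy_read_verified(disk, blk))
            return false;
    return true;
}
```

With block 1 torn, toy_replay_vacuum(disk, 0, 2) fails because of the extra read of block 1, even though block 2 itself is fine. After toy_replay_fpi(disk, 1, ...) the same call succeeds.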

> Now, you could argue that that shouldn't be the case because we're only
> entering that codepath once STANDBY_SNAPSHOT_READY and you might be
> right...

I don't think that saves us. standbyState can be STANDBY_SNAPSHOT_READY
before we reach consistency. Adding a check for reachedConsistency,
though, ought to fix it.
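
Roughly what I mean, as a standalone sketch rather than a patch. The enum and the function name are hypothetical stand-ins, not the real declarations; the only point is that the extra read is gated on both conditions instead of on the snapshot state alone.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the hot standby state discussed above. */
typedef enum {
    TOY_STANDBY_DISABLED,
    TOY_STANDBY_INITIALIZED,
    TOY_STANDBY_SNAPSHOT_PENDING,
    TOY_STANDBY_SNAPSHOT_READY
} ToyStandbyState;

/* Allow the extra read only when the snapshot is ready AND recovery has
 * reached consistency; before that point a torn page may still be on disk. */
static bool toy_extra_read_allowed(ToyStandbyState state, bool reached_consistency)
{
    return state == TOY_STANDBY_SNAPSHOT_READY && reached_consistency;
}
```

The case that matters is (TOY_STANDBY_SNAPSHOT_READY, false): the snapshot-only test would let the read through, while the combined test skips it.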

- Heikki
