Re: Corruption during WAL replay
From | Masahiko Sawada |
---|---|
Subject | Re: Corruption during WAL replay |
Date | |
Msg-id | CA+fd4k7mQrgbbem-7OT6pAMJV-SFconxOjU6V80dWXuyzovmLg@mail.gmail.com |
In reply to | Re: Corruption during WAL replay (Teja Mupparti <tejeswarm@hotmail.com>) |
List | pgsql-hackers |
On Wed, 15 Apr 2020 at 04:04, Teja Mupparti <tejeswarm@hotmail.com> wrote:
>
> Thanks Kyotaro and Masahiko for the feedback. I think there is a consensus on the critical-section around truncate, but I just want to emphasize the need for reversing the order of the dropping the buffers and the truncation.
>
> Repro details (when full page write = off)
>
> 1) Page on disk has empty LP 1, Insert into page LP 1
> 2) checkpoint START (Recovery REDO eventually starts here)
> 3) Delete all rows on the page (page is empty now)
> 4) Autovacuum kicks in and truncates the pages
>    DropRelFileNodeBuffers - Dirty page NOT written, LP 1 on disk still empty
> 5) Checkpoint completes
> 6) Crash
> 7) smgrtruncate - Not reached (this is where we do the physical truncate)
>
> Now the crash-recovery starts
>
> Delete-log-replay (above step-3) reads page with empty LP 1 and the delete fails with PANIC (old page on disk with no insert)
>

I agree that when replaying the deletion of (3) the page LP 1 is empty, but does that replay really fail with PANIC? I guess that we record that page into invalid_page_tab but don't raise a PANIC in this case.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
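
For readers following the thread, the seven-step repro quoted above can be traced with a small toy model. This is only an illustrative sketch, not PostgreSQL code: the names run_repro and replay, the dict used as a "page", and the list used as "WAL" are all hypothetical. It assumes full page writes are off, so no page image is logged. It shows only which on-disk state the replayed delete record sees, and takes no position on whether real recovery raises a PANIC or records the page in invalid_page_tab.

```python
# Toy model of the repro steps; NOT PostgreSQL code. All names are hypothetical.

def run_repro():
    """Return (page as left on disk at the crash, WAL records replayed by recovery)."""
    disk_page = {1: None}            # LP 1 is empty on disk
    wal = []

    # 1) Insert into LP 1: only the in-memory (dirty) copy changes.
    buffered = dict(disk_page)
    buffered[1] = "row"
    wal.append(("insert", 1))

    # 2) Checkpoint START: recovery REDO will begin from this WAL position.
    redo_start = len(wal)

    # 3) Delete all rows on the page.
    buffered[1] = None
    wal.append(("delete", 1))

    # 4) Buffers are dropped before the physical truncate: the dirty copy
    #    is discarded without ever being written to disk.
    buffered = None

    # 5) Checkpoint completes (nothing left to flush for this page).
    # 6) Crash.
    # 7) The physical truncate is never reached, so disk_page is unchanged.
    return disk_page, wal[redo_start:]


def replay(disk_page, records):
    """For each record, report whether its target line pointer holds a row on disk."""
    return [(rec, disk_page.get(rec[1]) is not None) for rec in records]


if __name__ == "__main__":
    page, to_replay = run_repro()
    for rec, found in replay(page, to_replay):
        print(rec, "target row present on disk:", found)
```

In this model the insert record lies before the REDO start and is never replayed, so the delete record is applied to a page whose LP 1 was never filled on disk, which is the situation both messages describe.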