Re: Corruption during WAL replay
От | Masahiko Sawada |
---|---|
Тема | Re: Corruption during WAL replay |
Дата | |
Msg-id | CA+fd4k43sPrf3sRZmUHpDErHPfkbfa3rhQ9Y98jTCUzj9Y1VaQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Corruption during WAL replay (Teja Mupparti <tejeswarm@hotmail.com>) |
Ответы |
Re: Corruption during WAL replay
(Andres Freund <andres@anarazel.de>)
|
Список | pgsql-hackers |
On Sat, 11 Apr 2020 at 09:00, Teja Mupparti <tejeswarm@hotmail.com> wrote: > > Thanks Andres and Kyotaro for the quick review. I have fixed the typos and also included the critical section (emulatedit with try-catch block since palloc()s are causing issues in the truncate code). This time I used git format-patch. > I briefly looked at the latest patch but I'm not sure it's the right thing here to use PG_TRY/PG_CATCH to report the PANIC error. For example, with the following code you changed, we will always end up with emitting a PANIC "failed to truncate the relation" regardless of the actual cause of the error. + PG_CATCH(); + { + ereport(PANIC, (errcode(ERRCODE_INTERNAL_ERROR), + errmsg("failed to truncate the relation"))); + } + PG_END_TRY(); And the comments of RelationTruncate() mentions: /* * We WAL-log the truncation before actually truncating, which means * trouble if the truncation fails. If we then crash, the WAL replay * likely isn't going to succeed in the truncation either, and cause a * PANIC. It's tempting to put a critical section here, but that cure * would be worse than the disease. It would turn a usually harmless * failure to truncate, that might spell trouble at WAL replay, into a * certain PANIC. */ As a second idea, I wonder if we can defer truncation until commit time like smgrDoPendingDeletes mechanism. The sequence would be: At RelationTruncate(), 1. WAL logging. 2. Remember buffers to be dropped. At CommitTransaction(), 3. Revisit the remembered buffers to check if the buffer still has table data that needs to be truncated. 4-a, If it has, we mark it as IO_IN_PROGRESS. 4-b, If it already has different table data, ignore it. 5, Truncate physical files. 6, Mark the buffer we marked at #4-a as invalid. If an error occurs between #3 and #6 or in abort case, we revert all IO_IN_PROGRESS flags on the buffers. In the above idea, remembering all buffers having to-be-truncated table at RelationTruncate(), we reduce the time for checking buffers at the commit time. Since we acquire AccessExclusiveLock the number of buffers having to-be-truncated table's data never increases. A downside would be that since we can truncate multiple relations we need to remember all buffers of each truncated relations, which is up to (sizeof(int) * NBuffers) in total. Regards, -- Masahiko Sawada http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
В списке pgsql-hackers по дате отправления:
Следующее
От: Kyotaro HoriguchiДата:
Сообщение: Re: Race condition in SyncRepGetSyncStandbysPriority