Re: BUG #17928: Standby fails to decode WAL on termination of primary
От | Noah Misch |
---|---|
Тема | Re: BUG #17928: Standby fails to decode WAL on termination of primary |
Дата | |
Msg-id | 20230813033034.GD2326466@rfd.leadboat.com обсуждение исходный текст |
Ответ на | Re: BUG #17928: Standby fails to decode WAL on termination of primary (Thomas Munro <thomas.munro@gmail.com>) |
Ответы |
Re: BUG #17928: Standby fails to decode WAL on termination of primary
|
Список | pgsql-bugs |
On Sun, Aug 13, 2023 at 03:12:34PM +1200, Thomas Munro wrote: > On Sun, Aug 13, 2023 at 9:13 AM Noah Misch <noah@leadboat.com> wrote: > > Any user could call pg_logical_emit_message() to silently terminate the WAL > > stream, which is far worse than the original bug. So far, I'm seeing one way > > to clearly fix $SUBJECT without that harm. When a record header spans a page > > boundary, read the next page to reassemble the header. If > > !ValidXLogRecordHeader() (invalid xl_rmid or xl_prev), treat as end of WAL. > > Otherwise, read the whole record in chunks, calculating CRC. If CRC is > > invalid, treat as end of WAL. Otherwise, ereport(FATAL) for the oversized > > record. A side benefit would be avoiding useless large allocations (1GB > > backend, 4GB frontend) as discussed upthread. May as well do the xl_rmid and > > xl_prev checks in all branches, to avoid needless XLogRecordMaxSize-1 > > allocations. Thoughts? For invalid-length records in v16+, since every such > > record is end-of-wal or corruption, those versions could skip the CRC. > > That sounds quite strong. But... why couldn't the existing > xlp_rem_len cross-check protect us from this failure mode? If we > could defer the allocation until after that check (and the usual > ValidXLogRecordHeader() check), I think we'd know that we're really > looking at a size that was actually written in both pages (which must > also have survived xlp_pageaddr check), no? Hmm, I think that is right. A coincidental match of the 32-bit CRC is more probable than having all those fields appear valid by coincidence, especially xlp_pageaddr.
В списке pgsql-bugs по дате отправления: