Fail hard if xlogreader.c fails on out-of-memory
От | Michael Paquier |
---|---|
Тема | Fail hard if xlogreader.c fails on out-of-memory |
Дата | |
Msg-id | ZRKKdI5-RRlta3aF@paquier.xyz обсуждение исходный текст |
Ответы |
Re: Fail hard if xlogreader.c fails on out-of-memory
|
Список | pgsql-hackers |
Hi all, (Thomas in CC.) Now that becfbdd6c1c9 has improved the situation to detect the difference between out-of-memory and invalid WAL data in WAL, I guess that it is time to tackle the problem of what we should do when reading WAL records bit fail on out-of-memory. To summarize, currently the WAL reader APIs fail the same way if we detect some incorrect WAL record or if a memory allocation fails: an error is generated and returned back to the caller to consume. For WAL replay, not being able to make the difference between an OOM and the end-of-wal is a problem in some cases. For example, in crash recovery, failing an internal allocation will be detected as the end-of-wal, causing recovery to stop prematurely. In the worst cases, this silently corrupts clusters because not all the records generated in the local pg_wal/ have been replayed. Oops. When in standby mode, things are a bit better, because we'd just loop and wait for the next record. But, even in this case, if the startup process does a crash recovery while standby is set up, we may finish by attempting recovery from a different source than the local pg_wal/. Not strictly critical, but less optimal in some cases as we could switch to archive recovery earlier than necessary. In a different thread, I have proposed to extend the WAL reader facility so as an error code is returned to make the difference between an OOM or the end of WAL with an incorrect record: https://www.postgresql.org/message-id/ZRJ-p1dLUY0uoChc%40paquier.xyz However this requires some ABI changes, so that's not backpatchable. This leaves out what we can do for the existing back-branches, and one option is to do the simplest thing I can think of: if an allocation fails, just fail *hard*. The allocations of the WAL reader rely on palloc_extended(), so I'd like to suggest that we switch to palloc() instead. If we do so, an ERROR is promoted to a FATAL during WAL replay, which makes sure that we will never stop recovery earlier than we should, FATAL-ing before things go wrong. Note that the WAL prefetching already relies on a palloc() on HEAD in XLogReadRecordAlloc(), which would fail hard the same way on OOM. So, attached is a proposal of patch to do something down to 12. Thoughts and/or comments are welcome. -- Michael
Вложения
В списке pgsql-hackers по дате отправления: