Re: warning message in standby
From | Heikki Linnakangas |
---|---|
Subject | Re: warning message in standby |
Date | |
Msg-id | 4C122CDB.70601@enterprisedb.com |
In reply to | Re: warning message in standby (Fujii Masao <masao.fujii@gmail.com>) |
Responses | Re: warning message in standby |
List | pgsql-hackers |
On 11/06/10 07:18, Fujii Masao wrote:
> On Fri, Jun 11, 2010 at 1:01 AM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com> wrote:
>> We're talking about a corrupt record (incorrect CRC, incorrect backlink
>> etc.), not errors within redo functions. During crash recovery, a corrupt
>> record means you've reached end of WAL. In standby mode, when streaming WAL
>> from master, that shouldn't happen, and it's not clear what to do if it
>> does. PANIC is not a good idea, at least if the server uses hot standby,
>> because that only makes the situation worse from availability point of view.
>> So we log the error as a WARNING, and keep retrying. It's unlikely that the
>> problem will just go away, but we keep retrying anyway in the hope that it
>> does. However, it seems that we're too aggressive with the retries.
>
> Right. The attached patch calms down the retries: if we found an invalid
> record while streaming WAL from master, we sleep for 5 seconds (needs to
> be reduced?) before retrying to replay the record which is in the same
> location where the invalid one was found. Comments?

Hmm, right now it doesn't even reconnect when it sees a corrupt record
streamed from the master. It's really pointless to retry in that case,
reapplying the exact same piece of WAL surely won't work. I think it
should disconnect, and then retry reading from archive and pg_xlog, and
then retry streaming again. That's pretty hopeless too, but it's at
least theoretically possible that something went wrong in the
transmission and the file in the archive is fine.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
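
The retry order proposed in the reply above (disconnect, reread from the archive and pg_xlog, then stream again) can be pictured as a small state machine. The following is only a rough sketch under stated assumptions: `WalSource`, `RetryState` and `on_invalid_record` are hypothetical names invented for illustration and are not PostgreSQL identifiers, and placing the 5-second sleep from the quoted patch at the end of a full cycle is an assumption, not something the thread settles.

```c
#include <stdbool.h>

/* Hypothetical names for illustration only; not PostgreSQL's own. */
typedef enum
{
    SRC_STREAM,     /* WAL streamed from the master */
    SRC_ARCHIVE,    /* WAL restored from the archive */
    SRC_PG_XLOG     /* WAL files already present in pg_xlog */
} WalSource;

typedef struct
{
    WalSource   source;         /* where the next read attempt goes */
    bool        disconnect;     /* should the streaming connection be closed? */
    int         sleep_seconds;  /* pause before the next attempt */
} RetryState;

/*
 * Decide what to try next after an invalid record was read from
 * state->source.
 */
static void
on_invalid_record(RetryState *state)
{
    state->disconnect = false;
    state->sleep_seconds = 0;

    switch (state->source)
    {
        case SRC_STREAM:
            /* Replaying the same streamed bytes cannot help: drop the link. */
            state->disconnect = true;
            state->source = SRC_ARCHIVE;
            break;
        case SRC_ARCHIVE:
            state->source = SRC_PG_XLOG;
            break;
        case SRC_PG_XLOG:
            /* Every source tried: pause (assumed 5 s), then stream again. */
            state->sleep_seconds = 5;
            state->source = SRC_STREAM;
            break;
    }
}
```

A caller would invoke `on_invalid_record` each time record validation fails, close the connection if `disconnect` is set, sleep for `sleep_seconds`, and then read from the new `source`. Sleeping only after all three sources have failed avoids a busy loop without delaying the archive fallback.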