Re: Endless recovery
From | Heikki Linnakangas |
---|---|
Subject | Re: Endless recovery |
Date | |
Msg-id | 47B014B0.3010400@enterprisedb.com |
In response to | Endless recovery (Hans-Juergen Schoenig <postgres@cybertec.at>) |
Responses | Re: Endless recovery |
List | pgsql-patches |
Hans-Juergen Schoenig wrote:
> Last week we have seen a problem with some horribly configured machine.
> The disk filled up (bad FSM ;) ) and once this happened the sysadmin killed the
> system (-9).
> After two days PostgreSQL has still not started up and they tried to restart it
> again and again making sure that the consistency check was started over and over
> again (thus causing more and more downtime).
> From the admin point of view there was no way to find out whether the machine
> was actually dead or still recovering.
>
> Here is a small patch which issues a log message indicating that the recovery
> process can take ages.
> Maybe this can prevent some admins from interrupting the recovery process.

Wait, are you saying that the time was spent in the rm_cleanup phase? That sounds unbelievable. Surely the time was spent in the redo phase, no?

> In our case, the recovery process took 3.5 days !!

That's a ridiculously long time. Was this a normal recovery, not a PITR archive recovery? Any idea why the recovery took so long? Given the max. checkpoint timeout of 1h, I would expect that the recovery would take a maximum of a few hours even with an extremely write-heavy workload.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com