Re: unable to fail over to warm standby server
From | Fujii Masao |
---|---|
Subject | Re: unable to fail over to warm standby server |
Date | |
Msg-id | 3f0b79eb1001290802p56af2093t10a77b82f36bc5bf@mail.gmail.com |
In reply to | unable to fail over to warm standby server (Mason Hale <mason@onespot.com>) |
Responses |
Re: unable to fail over to warm standby server
Re: unable to fail over to warm standby server |
List | pgsql-bugs |
On Fri, Jan 29, 2010 at 11:49 PM, Mason Hale <mason@onespot.com> wrote:
> While I did not remove the trigger file, I did rename recovery.conf to
> recovery.conf.old.
> That file contained the recovery_command configuration that identified the
> trigger file. So that rename should have eliminated the problem. But it
> didn't. Even after making this change and taking the trigger file out of the
> equation my database failed to come online.

Renaming recovery.conf doesn't resolve the problem at all. Instead, the
sysadmin only had to remove the trigger file that had the wrong permissions
and then restart postgres.

>> 9.) The server did not come up (again). This time the contents of the
>> new postgresql.log file were:
>>
>> [postgres@prod-db-2 pg_log]$ tail -n 100 postgresql-2010-01-18_211132.log
>> 2010-01-18 21:11:32 UTC ()LOG: database system was interrupted while in recovery at log time 2010-01-18 20:10:59 UTC
>> 2010-01-18 21:11:32 UTC ()HINT: If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target.
>> 2010-01-18 21:11:32 UTC ()LOG: could not open file "pg_xlog/0000000200003C82000000A3" (log file 15490, segment 163): No such file or directory
>> 2010-01-18 21:11:32 UTC ()LOG: invalid primary checkpoint record
>> 2010-01-18 21:11:32 UTC ()LOG: could not open file "pg_xlog/0000000200003C8200000049" (log file 15490, segment 73): No such file or directory
>> 2010-01-18 21:11:32 UTC ()LOG: invalid secondary checkpoint record
>> 2010-01-18 21:11:32 UTC ()PANIC: could not locate a valid checkpoint record
>> 2010-01-18 21:11:32 UTC ()LOG: startup process (PID 9328) was terminated by signal 6: Aborted
>> 2010-01-18 21:11:32 UTC ()LOG: aborting startup due to startup process failure

You seem to be focusing on the problem above. I think it happened because
recovery.conf was no longer in place, so no restore_command was given.
In fact, the WAL file (e.g., pg_xlog/0000000200003C82000000A3) required for
recovery could not be restored from the archive because restore_command was
not supplied, so recovery failed. If the sysadmin had left recovery.conf in
place and removed only the trigger file, pg_standby in restore_command would
have restored all the WAL files required for recovery, and recovery would
have proceeded normally.

Hope this helps.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
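
For illustration only, here is a minimal sketch of the kind of recovery.conf the message refers to, where restore_command runs pg_standby and names the trigger file. The archive directory and trigger file paths are hypothetical placeholders, not values taken from this thread; the original poster's actual settings are not shown in the message.

```conf
# recovery.conf on the warm standby (sketch; both paths below are hypothetical)
#
# pg_standby waits for each WAL segment (%f) to appear in the archive
# directory and copies it to %p. It stops waiting once the file given
# with -t (the trigger file) exists, which ends recovery and lets the
# standby come online. %r lets pg_standby clean up segments that are no
# longer needed for a restart.
restore_command = 'pg_standby -t /path/to/trigger_file /path/to/archive %f %p %r'
```

With a file like this left in place, a restart of the standby has a restore_command to fetch the WAL segments it needs, which is the situation described in the reply above. Without it, the server looks for those segments only in pg_xlog.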