Re: 9.2 recovery/startup problems
From | Jeff Janes |
---|---|
Subject | Re: 9.2 recovery/startup problems |
Date | |
Msg-id | CAMkU=1zAern2uby+fYveXrO-HY3cfS_uyv8SmBCMNipBXSOiUg@mail.gmail.com |
In reply to | Re: 9.2 recovery/startup problems (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: 9.2 recovery/startup problems |
List | pgsql-hackers |
On Tue, Dec 2, 2014 at 7:41 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Nov 26, 2014 at 7:13 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
> > If I do a pg_ctl stop -mf, then both files go away. If I do a pg_ctl stop
> > -mi, then neither goes away. It is only with the /sbin/reboot that I get
> > the fatal combination of _init being gone but the other still present.
>
> Eh? That sounds wonky.
>
> I mean, reboot normally kills processes with SIGTERM or SIGKILL, in
> which case I'd expect the outcome to match what you get with pg_ctl
> stop -mf or pg_ctl stop -mi. The only way I can see that you'd get a
> different behavior is if you did a hard reboot (like echo b >
> /proc/sysrq-trigger); if that changes things, then we might have a
> missing-fsync bug. How is that reboot managing to leave the main fork
> behind while losing the init fork?

During abort processing after getting a SIGTERM, the backend truncates 59288 to zero size and unlinks all the other files (including 59288_init). The actual removal of 59288 itself is left until the next checkpoint. So if you SIGTERM the backend and then take down the server uncleanly before that checkpoint completes, you are left with just 59288 and no 59288_init.
Here is the strace:
open("base/16416/59288", O_RDWR) = 8
ftruncate(8, 0) = 0
close(8) = 0
unlink("base/16416/59288.1") = -1 ENOENT (No such file or directory)
unlink("base/16416/59288_fsm") = -1 ENOENT (No such file or directory)
unlink("base/16416/59288_vm") = -1 ENOENT (No such file or directory)
unlink("base/16416/59288_init") = 0
unlink("base/16416/59288_init.1") = -1 ENOENT (No such file or directory)
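
To make the end state concrete, here is a minimal sketch that replays the traced calls against a scratch directory. It is Python, not PostgreSQL's own code, and it only mimics the file operations shown in the strace. The function names `simulate_abort_cleanup` and `demo` are made up for illustration, and the starting state (just a main fork and an init fork, each one 8192-byte block) is an assumption.

```python
import os
import tempfile


def simulate_abort_cleanup(dbdir, relfilenode="59288"):
    """Replay the traced syscalls in dbdir and return the files left behind."""
    main = os.path.join(dbdir, relfilenode)

    # Truncate the main fork to zero length but leave the file in place;
    # per the trace, its actual removal is deferred to the next checkpoint.
    fd = os.open(main, os.O_RDWR)
    os.ftruncate(fd, 0)
    os.close(fd)

    # Unlink the extra segment and the other forks, tolerating ENOENT
    # exactly as the traced unlink() calls do.
    for suffix in (".1", "_fsm", "_vm", "_init", "_init.1"):
        try:
            os.unlink(main + suffix)
        except FileNotFoundError:
            pass

    return sorted(os.listdir(dbdir))


def demo():
    with tempfile.TemporaryDirectory() as dbdir:
        # Assumed starting state: main fork plus init fork only.
        for name in ("59288", "59288_init"):
            with open(os.path.join(dbdir, name), "wb") as f:
                f.write(b"\0" * 8192)
        left = simulate_abort_cleanup(dbdir)
        size = os.path.getsize(os.path.join(dbdir, left[0]))
        return left, size


if __name__ == "__main__":
    print(demo())
```

If the server dies at this point, before the checkpoint that would have removed the main fork, what survives on disk is the zero-length 59288 with no 59288_init next to it.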
Cheers,
Jeff