Re: recovery is stuck when children are not processing SIGQUIT from previous crash
From | Tom Lane |
---|---|
Subject | Re: recovery is stuck when children are not processing SIGQUIT from previous crash |
Date | |
Msg-id | 21890.1253714661@sss.pgh.pa.us |
In reply to | recovery is stuck when children are not processing SIGQUIT from previous crash (Peter Eisentraut <peter_e@gmx.net>) |
Responses | Re: recovery is stuck when children are not processing SIGQUIT from previous crash |
List | pgsql-admin |
Peter Eisentraut <peter_e@gmx.net> writes:
> I have observed the following situation a few times now (weeks or months
> apart), most recently with 8.3.7. Some postgres child process crashes.
> The postmaster notices and sends SIGQUIT to all other children. Once
> all other children have exited, it would enter recovery. But for some
> reason, some children are not processing the SIGQUIT signal and are
> basically just stuck. That means the whole database system is then
> stuck and won't continue without manual intervention. If I go in
> manually and SIGKILL the offending processes, everything proceeds
> normally, recovery finishes, and the system is up again.

We need some investigation into why that is happening.

> I haven't had the chance yet to analyze why the SIGQUIT signals are
> getting stuck. Be that as it may, it appears there are no provisions
> for this case. I couldn't find any documentation or previous reports on
> this sort of thing. One might imagine a feature where the postmaster
> resorts to throwing SIGKILLs around after a while, similar to how init
> scripts are sometimes set up.

I'd prefer not to go there, at least not without a demonstration that
this will solve a bug that's unsolvable otherwise. If a child is really
stuck in a state that doesn't accept SIGQUIT, it probably won't accept
SIGKILL either (eg, uninterruptable disk wait). Or maybe we just have
some errant code that is blocking SIGQUIT; but that's a garden variety
bug IMO, not something that needs major new postmaster logic to work
around.

regards, tom lane
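
For readers unfamiliar with the two cases discussed above, here is a minimal Python sketch (not PostgreSQL code) of the "SIGQUIT first, SIGKILL after a while" escalation that init scripts sometimes use, run against a child that has blocked SIGQUIT. The names `CHILD_CODE`, `stop_child`, `demo`, and the `grace_seconds` parameter are hypothetical and chosen only for illustration. It assumes a POSIX system, and it covers only the blocked-signal case: a process in an uninterruptible disk wait would not respond to SIGKILL either.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for a stuck backend: it blocks SIGQUIT and then sleeps,
# so the signal stays pending and is never acted upon.
CHILD_CODE = (
    "import signal, time\n"
    "signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGQUIT})\n"
    "print('ready', flush=True)\n"
    "time.sleep(60)\n"
)


def stop_child(proc, grace_seconds=1.0):
    """Send SIGQUIT, wait up to grace_seconds, then fall back to SIGKILL.

    Returns the name of the signal that terminated the child, or None if
    the child exited on its own with a normal exit status.
    """
    proc.send_signal(signal.SIGQUIT)
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        # SIGQUIT was not processed in time; SIGKILL cannot be blocked or caught.
        proc.kill()
        proc.wait()
    if proc.returncode is not None and proc.returncode < 0:
        # On POSIX, a negative return code is the negated terminating signal.
        return signal.Signals(-proc.returncode).name
    return None


def demo():
    """Start the signal-blocking child and report which signal ended it."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD_CODE],
                            stdout=subprocess.PIPE)
    proc.stdout.readline()  # wait until the child has blocked SIGQUIT
    result = stop_child(proc, grace_seconds=0.5)
    proc.stdout.close()
    return result


if __name__ == "__main__":
    print(demo())
```

Because the child blocks SIGQUIT, the grace period expires and the fallback SIGKILL is what ends it. Removing the `pthread_sigmask` line from `CHILD_CODE` lets the default SIGQUIT action terminate the child before the fallback is needed.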