On Fri, Jul 20, 2018 at 1:54 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
On 19/07/18 23:13, Andres Freund wrote: > On 2018-07-19 14:54:52 -0500, Nico Williams wrote: >> On Thu, Jul 19, 2018 at 12:20:53PM -0700, Andres Freund wrote: >>> On 2018-07-19 11:57:25 +0300, Heikki Linnakangas wrote: >>>> Ugh. Yeah, in wal_quickdie, and other aux process *_quickdie() handlers, I >>>> agree we should just _exit(2). All we want to do is to exit the process >>>> immediately. >>> >>> Indeed. >>> >>>> bgworker_quickdie() makes some effort to block SIGQUIT during the exit() >>>> processing, but that doesn't solve the whole problem. The process could've >>>> been in the middle of a malloc/free when the signal arrived, for example. >>>> exit() is simply not safe to call from a signal handler. >>> >>> Yea. I really don't understand why we have't been able to agree on this >>> for *years* now. >> >> I mean, only calling async-signal-safe functions from signal handlers is >> a critical POSIX requirement, and exit(3) is NOT async-signal-safe. > > There's a few standard requirements that aren't particularly relevant in > practice (including some functions not being listed as signal > safe). Just arguing from that isn't really helpful. But there's plenty > hard evidence, including a few messages upthread!, of it being > practically problematic. Just looking at the reasoning at why exit (and > malloc) aren't signal safe however, makes it clear that this particular > restriction should be adhered to, however.
ISTM that no-one has any great ideas on what to do about the ereport() in quickdie(). But I think we have consensus on replacing the exit(2) calls with _exit(2). If we do just that, it would be better than the status quo, even if it doesn't completely fix the problem. This would prevent the case that Asim reported, for starters.
Any objections to the attached?
With _exit(), I think we wouldn't really need the "PG_SETMASK(&BlockSig);" calls in the aux process *_quickdie handlers that don't do anything else than call _exit(). But I didn't dare to remove them yet.
Seems there is opportunity to actuall consolidate all those quickdies as well. As I see no difference between the auxiliary process quickdie implementations. Only backend `quickdie()` has special code for ereport and all.