Discussion: kevent latch paths don't handle postmaster death well


kevent latch paths don't handle postmaster death well

From
Tom Lane
Date:
I noticed that chipmunk failed [1] with a rather interesting log:

2020-10-14 08:57:01.661 EEST [27048:6] pg_regress/prepared_xacts LOG:  statement: UPDATE pxtest1 SET foobar = 'bbb' WHERE foobar = 'aaa';
2020-10-14 08:57:01.721 EEST [27048:7] pg_regress/prepared_xacts LOG:  statement: SELECT * FROM pxtest1;
2020-10-14 08:57:01.823 EEST [27048:8] pg_regress/prepared_xacts FATAL:  postmaster exited during a parallel transaction
TRAP: FailedAssertion("entry->trans == NULL", File: "pgstat.c", Line: 909, PID: 27048)
2020-10-14 08:57:01.861 EEST [27051:1] ERROR:  could not attach to dynamic shared area
2020-10-14 08:57:01.861 EEST [27051:2] STATEMENT:  SELECT * FROM pxtest1;

I do not know what happened to the postmaster, but seeing that chipmunk
is a very small machine running a pretty old Linux kernel, it's plausible
to guess that the OOM killer decided to pick on the postmaster.  (I wonder
whether Heikki has taken any steps to prevent that on that machine.)
My concern today is not that the postmaster died, but that the subsequent
response was an Assert failure.  Not good.

I tried to reproduce this by dint of manually kill -9'ing the postmaster
during the select_parallel regression test.  Figuring that a slower
machine would give me a better chance of success at that, I used an old
Mac laptop that wasn't doing anything else.  I did not succeed yet, but
what I did reproducibly get (in five out of five tries) was a leader
process that was permanently stuck on a latch, waiting for its worker(s)
to die.  Needless to say, there were no workers and never would be.
The stack trace varied a bit, but here's an interesting case:

(gdb) bt
#0  0x90b267ac in kevent ()
#1  0x003b76e8 in WaitEventSetWaitBlock [inlined] () at latch.c:1506
#2  0x003b76e8 in WaitEventSetWait (set=0x10003a8, timeout=-1, occurred_events=<value temporarily unavailable, due to optimizations>, nevents=1, wait_event_info=<value temporarily unavailable, due to optimizations>) at latch.c:1309
#3  0x003b814c in WaitLatch (latch=<value temporarily unavailable, due to optimizations>, wakeEvents=17, timeout=-1, wait_event_info=134217729) at latch.c:411
#4  0x0032e77c in WaitForBackgroundWorkerShutdown (handle=0x10280a4) at bgworker.c:1139
#5  0x000bc6fc in WaitForParallelWorkersToExit (pcxt=0xc6f84c) at parallel.c:876
#6  0x000bc99c in DestroyParallelContext (pcxt=0xc6f84c) at parallel.c:958
#7  0x000bdc48 in dlist_is_empty [inlined] () at lib/ilist.h:1231
#8  0x000bdc48 in AtEOXact_Parallel (isCommit=4 '\004') at parallel.c:1224
#9  0x000ccf24 in AbortTransaction () at xact.c:2702
#10 0x000cd534 in AbortOutOfAnyTransaction () at xact.c:4623
#11 0x00550b54 in ShutdownPostgres (code=<value temporarily unavailable, due to optimizations>, arg=<value temporarily unavailable, due to optimizations>) at postinit.c:1195
#12 0x003b5ff0 in shmem_exit (code=1) at ipc.c:239
#13 0x003b6168 in proc_exit_prepare (code=1) at ipc.c:194
#14 0x003b6240 in proc_exit (code=1) at ipc.c:107
#15 0x00541f6c in errfinish (filename=<value temporarily unavailable, due to optimizations>, lineno=<value temporarily unavailable, due to optimizations>, funcname=0x5bd55c "WaitForParallelWorkersToExit") at elog.c:578
#16 0x000bc748 in WaitForParallelWorkersToExit (pcxt=0xc6f84c) at parallel.c:885
#17 0x000bc7c8 in ReinitializeParallelDSM (pcxt=0xc6f84c) at parallel.c:471
#18 0x0021e468 in ExecParallelReinitialize (planstate=0x109a9a0, pei=0xc3a09c, sendParams=0x0) at execParallel.c:906
#19 0x00239f6c in ExecGather (pstate=0x109a848) at nodeGather.c:177
#20 0x00221cd8 in ExecProcNodeInstr (node=0x109a848) at execProcnode.c:466
#21 0x0024f7a8 in ExecNestLoop (pstate=0x1099700) at executor/executor.h:244
#22 0x00221cd8 in ExecProcNodeInstr (node=0x1099700) at execProcnode.c:466
#23 0x0022edbc in ExecProcNode [inlined] () at executor.h:244
#24 0x0022edbc in fetch_input_tuple (aggstate=0x10993d0) at executor.h:589

We appear to have already realized that the postmaster died, since we're
inside proc_exit.  WaitForBackgroundWorkerShutdown is doing this:

        rc = WaitLatch(MyLatch,
                       WL_LATCH_SET | WL_POSTMASTER_DEATH, 0,
                       WAIT_EVENT_BGWORKER_SHUTDOWN);

which one would certainly hope would not block at all if the postmaster
is already dead, yet it's doing so.  I guess that the kevent stuff is
failing to handle the case of another WaitLatch call after the postmaster
is already known dead.
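
For what it's worth, the pipe-based paths are naturally level-triggered:
nothing is ever written to the postmaster's "alive" pipe, so once the
postmaster closes the write end by exiting, every subsequent poll of the
read end reports readable/EOF again.  A minimal standalone sketch of that
check -- illustrative only, using a generic fd rather than latch.c's real
plumbing:

/*
 * Illustrative only, not latch.c code: check whether the write end of a
 * never-written-to pipe has gone away.  Because the postmaster never
 * writes to its "alive" pipe, readability on the read end means EOF, and
 * poll() keeps reporting it on every call -- a level, not an edge.
 */
#include <poll.h>
#include <stdbool.h>

static bool
death_pipe_signalled(int watch_fd)
{
    struct pollfd pfd;

    pfd.fd = watch_fd;          /* read end of the postmaster-alive pipe */
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* zero timeout: just ask whether the fd is already readable/EOF */
    return poll(&pfd, 1, 0) > 0 &&
        (pfd.revents & (POLLIN | POLLHUP | POLLERR)) != 0;
}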

In case it helps, I checked the contents of the WaitEventSet:

(gdb) p *LatchWaitSet
$2 = {
  nevents = 2,
  nevents_space = 2,
  events = 0x10003cc,
  latch = 0xa1b673c,
  latch_pos = 0,
  exit_on_postmaster_death = 0 '\0',
  kqueue_fd = 12,
  kqueue_ret_events = 0x10003ec,
  report_postmaster_not_running = 0 '\0'
}
(gdb) p LatchWaitSet->events[0]
$3 = {
  pos = 0,
  events = 1,
  fd = 10,
  user_data = 0x0
}
(gdb) p LatchWaitSet->events[1]
$4 = {
  pos = 1,
  events = 16,
  fd = 3,
  user_data = 0x0
}


I thought possibly this was an ancient-macOS problem, but I've now
reproduced substantially the same behavior on an up-to-date Catalina
machine (10.15.7), so I do not think we can write it off that way.

            regards, tom lane

[1] https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=chipmunk&dt=2020-10-14%2000%3A04%3A08



Re: kevent latch paths don't handle postmaster death well

From
Thomas Munro
Date:
On Thu, Oct 15, 2020 at 7:58 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> We appear to have already realized that the postmaster died, since we're
> inside proc_exit.  WaitForBackgroundWorkerShutdown is doing this:
>
>         rc = WaitLatch(MyLatch,
>                        WL_LATCH_SET | WL_POSTMASTER_DEATH, 0,
>                        WAIT_EVENT_BGWORKER_SHUTDOWN);
>
> which one would certainly hope would not block at all if the postmaster
> is already dead, yet it's doing so.  I guess that the kevent stuff is
> failing to handle the case of another WaitLatch call after the postmaster
> is already known dead.

The process exit event is like an 'edge', not a 'level'... hmm.  It
might be enough to set report_postmaster_not_running = true the first
time it tells us, so that if we try to wait again we'll treat it like a
level.  I will look into it later today.
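
Roughly the shape of that idea, as a self-contained sketch rather than a
patch against latch.c (hypothetical names, not the patch that was later
attached; the point is that NOTE_EXIT is delivered only once per kqueue,
so a sticky flag is what lets later waits still see the death):

/*
 * Standalone illustration of the edge-to-level idea; not the actual patch.
 * EVFILT_PROC/NOTE_EXIT is reported only once by a given kqueue, so
 * remember it in a flag and report "dead" on every later wait.
 *
 * Build on FreeBSD/macOS: cc -o deathwatch deathwatch.c
 * Run as: ./deathwatch <pid>, then kill that pid while it waits.
 */
#include <sys/types.h>
#include <sys/event.h>
#include <sys/time.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

static bool known_dead = false;     /* the "level" we maintain ourselves */

static bool
wait_for_death(int kq)
{
    struct kevent ev;
    struct timespec timeout = {10, 0};  /* bounded wait, just for the demo */

    /* A previous wait already saw NOTE_EXIT: report it again immediately. */
    if (known_dead)
        return true;

    if (kevent(kq, NULL, 0, &ev, 1, &timeout) == 1 &&
        ev.filter == EVFILT_PROC && (ev.fflags & NOTE_EXIT) != 0)
        known_dead = true;          /* latch the one-shot kernel report */

    return known_dead;
}

int
main(int argc, char **argv)
{
    pid_t       pid;
    int         kq;
    struct kevent ev;

    if (argc != 2)
        return 1;
    pid = (pid_t) atoi(argv[1]);

    kq = kqueue();
    if (kq < 0)
        return 1;

    EV_SET(&ev, pid, EVFILT_PROC, EV_ADD, NOTE_EXIT, 0, NULL);
    if (kevent(kq, &ev, 1, NULL, 0, NULL) < 0)
    {
        perror("kevent add");       /* e.g. ESRCH if it is already gone */
        return 1;
    }

    /* Without the flag, only the first of these could ever report death. */
    printf("first wait:  %s\n", wait_for_death(kq) ? "dead" : "alive");
    printf("second wait: %s\n", wait_for_death(kq) ? "dead" : "alive");
    return 0;
}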



Re: kevent latch paths don't handle postmaster death well

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> The process exit event is like an 'edge', not a 'level'... hmm.  It
> might be enough to set report_postmaster_not_running = true the first
> time it tells us, so that if we try to wait again we'll treat it like a
> level.  I will look into it later today.

Seems like having that be per-WaitEventSet state is also not a great
idea --- if we detect PM death while waiting on one WES, and then
wait on another one, it won't work.  A plain process-wide static
variable would be a better way I bet.

            regards, tom lane



Re: kevent latch paths don't handle postmaster death well

From
Heikki Linnakangas
Date:
On 14/10/2020 21:58, Tom Lane wrote:
> I noticed that chipmunk failed [1] with a rather interesting log:
> 
> 2020-10-14 08:57:01.661 EEST [27048:6] pg_regress/prepared_xacts LOG:  statement: UPDATE pxtest1 SET foobar = 'bbb' WHERE foobar = 'aaa';
> 2020-10-14 08:57:01.721 EEST [27048:7] pg_regress/prepared_xacts LOG:  statement: SELECT * FROM pxtest1;
> 2020-10-14 08:57:01.823 EEST [27048:8] pg_regress/prepared_xacts FATAL:  postmaster exited during a parallel transaction
> TRAP: FailedAssertion("entry->trans == NULL", File: "pgstat.c", Line: 909, PID: 27048)
> 2020-10-14 08:57:01.861 EEST [27051:1] ERROR:  could not attach to dynamic shared area
> 2020-10-14 08:57:01.861 EEST [27051:2] STATEMENT:  SELECT * FROM pxtest1;
> 
> I do not know what happened to the postmaster, but seeing that chipmunk
> is a very small machine running a pretty old Linux kernel, it's plausible
> to guess that the OOM killer decided to pick on the postmaster.  (I wonder
> whether Heikki has taken any steps to prevent that on that machine.)

For the record, it was not the OOM killer. It was the buildfarm cron job 
that did it:

Oct 14 08:57:01 raspberrypi /USR/SBIN/CRON[27050]: (pgbfarm) CMD (killall -q -9 postgres; cd /home/pgbfarm/build-farm-client/ && ./run_branches.pl --run-all)

Apparently building and testing all the branches is now taking slightly 
more than 24 h on that system, so the next day's cron job kills the 
previous tests. I'm going to change the cron schedule so that it runs 
only every other day.

- Heikki



Re: kevent latch paths don't handle postmaster death well

From
Thomas Munro
Date:
On Thu, Oct 15, 2020 at 8:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > The process exit event is like an 'edge', not a 'level'... hmm.  It
> > might be enough to set report_postmaster_not_running = true the first
> > time it tells us, so that if we try to wait again we'll treat it like a
> > level.  I will look into it later today.
>
> Seems like having that be per-WaitEventSet state is also not a great
> idea --- if we detect PM death while waiting on one WES, and then
> wait on another one, it won't work.  A plain process-wide static
> variable would be a better way I bet.

I don't think that's a problem -- the kernel will report the event to
each interested kqueue object.  The attached fixes the problem for me.

Attachments

Re: kevent latch paths don't handle postmaster death well

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> On Thu, Oct 15, 2020 at 8:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Seems like having that be per-WaitEventSet state is also not a great
>> idea --- if we detect PM death while waiting on one WES, and then
>> wait on another one, it won't work.  A plain process-wide static
>> variable would be a better way I bet.

> I don't think that's a problem -- the kernel will report the event to
> each interested kqueue object.  The attached fixes the problem for me.

Oh, OK.  I confirm this makes the kqueue path work like the EPOLL and POLL
paths.  (I can't test the WIN32 path.)

            regards, tom lane



Re: kevent latch paths don't handle postmaster death well

From
Thomas Munro
Date:
On Thu, Oct 15, 2020 at 11:18 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Thomas Munro <thomas.munro@gmail.com> writes:
> > On Thu, Oct 15, 2020 at 8:40 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> Seems like having that be per-WaitEventSet state is also not a great
> >> idea --- if we detect PM death while waiting on one WES, and then
> >> wait on another one, it won't work.  A plain process-wide static
> >> variable would be a better way I bet.
>
> > I don't think that's a problem -- the kernel will report the event to
> > each interested kqueue object.  The attached fixes the problem for me.
>
> Oh, OK.  I confirm this makes the kqueue path work like the EPOLL and POLL
> paths.  (I can't test the WIN32 path.)

Thanks.  Pushed.

(Hmm, I wonder about that Windows process exit event.)



Re: kevent latch paths don't handle postmaster death well

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> (Hmm, I wonder about that Windows process exit event.)

If anyone wants to test that, I can save you a little time building
infrastructure, perhaps.  I used the attached program built into a .so.
After creating the function, invoke it, and once it's blocked kill -9
the postmaster.  If it successfully reports multiple WL_POSTMASTER_DEATH
results then it's good.

            regards, tom lane

/*

create function wait_on_process_latch() returns void
strict volatile language c as '.../whatever.so';

select wait_on_process_latch();

 */

#include "postgres.h"

#include "fmgr.h"
#include "miscadmin.h"
#include "storage/latch.h"
#include "pgstat.h"

PG_MODULE_MAGIC;

/*
 * wait_on_process_latch() returns void
 */
PG_FUNCTION_INFO_V1(wait_on_process_latch);
Datum
wait_on_process_latch(PG_FUNCTION_ARGS)
{
    int count = 10;

    while (count-- > 0)
    {
        int            rc;

        CHECK_FOR_INTERRUPTS();

        elog(INFO, "waiting on latch");

        rc = WaitLatch(MyLatch,
                       WL_LATCH_SET | WL_POSTMASTER_DEATH, 0,
                       WAIT_EVENT_BGWORKER_SHUTDOWN);

        if (rc & WL_POSTMASTER_DEATH)
        {
            elog(INFO, "got WL_POSTMASTER_DEATH report");
        }

        ResetLatch(MyLatch);
    }

    PG_RETURN_VOID();
}

Re: kevent latch paths don't handle postmaster death well

From
Andres Freund
Date:
Hi,

On 2020-10-15 11:10:28 +1300, Thomas Munro wrote:
> I don't think that's a problem -- the kernel will report the event to
> each interested kqueue object.  The attached fixes the problem for me.

Will it do so even if the kqueue is created after postmaster death?

- Andres



Re: kevent latch paths don't handle postmaster death well

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2020-10-15 11:10:28 +1300, Thomas Munro wrote:
>> I don't think that's a problem -- the kernel will report the event to
>> each interested kqueue object.  The attached fixes the problem for me.

> Will it do so even if the kqueue is created after postmaster death?

I did not try to test it, but there's code that purports to handle that
in latch.c, ~ line 1150, and the behavior it's expecting mostly agrees
with what I read in the macOS kevent man page.  One thing I'd suggest
is that EACCES probably needs to be treated as "postmaster already dead",
too, in case the PID is now owned by another user ID.
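
A sketch of that handling at EVFILT_PROC registration time (illustrative
helper and names, not the actual latch.c code; ESRCH covers "already
gone", EACCES would cover a recycled PID now owned by someone else):

/*
 * Illustrative helper, not latch.c: register a NOTE_EXIT watch and fold
 * "can't watch it because it's already gone (ESRCH) or its PID has been
 * recycled under another user (EACCES)" into a result the caller can
 * treat as "postmaster already dead".
 */
#include <sys/types.h>
#include <sys/event.h>
#include <errno.h>
#include <stddef.h>

typedef enum
{
    WATCH_ADDED,                    /* watch registered normally */
    WATCH_TARGET_ALREADY_DEAD,      /* ESRCH or EACCES: assume it died */
    WATCH_FAILED                    /* anything else: caller reports errno */
} WatchResult;

static WatchResult
add_proc_exit_watch(int kq, pid_t pid)
{
    struct kevent ev;

    EV_SET(&ev, pid, EVFILT_PROC, EV_ADD, NOTE_EXIT, 0, NULL);
    if (kevent(kq, &ev, 1, NULL, 0, NULL) < 0)
    {
        if (errno == ESRCH || errno == EACCES)
            return WATCH_TARGET_ALREADY_DEAD;
        return WATCH_FAILED;
    }
    return WATCH_ADDED;
}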

            regards, tom lane



Re: kevent latch paths don't handle postmaster death well

From
Thomas Munro
Date:
On Thu, Oct 15, 2020 at 12:14 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2020-10-15 11:10:28 +1300, Thomas Munro wrote:
> >> I don't think that's a problem -- the kernel will report the event to
> >> each interested kqueue object.  The attached fixes the problem for me.
>
> > Will it do so even if the kqueue is created after postmaster death?
>
> I did not try to test it, but there's code that purports to handle that
> in latch.c, ~ line 1150, and the behavior it's expecting mostly agrees
> with what I read in the macOS kevent man page.  One thing I'd suggest

Yep, I did handle the obvious races here.

> is that EACCES probably needs to be treated as "postmaster already dead",
> too, in case the PID is now owned by another user ID.

Good point.  I'll push that change later today.



Re: kevent latch paths don't handle postmaster death well

From
Thomas Munro
Date:
On Thu, Oct 15, 2020 at 12:55 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> On Thu, Oct 15, 2020 at 12:14 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > I did not try to test it, but there's code that purports to handle that
> > in latch.c, ~ line 1150, and the behavior it's expecting mostly agrees
> > with what I read in the macOS kevent man page.  One thing I'd suggest
> > is that EACCES probably needs to be treated as "postmaster already dead",
> > too, in case the PID is now owned by another user ID.
>
> Good point.  I'll push that change later today.

I tried to test this on my system but it seems like maybe FreeBSD
can't really report EACCES for EVFILT_PROC.  From the man page and a
quick inspection of the source, you only have to be able to "see" the
process, and if you can't I think you'll get ESRCH, so EACCES may be
for other kinds of filters.  I don't currently have any Apple gear to
hand, and its man page uses the same language, but I do see EACCES in
filt_procattach() in the darwin-xnu sources on GitHub, so I guess you can
reach this case and get an ugly ereport (hopefully followed swiftly by a
proc_exit() from the next wait on one of the long-lived WESs, or a FATAL
if this was the creation of one of those).  Fixed.  Thanks!



Re: kevent latch paths don't handle postmaster death well

From
Thomas Munro
Date:
On Thu, Oct 15, 2020 at 6:42 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> I tried to test this on my system but it seems like maybe FreeBSD
> can't really report EACCES for EVFILT_PROC.  From the man page and a
> quick inspection of the source, you only have to be able to "see" the
> process, and if you can't I think you'll get ESRCH, so EACCES may be
> for other kinds of filters.  I don't currently have any Apple gear to
> hand, and its man page uses the same language, but I do see EACCES in
> filt_procattach() in the darwin-xnu sources on GitHub, so I guess you can
> reach this case and get an ugly ereport (hopefully followed swiftly by a
> proc_exit() from the next wait on one of the long-lived WESs, or a FATAL
> if this was the creation of one of those).

I couldn't resist digging further into the Apple sources to figure out
what was going on there, and I realised that the code path I was
looking at can only report EACCES if you asked for NOTE_EXITSTATUS,
which appears to be an Apple extension to the original FreeBSD kqueue
system designed to let you receive the exit status of the monitored
process.  That is indeed much more privileged information, and it's
only allowed for your own children.  So it's possible that commit
70516a17 was a waste of electrons, but I don't think it can hurt;
either way, our system is toast if we get that error, so it's mostly
just a question of what sort of noises we make as we fail, if indeed
any system really can produce EACCES for NOTE_EXIT (maybe in some
other code path I haven't found, or some other cousin BSD).



Re: kevent latch paths don't handle postmaster death well

From
Tom Lane
Date:
Thomas Munro <thomas.munro@gmail.com> writes:
> I couldn't resist digging further into the Apple sources to figure out
> what was going on there, and I realised that the code path I was
> looking at can only report EACCES if you asked for NOTE_EXITSTATUS,
> which appears to be an Apple extension to the original FreeBSD kqueue
> system designed to let you receive the exit status of the monitored
> process.  That is indeed much more privileged information, and it's
> only allowed for your own children.

Ah.

> So it's possible that commit
> 70516a17 was a waste of electrons, but I don't think it can hurt;

Yeah, I'm not inclined to revert it.  If we did get that errno,
it'd be hard to interpret it in any way that didn't involve the
postmaster being gone.

            regards, tom lane