Re: Pg stuck at 100% cpu, for multiple days
From | Joe Conway |
---|---|
Subject | Re: Pg stuck at 100% cpu, for multiple days |
Date | |
Msg-id | 257d9bd3-6cd4-4307-2a6d-f78a5b9eba7d@joeconway.com |
In reply to | Re: Pg stuck at 100% cpu, for multiple days (Justin Pryzby <pryzby@telsasoft.com>) |
Responses |
Re: Pg stuck at 100% cpu, for multiple days
Re: Pg stuck at 100% cpu, for multiple days |
List | pgsql-hackers |
On 8/30/21 3:34 PM, Justin Pryzby wrote:
> On Mon, Aug 30, 2021 at 09:09:20PM +0200, Laurenz Albe wrote:
>> On Mon, 2021-08-30 at 17:18 +0200, hubert depesz lubaczewski wrote:
>> > The thing is - I can't close it with pg_terminate_backend(), and I'd
>> > rather not kill -9, as it will, I think, close all other connections,
>> > and this is prod server.
>>
>> Of course the cause should be fixed, but to serve your immediate need:
>
> You might save a coredump of the process using gdb gcore before killing it, in
> case someone thinks how to debug it next month.
>
> Depending on your OS, you might have to do something special to get shared
> buffers included in the dump (or excluded, if that's what's desirable).
>
> I wonder how far up the stacktrace it's stuck ?
> You could set a breakpoint on LogicalDecodingProcessRecord and then "c"ontinue,
> and see if it hits the breakpoint in a few seconds. If not, try the next
> frame until you know which one is being called repeatedly.
>
> Maybe CheckForInterrupts should be added somewhere...

The spot in the backtrace...

#0 hash_seq_search (status=status@entry=0xffffdd90f380)
   at ./build/../src/backend/utils/hash/dynahash.c:1448

...is in the middle of this while loop:

8<-----------------------------------------
    while ((curElem = segp[segment_ndx]) == NULL)
    {
        /* empty bucket, advance to next */
        if (++curBucket > max_bucket)
        {
            status->curBucket = curBucket;
            hash_seq_term(status);
            return NULL;    /* search is done */
        }
        if (++segment_ndx >= ssize)
        {
            segment_num++;
            segment_ndx = 0;
            segp = hashp->dir[segment_num];
        }
    }
8<-----------------------------------------

It would be interesting to step through a few times to see if it is
really stuck in that loop. That would be consistent with 100% CPU and
not checking for interrupts I think.

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development
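For anyone who wants to reason about that loop outside the server, here is a minimal standalone sketch of the same empty-bucket scan. It is not the dynahash source: Table, Elem, SSIZE, NSEGS and next_nonempty() are hypothetical names made up for illustration, and the directory is a tiny fixed array. The added steps counter shows that, with sane state, one scan visits at most max_bucket + 1 buckets, and that nothing inside the loop looks at pending interrupts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for a segmented hash directory. */
#define SSIZE 4                     /* buckets per segment (tiny, for the demo) */
#define NSEGS 2                     /* number of segments */

typedef struct Elem { int key; } Elem;

typedef struct Table
{
    Elem       *dir[NSEGS][SSIZE];  /* dir[segment][index]: bucket head or NULL */
    unsigned    max_bucket;         /* highest valid bucket number */
} Table;

/*
 * Scan forward from start_bucket to the first non-empty bucket.
 * Returns its head element, or NULL once max_bucket has been passed.
 * *steps is set to the number of empty buckets that were skipped.
 * Note: like the loop quoted above, nothing here checks for interrupts.
 */
static Elem *
next_nonempty(const Table *t, unsigned start_bucket, unsigned *steps)
{
    unsigned    curBucket = start_bucket;
    unsigned    segment_num = curBucket / SSIZE;
    unsigned    segment_ndx = curBucket % SSIZE;
    Elem       *curElem;

    assert(start_bucket <= t->max_bucket);
    *steps = 0;

    while ((curElem = t->dir[segment_num][segment_ndx]) == NULL)
    {
        (*steps)++;
        /* empty bucket, advance to next */
        if (++curBucket > t->max_bucket)
            return NULL;            /* search is done */
        if (++segment_ndx >= SSIZE)
        {
            segment_num++;
            segment_ndx = 0;
        }
    }
    return curElem;
}
```

On a completely empty 8-bucket table the scan gives up after 8 steps. So if single-stepping in gdb shows the real loop exiting and being re-entered, rather than spinning on one call, the repetition would be coming from a frame higher up the stack.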