Re: [HACKERS] intermittent failures in Cygwin from select_parallel tests
From | Robert Haas |
---|---|
Subject | Re: [HACKERS] intermittent failures in Cygwin from select_parallel tests |
Date | |
Msg-id | CA+TgmoYaqJQKtvvbATFzsTsWVZkoB-rff16Ts4osn0fCzVe=CA@mail.gmail.com |
In response to | Re: [HACKERS] intermittent failures in Cygwin from select_parallel tests (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: [HACKERS] intermittent failures in Cygwin from select_parallel tests
|
List | pgsql-hackers |
On Thu, Jun 15, 2017 at 10:21 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> Yes, I think it is for next query. If you refer the log below from lorikeet:
>
> 2017-06-13 16:44:57.179 EDT [59404ec6.2758:63] LOG: statement:
> EXPLAIN (analyze, timing off, summary off, costs off) SELECT * FROM
> tenk1;
> 2017-06-13 16:44:57.247 EDT [59404ec9.2e78:1] ERROR: could not map
> dynamic shared memory segment
> 2017-06-13 16:44:57.248 EDT [59404dec.2d9c:5] LOG: worker process:
> parallel worker for PID 10072 (PID 11896) exited with exit code 1
> 2017-06-13 16:44:57.273 EDT [59404ec6.2758:64] LOG: statement: select
> stringu1::int2 from tenk1 where unique1 = 1;
> TRAP: FailedAssertion("!(BackgroundWorkerData->parallel_register_count
> - BackgroundWorkerData->parallel_terminate_count <= 1024)", File:
> "/home/andrew/bf64/root/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c",
> Line: 974)
> 2017-06-13 16:45:02.652 EDT [59404dec.2d9c:6] LOG: server process
> (PID 10072) was terminated by signal 6: Aborted
> 2017-06-13 16:45:02.652 EDT [59404dec.2d9c:7] DETAIL: Failed process
> was running: select stringu1::int2 from tenk1 where unique1 = 1;
> 2017-06-13 16:45:02.652 EDT [59404dec.2d9c:8] LOG: terminating any
> other active server processes
>
> Error "could not map dynamic shared memory segment" is due to query
> "EXPLAIN .. SELECT * FROM tenk1" and Assertion failure is due to
> another statement "select stringu1::int2 from tenk1 where unique1 =
> 1;".

I think you're right. So here's a theory:

1. The ERROR mapping the DSM segment is just a case of the worker
losing a race, and isn't a bug.

2. But when that happens, parallel_terminate_count is getting bumped
twice for some reason.

3. So then the leader process fails that assertion when it tries to
launch the parallel workers for the next query.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
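
To illustrate why a double bump in step 2 would trip the assertion in step 3, here is a minimal sketch. It is not the bgworker.c code: the struct and function names (SketchCounters, sketch_run_one_worker, sketch_invariant_holds) are hypothetical, and it assumes the two counters are unsigned 32-bit integers. Under that assumption, a terminate count that runs ahead of the register count makes the difference wrap around to a very large value instead of going negative, so the "<= 1024" check fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the 1024 limit seen in the failed assertion. */
#define SKETCH_MAX_PARALLEL_WORKERS 1024

/*
 * Hypothetical model of the two shared counters.  Assumption: both are
 * unsigned 32-bit integers.
 */
typedef struct SketchCounters
{
    uint32_t parallel_register_count;
    uint32_t parallel_terminate_count;
} SketchCounters;

/* Same condition as the assertion in the log, evaluated in uint32 arithmetic. */
static bool
sketch_invariant_holds(const SketchCounters *c)
{
    uint32_t active = c->parallel_register_count - c->parallel_terminate_count;

    return active <= SKETCH_MAX_PARALLEL_WORKERS;
}

/*
 * Register one worker, then count its termination terminate_bumps times.
 * terminate_bumps = 1 is the expected case; 2 models the suspected bug.
 */
static SketchCounters
sketch_run_one_worker(int terminate_bumps)
{
    SketchCounters c = {0, 0};

    c.parallel_register_count++;
    for (int i = 0; i < terminate_bumps; i++)
        c.parallel_terminate_count++;
    return c;
}
```

With one bump the difference is 0 and the check passes. With two bumps the difference is 1 - 2, which wraps to UINT32_MAX under the unsigned assumption, so the next attempt to register a parallel worker would hit the assertion even though no workers are actually running.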