Re: pg_listener entries deleted under heavy NOTIFY load only on Windows
From | Marshall, Steve |
---|---|
Subject | Re: pg_listener entries deleted under heavy NOTIFY load only on Windows |
Date | |
Msg-id | 8536F69C1FCC294B859D07B179F0694411A4CEF8@EXCHANGE.ad.wsicorp.com |
In reply to | Re: pg_listener entries deleted under heavy NOTIFY load only on Windows (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-bugs |
As I posted before, changing the timeout from 1000 to NMPWAIT_WAIT_FOREVER fixed the problem, or at least moved it such that it does not occur easily anymore.

To better understand the problem, I added debugging as Tom suggested. I restored the timeout on CallNamedPipe to 1000 ms and reran my tests. Indeed, kill is encountering an error:

    LOG: kill(2168) failed: No such process

I instrumented pgkill to output the value of GetLastError() if CallNamedPipe fails. It returned error code 2, which Windows identifies as ERROR_FILE_NOT_FOUND. The logic in pgkill translates this Windows error into an errno value of ESRCH.

The Windows error is a bit surprising, at least to me -- I expected something indicating the pipe was full. Does anyone have a richer interpretation of this error?

Thanks,
Steve

-----Original Message-----
From: Tom Lane [mailto:tgl@sss.pgh.pa.us]

Alvaro Herrera <alvherre@commandprompt.com> writes:
> Marshall, Steve wrote:
>> Any thoughts on how to confirm or deny Theory A?

> Try changing the 1000 to NMPWAIT_WAIT_FOREVER

As long as you're changing the source code, it'd be a good idea to verify the supposition that kill() is failing, eg in src/backend/commands/async.c

              if (kill(listenerPID, SIGUSR2) < 0)
              {
    +             elog(LOG, "kill(%d) failed: %m", listenerPID);
                  /*
                   * Get rid of pg_listener entry if it refers to a PID that no
                   * longer exists.  Presumably, that backend crashed without
                   * deleting its pg_listener entries. This code used to only

If that's right, sprinkling a few debug printouts into src/port/kill.c would be the next step.

regards, tom lane
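For readers following along, the error translation described in the message (Windows error 2, ERROR_FILE_NOT_FOUND, surfacing as ESRCH and hence "No such process") can be sketched roughly as below. This is only an illustration, not the code in src/port/kill.c: the helper name, the locally defined constant, and the EINVAL fallback for other errors are all assumptions made for this sketch. It is written as portable C so it compiles without the Windows headers.

```c
#include <errno.h>

/* Windows error code 2, as reported in the message above.
 * Defined locally (hypothetical name) so the sketch builds anywhere. */
#define SKETCH_ERROR_FILE_NOT_FOUND 2UL

/*
 * Hypothetical helper, not the actual pgkill code: translate the
 * GetLastError() value from a failed CallNamedPipe into an errno value.
 */
int
sketch_pipe_error_to_errno(unsigned long winerr)
{
	/* Pipe not found: treated as if the target process does not exist. */
	if (winerr == SKETCH_ERROR_FILE_NOT_FOUND)
		return ESRCH;

	/* Assumption for this sketch: any other failure becomes EINVAL. */
	return EINVAL;
}
```

Under a mapping like this, any failure to find or open the signal pipe looks the same to the caller as a dead backend, which would explain why async.c goes on to remove the pg_listener entry.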