Re: Some 9.5beta2 backend processes not terminating properly?
From | Andres Freund |
---|---|
Subject | Re: Some 9.5beta2 backend processes not terminating properly? |
Date | |
Msg-id | 20160102132647.mlwrv7dvtc3qzki5@alap3.anarazel.de |
In response to | Re: Some 9.5beta2 backend processes not terminating properly? (Amit Kapila <amit.kapila16@gmail.com>) |
Responses | Re: Some 9.5beta2 backend processes not terminating properly? |
List | pgsql-hackers |
On 2016-01-02 18:40:38 +0530, Amit Kapila wrote:
> What I wanted to say is that the handling of socket closure is not
> same in WaitLatchOrSocket() and pgwin32_waitforsinglesocket()
> due to which this problem can arise and it seems that is the
> right line of direction to pursue. I have found that
> in WaitLatchOrSocket(),
> even when the socket is closed, we remember the result as
> WL_SOCKET_READABLE and again tries to wait whereas the
> same is handled properly in pgwin32_waitforsinglesocket().

That's actually intentional, and part of the design:

 * When waiting on a socket, EOF and error conditions are reported by
 * returning the socket as readable/writable or both, depending on
 * WL_SOCKET_READABLE/WL_SOCKET_WRITEABLE being specified.

The way this is supposed to work, and does on unixoid systems, is that
WaitLatchOrSocket() returns, the recv is retried and signals an error.

> If we
> remember the closed socket event and then take appropriate action,
> then this problem won't happen. Attached patch which by no-means
> a complete fix shows what I wanted to say and after this the problem
> mentioned by Shay doesn't happen, although I get LOG message
> which is due to the reason that proper handling for socket closure
> needs to be done in this path. This patch is based on the code
> after commit 387da18874afa17156ee3af63766f17efb53c4b9. I
> will do testing and refine the fix based on HEAD later as I am done
> for the today.

It's weird that this fixes the problem. As we were previously, according
to Shay, not busy looping, this seems to indicate that FD_CLOSE is only
reported once or somesuch?

It'd be very interesting to add a debug elog() into the

    if (resEvents.lNetworkEvents & FD_CLOSE)
    {
        if (wakeEvents & WL_SOCKET_READABLE)
            result |= WL_SOCKET_READABLE;
        if (wakeEvents & WL_SOCKET_WRITEABLE)
            result |= WL_SOCKET_WRITEABLE;
    }

path in WaitLatchOrSocket. If it actually returns with the current code,
we have a better idea where to look for problems.

Greetings,

Andres Freund
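
The wait-then-retry behaviour described above for unixoid systems can be sketched as a small standalone POSIX C fragment. This is only an illustration under stated assumptions, not PostgreSQL code: plain poll() and recv() stand in for WaitLatchOrSocket() and the backend's receive path, a local socketpair() stands in for the client connection, and the function name wait_then_recv_after_close is hypothetical.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of PostgreSQL): create a connected socket
 * pair, close one end, then wait for readability on the other end and
 * retry the receive.
 *
 * Returns what recv() returned (0 means EOF, -1 means error), or -2 if
 * setting up the sockets or the wait itself failed.  The event mask that
 * poll() reported is stored in *revents_out.
 */
ssize_t
wait_then_recv_after_close(short *revents_out)
{
    int           fds[2];
    struct pollfd pfd;
    char          buf[16];
    ssize_t       n;

    assert(revents_out != NULL);

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -2;

    /* The peer goes away, like a client closing its connection. */
    close(fds[1]);

    /* Wait for "readable" only, with a one second timeout. */
    pfd.fd = fds[0];
    pfd.events = POLLIN;
    pfd.revents = 0;

    if (poll(&pfd, 1, 1000) != 1)
    {
        close(fds[0]);
        return -2;
    }
    *revents_out = pfd.revents;

    /* The wait returned, so retry the read; this is where EOF shows up. */
    n = recv(fds[0], buf, sizeof(buf), 0);
    close(fds[0]);
    return n;
}
```

Called from a main(), the helper should return from the wait promptly with the socket flagged as readable and/or hung up, and the retried recv() then reports the closure. The caller never has to remember a separate "closed" event, which is the point of reporting EOF as readability.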