Re: BUG #3504: Some listening sessions never return from writing, problems ensue
From | Peter Koczan |
---|---|
Subject | Re: BUG #3504: Some listening sessions never return from writing, problems ensue |
Date | |
Msg-id | 4544e0330708091155r2db59ea4w6b2e34cbbc8d3ae3@mail.gmail.com |
In reply to | Re: BUG #3504: Some listening sessions never return from writing, problems ensue (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: BUG #3504: Some listening sessions never return from writing, problems ensue |
List | pgsql-bugs |
On 8/6/07, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Peter Koczan" <pjkoczan@gmail.com> writes:
> > Here's my theory (and feel free to tell me that I'm full of it)...somehow, a
> > lot of notifies happened at once, or in a very short period of time, to the
> > point where the app was still processing notifies when the timer clicked off
> > another second. The connection (or app, or perl module) never marked those
> > notifies as being processed, or never updated its timestamp of when it
> > finished, so when the next notify came around, it tried to reprocess the old
> > data (or data since the last time it finished), and yet again couldn't
> > finish. Lather, rinse, repeat. In sum, it might be that trying to call
> > pg_notifies while processing notifies tickles a race condition and tricks
> > the connection into thinking its in a bad state.
>
> Hmm. Is the app trying to do this processing inside an interrupt
> service routine (a/k/a signal handler)? If so, and if the ISR can
> interrupt itself, then you've got a problem because you'll be doing
> reentrant calls of libpq, which it doesn't support. You can only make
> that work if the handler blocks further occurrences of its signal until
> it finishes.
>

I'm not entirely sure if this answers your question, but here's what I found out from the primary maintainer of the app. Note that update_reqs is the function calling pg_notifies. If there's more information I can provide or another test we can run, please let me know.

------- BEGIN MESSAGE -------
I just checked and the timer won't interrupt update_reqs, so we'll have to look for another solution. Anyway, update_reqs doesn't do anything with the database except for checking for a notify, so I don't see where it can be interrupted to cause DB problems.
------- END MESSAGE -------

I also found out that one notify gets sent per action (not per batch of actions), so if n requests get resolved at once, n notifies are sent, not 1.
In theory, sending one notify per batch instead could mitigate this problem, but I don't know how easy that change is at this point. Still, it doesn't explain how or why the client's recv-q isn't getting cleared.

Hope this helps.

Peter
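
For illustration only, the per-action versus per-batch point could be handled on the listening side by collapsing a burst of notifies into a single refresh per channel. This is a minimal sketch, not the app's actual code: the function name `coalesce_notifies`, the channel name `reqs_changed`, the pid value, and the `(channel, pid)` tuple shape are all hypothetical, and no database call is made.

```python
from collections import Counter


def coalesce_notifies(pending):
    """Collapse a burst of (channel, pid) notifies into a count per channel.

    `pending` is a hypothetical list of (channel, sender_pid) tuples that the
    client has already drained from its connection.
    """
    counts = Counter(channel for channel, _pid in pending)
    return dict(counts)


# Hypothetical burst: five requests resolved at once, one notify each.
burst = [("reqs_changed", 4242)] * 5

for channel, n in coalesce_notifies(burst).items():
    # One refresh per channel, no matter how many notifies arrived.
    print(f"{channel}: {n} notifies -> 1 refresh")
```

With this shape the amount of refresh work depends on the number of distinct channels rather than on n. It would not, by itself, explain a receive queue that never drains.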