Re: 8.4-vintage problem in postmaster.c
From | Stefan Kaltenbrunner |
---|---|
Subject | Re: 8.4-vintage problem in postmaster.c |
Date | |
Msg-id | 4CED5674.5070800@kaltenbrunner.cc |
In reply to | Re: 8.4-vintage problem in postmaster.c (Alvaro Herrera <alvherre@commandprompt.com>) |
List | pgsql-hackers |
On 11/15/2010 03:24 PM, Alvaro Herrera wrote:
> Excerpts from Tom Lane's message of sáb nov 13 19:07:50 -0300 2010:
>> Stefan Kaltenbrunner<stefan@kaltenbrunner.cc> writes:
>>> On 11/13/2010 06:58 PM, Tom Lane wrote:
>>>> Just looking at it, I think that the logic in canAcceptConnections got
>>>> broken by somebody in 8.4, and then broken some more in 9.0: in some
>>>> cases it will return an "okay to proceed" status without having checked
>>>> for TOOMANY children. Was this system possibly in PM_WAIT_BACKUP or
>>>> PM_HOT_STANDBY state? What version was actually running?
>>
>>> I don't have too many details on the actual setup (working on that) but
>>> the box in question is running 8.4.2 and had no issues before the
>>> upgrade to 8.4 (ie 8.3 was reported to work fine - so a 8.4+ breakage
>>> looks plausible).
>>
>> Well, this failure would certainly involve a flood of connection
>> attempts, so it's possible it's a pre-existing bug that they just did
>> not happen to trip over before. But the sequence of events that I'm
>> thinking about is a smart shutdown attempt (SIGTERM to postmaster)
>> while an online backup is in progress, followed by a flood of
>> near-simultaneous connection attempts while the backup is still active.
>
> As far as I could gather from Stefan's description, I think this is
> pretty unlikely. It seems to me that the "too many children" error
> message is very common in the 8.3 setup already, and the only reason
> they have a problem on 8.4 is that it crashes instead.

Not sure if that is true - but 8.4 crashes whereas 8.3 just (seems to)
work - the issue is still there with 8_4_STABLE...
DEBUG3 level output (last few hours - 7MB in size) is available under

http://www.kaltenbrunner.cc/files/postgresql-2010-11-24_143513.log

From looking at the code I'm not immediately seeing what is going wrong
here but maybe somebody else has an idea.

Stefan
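
For readers following the thread, the failure mode Tom describes (an "okay to proceed" status returned before the TOOMANY check) can be illustrated with a toy model. This is only a sketch under assumptions: the enum values, function names, and parameters below are hypothetical stand-ins and not the actual definitions in postmaster.c, and the second function shows just one possible way to order the checks, not necessarily the fix that was committed.

```c
/* Hypothetical, simplified stand-ins; NOT the real postmaster.c types. */
typedef enum { PM_STARTUP, PM_RUN, PM_WAIT_BACKUP } PMState;
typedef enum { CAC_OK, CAC_STARTUP, CAC_WAITBACKUP, CAC_TOOMANY } CAC_state;

/*
 * Early-return variant: in PM_WAIT_BACKUP the function answers before the
 * child-count check is reached, so a flood of connection attempts is never
 * rejected with CAC_TOOMANY.
 */
static CAC_state
can_accept_early_return(PMState state, int children, int max_children)
{
    if (state != PM_RUN)
    {
        if (state == PM_WAIT_BACKUP)
            return CAC_WAITBACKUP;   /* limit is never consulted here */
        return CAC_STARTUP;
    }
    if (children >= max_children)
        return CAC_TOOMANY;
    return CAC_OK;
}

/*
 * Checked variant (assumed ordering): remember the provisional answer,
 * then apply the child-count limit to every state that may proceed.
 */
static CAC_state
can_accept_checked(PMState state, int children, int max_children)
{
    CAC_state result = CAC_OK;

    if (state == PM_STARTUP)
        return CAC_STARTUP;          /* no backend is started at all */
    if (state == PM_WAIT_BACKUP)
        result = CAC_WAITBACKUP;     /* superusers only, but still counted */
    if (children >= max_children)
        return CAC_TOOMANY;
    return result;
}
```

With a limit of 10 children and 50 already running, the early-return variant still reports CAC_WAITBACKUP in PM_WAIT_BACKUP, while the checked variant reports CAC_TOOMANY; in PM_RUN both variants agree.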