Re: [HACKERS] Backends waiting, spinlocks, shared mem patches
From | Tom Lane |
---|---|
Subject | Re: [HACKERS] Backends waiting, spinlocks, shared mem patches |
Date | |
Msg-id | 5643.928390976@sss.pgh.pa.us |
In response to | Re: [HACKERS] Backends waiting, spinlocks, shared mem patches (Wayne Piekarski <wayne@senet.com.au>) |
List | pgsql-hackers |

Wayne Piekarski <wayne@senet.com.au> writes:
> Unfortunately, this is not the kind of thing I can reproduce with a
> testing program, and so I can't try it against 6.5 - but it still exists
> in 6.4.2 so unless someones made more changes related to this area, there
> might be a chance it is still in 6.5 - although the locking code has been
> changed a lot maybe not?

I honestly don't know what to tell you here. There have been a huge
number of changes and bugfixes between 6.4.2 and 6.5, but there's really
no way to guess from your report whether any of them will cure your
problem (or, perhaps, make it worse :-(). I wish you could run
6.5-current for a while under your live load and see how it fares.
But I understand your reluctance to do that.

> Is there anything I can do, like enable some extra debugging code,

There is some debug logging code in the lockmanager, but it produces
a huge volume of log output when turned on, and I for one am not
qualified to decipher it (perhaps one of the other list members can
offer more help). What I'd suggest first is trying to verify that
it *is* a lock problem. Attaching to some of the hung backends with
gdb and dumping their call stacks with "bt" could be very illuminating.
Especially if you compile the backend with -g first.

> One thing I thought is this problem could still be related to the
> spinlock/semget problem. ie, too many backends start up, something fails
> and dies off, but leaves a semaphore laying around, and so from then
> onwards, all the backends are waiting for this semaphore to go when it is
> still hanging around, causing problems ...

IIRC, 6.4.* will absolutely *not* recover from running out of kernel
semaphores or backend process slots. This is fixed in 6.5, and I think
someone posted a patch for 6.4 that covers the essentials, but I do not
recall the details.

regards, tom lane
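
The gdb suggestion above (attach to a hung backend and run "bt") could be scripted roughly as follows. This is only a sketch under assumptions not stated in the message: it assumes a gdb new enough to accept the `-p`, `-batch`, and `-ex` options, that the caller has permission to attach to the process, and that the backend PID has already been found by other means (for example with ps). The function names `gdb_backtrace_command` and `dump_backtrace` are hypothetical helpers, not part of PostgreSQL or gdb.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid`, prints a backtrace, and exits.

    Assumption: the installed gdb supports -p, -batch and -ex.
    """
    if pid <= 0:
        raise ValueError("pid must be a positive integer")
    # -p attaches to the running process, -ex "bt" dumps its call stack,
    # -batch makes gdb exit (and detach) once the commands have run.
    return ["gdb", "-p", str(pid), "-batch", "-ex", "bt"]


def dump_backtrace(pid):
    """Run gdb against a hung backend and return whatever it printed."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero if the attach fails; keep the output anyway
    )
    return result.stdout + result.stderr


# Example (hypothetical PID of a hung backend):
#   print(dump_backtrace(12345))
```

The stacks are far more readable if the backend was built with -g, as noted above; without debug symbols gdb may show only addresses or bare function names.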