Re: [HACKERS] Major bug, possible, with Solaris 7?
From | Tom Lane |
---|---|
Subject | Re: [HACKERS] Major bug, possible, with Solaris 7? |
Date | |
Msg-id | 21979.919547332@sss.pgh.pa.us |
In response to | RE: [HACKERS] Major bug, possible, with Solaris 7? ("Daryl W. Dunbar" <daryl@www.com>) |
List | pgsql-hackers |
"Daryl W. Dunbar" <daryl@www.com> writes:
> Problem still exists in 6.4.3.

I figured it probably would :-(.

As far as I can tell from your truss trace, the processes are going to sleep via semop() and never being awoken. There's not much more that we can find out at the kernel level, since the kernel can't tell *why* a backend thinks it needs to go to sleep. Assuming that TEST_AND_SET is defined in your compilation, the backends only use one semaphore apiece, and all blocking/awakening is done via that same semaphore. We need to know what lock-manager condition is causing each backend to decide to block, and why the lock is not getting released.

I was hoping that a gdb backtrace would tell us more --- it's bad that you can't get any info that way. On my system (HPUX) gdb has a problem with debugging shared libraries in a process that you attach to, as opposed to one started fresh under gdb. I dunno if Solaris is similar, but it might be worth building your -g version of the backend with no shared libraries, everything linked statically (-static option, I think, when linking the postgres binary). If your system doesn't have a static version of libc then this won't help.

But probably the first thing to try at this point is adding a bunch of debugging printouts. If you compile with -DLOCK_MGR_DEBUG (see src/backend/storage/lmgr/lock.c) and turn on the trace-locks option, then you'll get a bunch more log output that should tell us something useful about why the processes are deciding to block.

regards, tom lane
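
For readers unfamiliar with the mechanism described in the message, here is a minimal sketch in C of one process going to sleep in semop() on a single System V semaphore and being awakened by another. This is not PostgreSQL's code: the names `sem_adjust`, `demo_sleep_and_wake`, and `union semun_arg` are hypothetical, and the one-second pause is only a crude way to let the child block first. If the wake-up call were never made, the child would sit in semop() indefinitely, which is roughly what a truss trace of a stuck process would look like.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical name: many systems expect the caller to declare this union
 * for semctl(); it is named differently here to avoid clashing with
 * platforms that already define union semun. */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Add delta to semaphore 0 of the set. With delta = -1 and a current value
 * of 0, the caller sleeps inside semop() until another process raises it. */
static int sem_adjust(int semid, short delta)
{
    struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = 0 };
    return semop(semid, &op, 1);
}

/* Returns 0 if the child blocked on the semaphore and was then awakened. */
int demo_sleep_and_wake(void)
{
    union semun_arg arg;
    int status = 0;
    pid_t pid;

    /* One private semaphore, standing in for a backend's single semaphore. */
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    if (semid < 0)
        return -1;

    arg.val = 0;                        /* start at 0 so a decrement blocks */
    if (semctl(semid, 0, SETVAL, arg) < 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    if (pid == 0) {
        /* Child: plays the waiting process; sleeps here until awakened. */
        _exit(sem_adjust(semid, -1) == 0 ? 0 : 1);
    }

    sleep(1);                           /* crude: give the child time to block */

    /* Parent: plays the process releasing the lock. Omitting this call
     * leaves the child asleep forever. */
    if (sem_adjust(semid, 1) < 0) {
        semctl(semid, 0, IPC_RMID);     /* removal makes the child's semop fail */
        waitpid(pid, &status, 0);
        return -1;
    }

    waitpid(pid, &status, 0);
    semctl(semid, 0, IPC_RMID);         /* remove the semaphore set */
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling `demo_sleep_and_wake()` from a small `main()` and checking for a return value of 0 is enough to try it. Running the program under truss (or strace on Linux) should show the child parked in semop() during the pause.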