Re: [HACKERS] Problem after removal of exec(), help
From | Goran Thyni |
---|---|
Subject | Re: [HACKERS] Problem after removal of exec(), help |
Date | |
Msg-id | 358F8D5E.7EE08AC6@bildbasen.se |
In response to | Problem after removal of exec(), help (Bruce Momjian <maillist@candle.pha.pa.us>) |
Responses |
Re: [HACKERS] Problem after removal of exec(), help
|
List | pgsql-hackers |
Bruce Momjian wrote:
>
> Since the removal of exec(), Thomas has seen, and I have confirmed that
> if a backend crashes, and the postmaster must reset the shared memory,
> no backends can connect anymore. One way to reproduce it is to run the
> regression tests, which on their last test will crash for an un-related
> reason. However, it will not allow you to restart any more backends.
>
> The error it gets is:
>
> Failed Assertion("!((((unsigned long)nextElem) > ShmemBase)):", File: "shmqueue.c", Line: 83)
>
> !((((unsigned long)nextElem) > ShmemBase)) (0) [No such file or directory]
>
> In this case nextElem = ShmemBase, so it is not greater. Removing the
> Assert() still does not make things work, so there must be something
> else.
>
> Now, the problem is probably not at that exact spot, but somewhere
> deeper. There are two differences between the old non-exec() behavior
> and new behavior. In the old setup, the backend had all its global
> variables initialized, while in the new no-exec case, they take the
> global variable values from the postmaster. Second, the old setup had
> each backend attaching to the shared memory, while the new setup has
> them inheriting the shared memory from the fork().

Bruce,
I have not looked into the specifics yet, but I suggest looking
into what is done when the child process exits.
This (the pg_exit() et al.) caused some bugs when we introduced
Unix domain sockets, and it is not the first place one looks. :-(

regards,
--
---------------------------------------------
Göran Thyni, sysadm, JMS Bildbasen, Kiruna
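
To illustrate the kind of exit-time behavior worth checking, here is a minimal sketch in plain POSIX C. It is not PostgreSQL code; `cleanup_hook` and `child_exit_report` are hypothetical names made up for this example. It only demonstrates a general fact: a child created with fork() and no exec() inherits the parent's atexit() registrations, so a handler installed by the parent runs again when the child calls exit(), while _exit() skips it.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int   report_fd = -1;  /* write end of a pipe, set before fork() */
static pid_t owner_pid;       /* process that registered the handler */
static int   registered = 0;  /* register the handler only once */

/* Hypothetical cleanup hook, standing in for a shared-resource teardown. */
static void cleanup_hook(void)
{
    if (report_fd >= 0) {
        /* 'P' if run by the registering process, 'C' if run by a child */
        char who = (getpid() == owner_pid) ? 'P' : 'C';
        if (write(report_fd, &who, 1) != 1) {
            /* nothing useful to do at exit time */
        }
    }
}

/*
 * Forks a child that terminates immediately.
 * skip_handlers == 0: child calls exit(), which runs inherited handlers.
 * skip_handlers != 0: child calls _exit(), which does not.
 * Returns 'C' if the child ran the parent's handler, 0 if it did not,
 * and -1 on error.
 */
int child_exit_report(int skip_handlers)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    owner_pid = getpid();
    report_fd = fds[1];
    if (!registered) {
        if (atexit(cleanup_hook) != 0)
            return -1;
        registered = 1;
    }

    fflush(NULL);  /* keep buffered stdio output from being duplicated */
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {  /* child: no exec(), so it shares the parent's setup */
        close(fds[0]);
        if (skip_handlers)
            _exit(0);
        exit(0);
    }

    /* parent */
    close(fds[1]);
    report_fd = -1;  /* the parent's own exit should stay silent */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        close(fds[0]);
        return -1;
    }

    char who = 0;
    ssize_t n = read(fds[0], &who, 1);  /* EOF if the child wrote nothing */
    close(fds[0]);
    return (n == 1) ? who : 0;
}
```

Calling `child_exit_report(0)` from a small main() shows the inherited handler firing in the child; `child_exit_report(1)` shows that _exit() bypasses it. Whether something similar is happening in the backend exit path here is only a guess and would need to be checked against the actual code.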