Problem after removal of exec(), help

From Bruce Momjian
Subject Problem after removal of exec(), help
Msg-id 199806221445.KAA13553@candle.pha.pa.us
Replies Re: [HACKERS] Problem after removal of exec(), help  (dg@illustra.com (David Gould))
Re: [HACKERS] Problem after removal of exec(), help  (Bruce Momjian <maillist@candle.pha.pa.us>)
List pgsql-hackers
Since the removal of exec(), Thomas has seen, and I have confirmed, that
if a backend crashes and the postmaster must reset the shared memory,
no backends can connect anymore.  One way to reproduce it is to run the
regression tests, which crash on their last test for an unrelated
reason.  After that crash, the postmaster cannot start any more backends.

The error it gets is:

Failed Assertion("!((((unsigned long)nextElem) > ShmemBase)):", File: "shmqueue.c", Line: 83)
!((((unsigned long)nextElem) > ShmemBase)) (0) [No such file or directory]

In this case nextElem = ShmemBase, so it is not greater.  Removing the
Assert() still does not make things work, so there must be something
else.

Now, the problem is probably not at that exact spot, but somewhere
deeper.  There are two differences between the old exec() behavior
and the new no-exec behavior.  First, in the old setup each backend had
all its global variables freshly initialized, while in the new no-exec
case the backend inherits the global variable values from the postmaster.
Second, the old setup had each backend attaching to the shared memory,
while the new setup has them inheriting the shared memory from the fork().
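
To make the first difference concrete, here is a minimal sketch of what a fork() without exec() does to global state. The variable demo_state and the function value_seen_by_forked_child() are made up for illustration and are not PostgreSQL code; the child simply reports the value it sees through its exit status.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a postmaster global; not a real PostgreSQL variable. */
static int demo_state = 0;

/*
 * Set the global in the parent, fork() without exec(), and return the value
 * the child observed.  The value travels back in the child's exit status,
 * so it must fit in 0..255.  Returns -1 if fork() or waitpid() fails.
 */
int
value_seen_by_forked_child(int parent_value)
{
    pid_t pid;
    int   status;

    assert(parent_value >= 0 && parent_value <= 255);

    demo_state = parent_value;  /* parent changes state before forking */
    fflush(NULL);               /* do not duplicate pending stdio output */

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(demo_state);      /* child: inherited copy, never re-initialized */

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling value_seen_by_forked_child(42) returns 42: the child starts with whatever the parent last stored, not with the initializer 0.  With an exec() after the fork(), the new program image would have started from its own initializers instead.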

My guess is that the reset code in postmaster.c has a bug and does not
reset everything completely, and that either the initialization of the
global variables in the backend was masking the bug, or the attach()
operation did some extra work that we now need to do when resetting
the shared memory:

    static void
    reset_shared(short port)
    {
        ipc_key = port * 1000 + shmem_seq * 100;
        CreateSharedMemoryAndSemaphores(ipc_key);
        ActiveBackends = FALSE;
        shmem_seq += 1;
        if (shmem_seq >= 10)
            shmem_seq -= 10;
    }
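
For reference, the key arithmetic in reset_shared() can be worked through in isolation.  This is only a sketch: demo_ipc_key() and demo_next_seq() are hypothetical helpers that copy the two expressions above, the long return type is an assumption, and port 5432 is just an example value.

```c
#include <assert.h>

/* Same expression as in reset_shared(): key for a port and sequence number. */
long
demo_ipc_key(short port, int shmem_seq)
{
    assert(shmem_seq >= 0 && shmem_seq < 10);
    return (long) port * 1000 + shmem_seq * 100;
}

/* Sequence number after one reset: 0, 1, ..., 9, then back to 0. */
int
demo_next_seq(int shmem_seq)
{
    shmem_seq += 1;
    if (shmem_seq >= 10)
        shmem_seq -= 10;
    return shmem_seq;
}
```

With port 5432 the first call uses key 5432000, the reset after a crash uses 5432100, and so on up to 5432900, after which the sequence wraps and 5432000 is used again.  So each reset creates the shared memory and semaphores under a new key rather than the one the crashed backends were using.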


I am stumped on this.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)
