Re: [HACKERS] backend freezeing on win32 fixed (I hope ;-) )
From | Tom Lane |
---|---|
Subject | Re: [HACKERS] backend freezeing on win32 fixed (I hope ;-) ) |
Date | |
Msg-id | 21783.934911055@sss.pgh.pa.us |
In response to | Re: [HACKERS] backend freezeing on win32 fixed (I hope ;-) ) (Bruce Momjian <maillist@candle.pha.pa.us>) |
Responses | Re: [HACKERS] backend freezeing on win32 fixed (I hope ;-) ) |
List | pgsql-hackers |
Bruce Momjian <maillist@candle.pha.pa.us> writes:
>> storage/ipc/ipc.c ). Why it is, I don't know, but it seems that my solution
>> uses the ipc library in the right way. There are no longer any error
>> messages from the ipc library when running the server. And I can't say that
>> the ipc library is a 100% correct implementation of SysV IPC, it is probably
>> (sure ;-) )caused by the Windows internals.

> Seems we may have to use the patch, or make some other patch for NT-only
> that works around this NT bug.

I don't have a problem with installing an NT patch (lord knows there are plenty of #ifdef __CYGWIN32__'s in the code already). But I have a problem with *this* patch because I don't believe we understand what it is doing, and therefore I have no confidence in it.

The extent of our understanding so far is that one backend can create a semaphore that can be used by a later backend, but the postmaster cannot create a semaphore that can be used by a later backend. I don't really believe that; I think there is something else going on. Until we understand what the something else is, I don't think we have a trustworthy solution.

The real reason I feel itchy about this is that I know that interprocess synchronization is a very tricky area, so I'm not confident that the limited amount of testing Dan can do by himself proves that things are solid. As the old saw goes, "testing cannot prove the absence of bugs". I want to have both clean test results *and* an understanding of what we are doing before I will feel comfortable.

Looking again at the code, it occurs to me that a backend exiting normally will probably leave its semaphore set nonzero, which could (given a buggy IPC library) have something to do with whether another process can attach to the sema or not. The postmaster code is *trying* to create the semas with nonzero starting values, but I see that the backend code takes the additional step of doing

	semun.val = IpcSemaphoreDefaultStartValue;
	semctl(semId, semNum, SETVAL, semun);

whereas the postmaster code doesn't. Maybe the create call isn't initializing the semaphores the way it's told to? It'd be worth trying adding a step like this to the postmaster preallocation.

In any case, I'd really like us to get some feedback from the author of cygipc about this issue. I don't mind working around a bug once we understand exactly what the bug is --- but in this particular area, I think guessing our way to a workaround isn't good enough.

			regards, tom lane
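
For readers who want to try the suggested experiment, here is a minimal sketch of an explicit per-semaphore SETVAL after creation, followed by a GETVAL read-back to see whether the values actually took. It is not the code from storage/ipc/ipc.c. The names demo_create_sem_set, demo_check_sem_set, demo_remove_sem_set and union demo_semun are hypothetical, and DEMO_SEM_START_VALUE is only an assumed stand-in for IpcSemaphoreDefaultStartValue, whose real value is not quoted in this message. It uses only the standard SysV calls semget() and semctl().

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Assumed stand-in for IpcSemaphoreDefaultStartValue (real value not shown here). */
#define DEMO_SEM_START_VALUE 1

/* Many platforms leave the semctl() argument union for the caller to define;
 * a private name avoids clashing with systems that do declare union semun. */
union demo_semun {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Create a private set of nsems semaphores, then explicitly set each one to
 * start_value with SETVAL instead of trusting the creation call to do it.
 * Returns the semaphore set id, or -1 on failure. */
int demo_create_sem_set(int nsems, int start_value)
{
    int semId = semget(IPC_PRIVATE, nsems, IPC_CREAT | 0600);
    if (semId < 0)
        return -1;

    for (int semNum = 0; semNum < nsems; semNum++) {
        union demo_semun arg;
        arg.val = start_value;
        if (semctl(semId, semNum, SETVAL, arg) < 0) {
            semctl(semId, 0, IPC_RMID);   /* do not leak the set on failure */
            return -1;
        }
    }
    return semId;
}

/* Read every semaphore back with GETVAL.
 * Returns 1 if all hold expected, 0 if any differs, -1 on error. */
int demo_check_sem_set(int semId, int nsems, int expected)
{
    for (int semNum = 0; semNum < nsems; semNum++) {
        int value = semctl(semId, semNum, GETVAL);
        if (value < 0)
            return -1;
        if (value != expected)
            return 0;
    }
    return 1;
}

/* Remove the whole set. Returns 0 on success, -1 on error. */
int demo_remove_sem_set(int semId)
{
    return semctl(semId, 0, IPC_RMID);
}
```

Calling demo_check_sem_set() from a different process than the one that created the set would be the closer analogue of the postmaster-versus-backend situation discussed above; the sketch only shows the calls themselves.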