Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting in parallel query
From | Thomas Munro |
---|---|
Subject | Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting in parallel query |
Date | |
Msg-id | CAEepm=3ynb5nBhKQRts0bNETA1HzNxz6-3RTPOzCbM8oQ9yPdg@mail.gmail.com |
In reply to | Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting in parallel query (Sergei Kornilov <sk@zsrv.org>) |
Responses | Re: BUG #15585: infinite DynamicSharedMemoryControlLock waiting in parallel query |
List | pgsql-bugs |
On Thu, Jan 24, 2019 at 11:56 PM Sergei Kornilov <sk@zsrv.org> wrote:
> We should not call dsm_backend_shutdown twice in same process, right? So we tried call dsm_detach on same segment 0x5624578710c8 twice, but this is unexpected behavior and refcnt would be incorrect. And seems we can not LWLockAcquire lock and then LWLockAcquire same lock again without release. And here we have infinite waiting.

Yeah, I think your analysis is right. It shouldn't raise an error while holding the lock. dsm_unpin_segment() should perhaps release it before it raises an error, something like:

diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index 36904d2676..b989c0b94a 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -924,9 +924,15 @@ dsm_unpin_segment(dsm_handle handle)
 	 * called on a segment which is pinned.
 	 */
 	if (control_slot == INVALID_CONTROL_SLOT)
+	{
+		LWLockRelease(DynamicSharedMemoryControlLock);
 		elog(ERROR, "cannot unpin unknown segment handle");
+	}
 	if (!dsm_control->item[control_slot].pinned)
+	{
+		LWLockRelease(DynamicSharedMemoryControlLock);
 		elog(ERROR, "cannot unpin a segment that is not pinned");
+	}
 	Assert(dsm_control->item[control_slot].refcnt > 1);

 	/*

I have contemplated that before, but not done it, because I'm not sure about the state of the system after that. We just shouldn't be in this situation: if we are, it means that we can error out while later segments (in the array that dsa_release_in_place() loops through) remain pinned forever, so we'll leak memory and run out of DSM slots. Segment pinning is opting out of resource owner control, which means the client code is responsible for not screwing it up. Perhaps that suggests we should PANIC, or perhaps just LOG and continue, but I'm not sure.

I think the root cause is earlier and in a different process (see ProcessInterrupts() in the stack). Presumably the one that reported "dsa_area could not attach to segment" is closer to the point where things go wrong.
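To make the locking problem discussed above concrete, here is a minimal, self-contained sketch. It is not PostgreSQL code: ToyLock, toy_acquire(), toy_release() and the other names are hypothetical stand-ins, the simulated elog(ERROR) is a plain longjmp(), and the toy lock returns false at the point where a real non-reentrant lock would wait forever. It only illustrates why leaving a function through an error exit with the lock still held can block later cleanup that needs the same lock, and how releasing first avoids that.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy non-reentrant lock. Hypothetical stand-in, not PostgreSQL's LWLock. */
typedef struct
{
	bool		held;
} ToyLock;

static ToyLock control_lock;
static jmp_buf error_exit;		/* target of the simulated elog(ERROR) */

/* Returns false where a real non-reentrant lock would wait forever. */
static bool
toy_acquire(ToyLock *lock)
{
	if (lock->held)
		return false;
	lock->held = true;
	return true;
}

static void
toy_release(ToyLock *lock)
{
	lock->held = false;
}

/* Simulated unpin of a segment that turns out not to be pinned. */
static void
unpin_unpinned_segment(bool release_before_error)
{
	toy_acquire(&control_lock);
	if (release_before_error)
		toy_release(&control_lock);
	longjmp(error_exit, 1);		/* simulated elog(ERROR): never returns */
}

/* Simulated cleanup that runs after the error and needs the same lock. */
static bool
cleanup_can_take_lock(void)
{
	if (!toy_acquire(&control_lock))
		return false;
	toy_release(&control_lock);
	return true;
}

/* Returns true if cleanup after the failed unpin could take the lock. */
static bool
run_failing_unpin(bool release_before_error)
{
	control_lock.held = false;
	if (setjmp(error_exit) == 0)
		unpin_unpinned_segment(release_before_error);
	/* Reached only via longjmp, because the call above never returns. */
	return cleanup_can_take_lock();
}

#ifdef TOY_LOCK_DEMO_MAIN
int
main(void)
{
	printf("error raised with lock held:  cleanup ok = %d\n",
		   run_failing_unpin(false));
	printf("lock released before error:   cleanup ok = %d\n",
		   run_failing_unpin(true));
	return 0;
}
#endif
```

Compiling with -DTOY_LOCK_DEMO_MAIN gives a small program that runs both variants. The sketch says nothing about whether the shared state is still sane after such an error, which is the open question raised in the message.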
If you are in a position to reproduce this on a modified source tree, it'd be good to see the back trace for that, to figure out which of a couple of possible code paths reach it. Perhaps you could do that by enabling core files and changing this:

-		elog(ERROR, "dsa_area could not attach to segment");
+		elog(PANIC, "dsa_area could not attach to segment");

I have so far not succeeded in reaching that condition.

-- 
Thomas Munro
http://www.enterprisedb.com
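On the "enabling core files" step: elog(PANIC) ends in abort(), and a core file is only written if the process's core size limit allows it. The following is a minimal sketch, assuming a POSIX system, of lifting the soft core limit to the hard limit from inside a process. lift_core_soft_limit() is a hypothetical helper name, not a PostgreSQL function. In practice the same effect is usually obtained with `ulimit -c unlimited` in the shell that starts the server, or with pg_ctl's -c (--core-files) option.

```c
#include <stdio.h>
#include <sys/resource.h>

/*
 * Raise the soft core-file size limit to the hard limit for this process;
 * child processes started afterwards inherit it.
 * Returns 0 on success, -1 on failure (errno is set by the failing call).
 */
static int
lift_core_soft_limit(void)
{
	struct rlimit lim;

	if (getrlimit(RLIMIT_CORE, &lim) != 0)
		return -1;
	lim.rlim_cur = lim.rlim_max;	/* soft limit may always go up to hard */
	return setrlimit(RLIMIT_CORE, &lim);
}
```

If the hard limit itself is 0, this succeeds but core files stay disabled. Where the core file ends up depends on operating system settings.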