Re: REL_11_STABLE: dsm.c - cannot unpin a segment that is not pinned
From | Thomas Munro |
---|---|
Subject | Re: REL_11_STABLE: dsm.c - cannot unpin a segment that is not pinned |
Date | |
Msg-id | CAEepm=0t8o=Wh4wi0H58q3G1dqoj6ZYU-zu9DMp29RkVMGSvNw@mail.gmail.com |
In response to | REL_11_STABLE: dsm.c - cannot unpin a segment that is not pinned (Justin Pryzby <pryzby@telsasoft.com>) |
List | pgsql-hackers |
On Sat, Feb 16, 2019 at 3:38 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
> I saw this error once last week while stress testing to reproduce earlier bugs,
> but tentatively thought it was a downstream symptom of those bugs (since
> fixed), and now wanted to check that #15585 and others were no longer
> reproducible. Unfortunately I got this error while running same test case [2]
> as for previous bug ('could not attach').
>
> 2019-02-14 23:40:41.611 MST [32287] ERROR: cannot unpin a segment that is not pinned
>
> On commit faf132449c0cafd31fe9f14bbf29ca0318a89058 (REL_11_STABLE including
> both of last week's post-11.2 DSA patches), I reproduced twice, once within
> ~2.5 hours, once within 30min.
>
> I'm not able to reproduce on master running overnight and now 16+hours.

Oh, I think I know why: dsm_unpin_segment() contains another variant
of the race fixed by 6c0fb941. That commit was for dsm_attach() being
confused by segments with the same handle that are concurrently going
away, but dsm_unpin_segment() does a handle lookup too, so it can be
confused by the same phenomenon. Untested, but the fix is probably:

diff --git a/src/backend/storage/ipc/dsm.c b/src/backend/storage/ipc/dsm.c
index cfbebeb31d..23ccc59f13 100644
--- a/src/backend/storage/ipc/dsm.c
+++ b/src/backend/storage/ipc/dsm.c
@@ -844,8 +844,8 @@ dsm_unpin_segment(dsm_handle handle)
        LWLockAcquire(DynamicSharedMemoryControlLock, LW_EXCLUSIVE);
        for (i = 0; i < dsm_control->nitems; ++i)
        {
-               /* Skip unused slots. */
-               if (dsm_control->item[i].refcnt == 0)
+               /* Skip unused slots and segments that are concurrently going away. */
+               if (dsm_control->item[i].refcnt <= 1)
                        continue;

                /* If we've found our handle, we can stop searching. */

-- 
Thomas Munro
http://www.enterprisedb.com
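
For readers who want to see why the changed comparison matters, here is a minimal standalone sketch of the lookup, not the actual dsm.c code. It assumes the control-slot convention that refcnt 0 means unused, 1 means the segment is being destroyed, and 2 or more means live. The sketch_* names and the simplified struct are hypothetical, and the shared-memory and locking details are omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for one control slot. */
typedef struct
{
	uint32_t	handle;		/* segment handle; may be reused */
	uint32_t	refcnt;		/* assumed: 0 unused, 1 going away, 2+ live */
	int			pinned;		/* nonzero if the segment is pinned */
} sketch_item;

/* Old behaviour: skip only unused slots, so a dying slot can match. */
static int
sketch_find_old(const sketch_item *items, size_t nitems, uint32_t handle)
{
	for (size_t i = 0; i < nitems; ++i)
	{
		if (items[i].refcnt == 0)
			continue;
		if (items[i].handle == handle)
			return (int) i;
	}
	return -1;
}

/* Proposed behaviour: also skip slots that are concurrently going away. */
static int
sketch_find_live(const sketch_item *items, size_t nitems, uint32_t handle)
{
	for (size_t i = 0; i < nitems; ++i)
	{
		if (items[i].refcnt <= 1)
			continue;
		if (items[i].handle == handle)
			return (int) i;
	}
	return -1;
}

/* Two slots share handle 42: slot 0 is dying, slot 1 is live and pinned. */
void
sketch_demo(void)
{
	sketch_item items[2] = {
		{42, 1, 0},			/* old segment, being destroyed, not pinned */
		{42, 2, 1}			/* new segment with the same handle, pinned */
	};

	/* The old lookup stops at the dying slot, which is not pinned. */
	assert(sketch_find_old(items, 2, 42) == 0);
	assert(!items[sketch_find_old(items, 2, 42)].pinned);

	/* The stricter lookup reaches the live, pinned slot. */
	assert(sketch_find_live(items, 2, 42) == 1);
	assert(items[sketch_find_live(items, 2, 42)].pinned);
}
```

In this sketch the old lookup lands on the unpinned dying slot, which is the situation that would produce "cannot unpin a segment that is not pinned", while the stricter comparison skips it.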