Re: Bug with GIN index over JSONB data: "ERROR: buffer 10112 is not owned by resource owner"
From | Tom Lane |
---|---|
Subject | Re: Bug with GIN index over JSONB data: "ERROR: buffer 10112 is not owned by resource owner" |
Date | |
Msg-id | 1877494.1699843347@sss.pgh.pa.us |
In response to | Re: Bug with GIN index over JSONB data: "ERROR: buffer 10112 is not owned by resource owner" (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-bugs |
I wrote:
> Jeff Janes <jeff.janes@gmail.com> writes:
>> I was looking into a possible scalability problem with GIN indexes under
>> concurrent insert, but instead I found an uncharacterized bug. One of the
>> processes will occasionally throw an error "ERROR: buffer 10112 is not
>> owned by resource owner Portal" where the buffer number changes from run to
>> run.

> I am able to reproduce this in HEAD (8bfb231b4) on a not-that-big
> machine (M2 Mac Mini):

I have tracked down the cause of this: there is one code path in
ginFindParents() that doesn't take the function's own advice to
not release pin on the index's root page. The attached seems to
fix it. AFAICS the bug goes clear back to the origin of GIN.

> However, it seems like there might be more than one bug. My first
> attempt died like this after a few minutes:
> TRAP: failed Assert("ref != NULL"), File: "bufmgr.c", Line: 2447, PID: 7177
> ...
> That was while running a build of commit 9ba9c7074 from 25 October.
> After updating to current HEAD (8bfb231b4), I've not yet reproduced it.

Nothing to see there: that is the same bug. The change in behavior
is accounted for by the intervening commit b8bff07da, which re-ordered
operations in bufmgr.c so that the lack of a resource manager entry
would be noticed before hitting this Assert.

I find it quite scary that, per the code coverage report,
ginFindParents() isn't reached at all during our regression tests.
And there are several other areas in ginbtree.c that aren't reached
either. Even while running Jeff's test case, I can find no evidence
that the freestack == false path in ginFinishSplit() is ever reached.
There could be a pile of resource-mismanagement bugs similar to this
one in there, and we'd never know it.

Reaching this particular error requires a concurrent split of the
index's root page, so it's surely not that easy to trigger. I wonder
if there's anything we could do to make such cases more testable.
			regards, tom lane

diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index fc694b40f1..2dd3853e70 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -272,7 +272,11 @@ ginFindParents(GinBtree btree, GinBtreeStack *stack)
 		blkno = GinPageGetOpaque(page)->rightlink;
 		if (blkno == InvalidBlockNumber)
 		{
-			UnlockReleaseBuffer(buffer);
+			/* Do not release pin on the root buffer */
+			if (buffer != root->buffer)
+				UnlockReleaseBuffer(buffer);
+			else
+				LockBuffer(buffer, GIN_UNLOCK);
 			break;
 		}
 		buffer = ginStepRight(buffer, btree->index, GIN_EXCLUSIVE);
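
For readers unfamiliar with pin accounting, the failure mode in the patch above can be illustrated with a small standalone model. This is only a sketch and not PostgreSQL code: ToyBuffer, toy_unlock, toy_release, toy_unlock_release and run_scenario are hypothetical names, and the model assumes the caller holds exactly one long-lived pin on the root that the search must leave alone.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a buffer: tracks only our pins and lock state. */
typedef struct ToyBuffer
{
	int		pins;		/* pins this backend holds on the buffer */
	bool	locked;		/* true while we hold the content lock */
} ToyBuffer;

/* Drop the content lock but keep the pin. */
static void
toy_unlock(ToyBuffer *buf)
{
	assert(buf->locked);
	buf->locked = false;
}

/*
 * Drop one pin.  Returns false when there is no pin left to drop, which
 * is the toy analogue of "buffer N is not owned by resource owner".
 */
static bool
toy_release(ToyBuffer *buf)
{
	if (buf->pins <= 0)
		return false;
	buf->pins--;
	return true;
}

/* Drop both the lock and one pin. */
static bool
toy_unlock_release(ToyBuffer *buf)
{
	toy_unlock(buf);
	return toy_release(buf);
}

/*
 * Model of reaching the end of the rightlink chain while still sitting on
 * the root page.  The caller owns the only pin on the root.  With
 * use_fix == false the search drops that pin too, so the caller's own
 * release later fails.  Returns whether the caller's release succeeded.
 */
static bool
run_scenario(bool use_fix)
{
	ToyBuffer	root = {1, true};	/* caller's pin; search holds the lock */
	ToyBuffer  *buffer = &root;		/* search never moved off the root */

	if (use_fix && buffer == &root)
		toy_unlock(buffer);			/* keep the caller's pin */
	else
		toy_unlock_release(buffer);	/* unpatched path: pin is lost */

	return toy_release(&root);		/* caller drops its pin at cleanup */
}
```

In this model run_scenario(true) returns true and run_scenario(false) returns false: the unpatched path gives up a pin it does not own, and the error only surfaces later, when the real owner tries to release it.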