Re: Failure while inserting parent tuple to B-tree is not fun
От | Heikki Linnakangas |
---|---|
Тема | Re: Failure while inserting parent tuple to B-tree is not fun |
Дата | |
Msg-id | 52EBD8AC.8070609@vmware.com обсуждение исходный текст |
Ответ на | Re: Failure while inserting parent tuple to B-tree is not fun (Peter Geoghegan <pg@heroku.com>) |
Ответы |
Re: Failure while inserting parent tuple to B-tree is not fun
|
Список | pgsql-hackers |
On 01/30/2014 12:46 AM, Peter Geoghegan wrote: > On Mon, Jan 27, 2014 at 10:54 AM, Peter Geoghegan <pg@heroku.com> wrote: >> On Mon, Jan 27, 2014 at 10:27 AM, Heikki Linnakangas >> <hlinnakangas@vmware.com> wrote: >>>> I think I see some bugs in _bt_moveright(). If you examine >>>> _bt_finish_split() in detail, you'll see that it doesn't just drop the >>>> write buffer lock that the caller will have provided (per its >>>> comments) - it also drops the buffer pin. It calls _bt_insert_parent() >>>> last, which was previously only called last thing by _bt_insertonpg() >>>> (some of the time), and _bt_insertonpg() is indeed documented to drop >>>> both the lock and the pin. And if you look at _bt_moveright() again, >>>> you'll see that there is a tacit assumption that the buffer pin isn't >>>> dropped, or at least that "opaque" (the BTPageOpaque BT special page >>>> area pointer) continues to point to the same page held in the same >>>> buffer, even though the code doesn't set the "opaque' pointer again >>>> and it may not point to a pinned buffer or even the appropriate >>>> buffer. Ditto "page". So "opaque" may point to the wrong thing on >>>> subsequent iterations - you don't do anything with the return value of >>>> _bt_getbuf(), unlike all of the existing call sites. I believe you >>>> should store its return value, and get the held page and the special >>>> pointer on the page, and assign that to the "opaque" pointer before >>>> the next iteration (an iteration in which you very probably really do >>>> make progress not limited to completing a page split, the occurrence >>>> of which the _bt_moveright() loop gets "stuck on"). You know, do what >>>> is done in the non-split-handling case. It may not always be the same >>>> buffer as before, obviously. >>> >>> Yep, fixed. >> >> Can you explain what the fix was, please? > > Ping? I would like to hear some details here. I refactored the loop in _bt_moveright to, well, not have that bug anymore. The 'page' and 'opaque' pointers are now fetched at the beginning of the loop. Did I miss something? - Heikki
В списке pgsql-hackers по дате отправления: