Re: Failure while inserting parent tuple to B-tree is not fun
| От | Heikki Linnakangas |
|---|---|
| Тема | Re: Failure while inserting parent tuple to B-tree is not fun |
| Дата | |
| Msg-id | 5285071B.1040100@vmware.com обсуждение исходный текст |
| Ответ на | Re: Failure while inserting parent tuple to B-tree is not fun (Heikki Linnakangas <hlinnakangas@vmware.com>) |
| Ответы |
Re: Failure while inserting parent tuple to B-tree is not fun
|
| Список | pgsql-hackers |
On 05.11.2013 15:07, Heikki Linnakangas wrote: > On 25.10.2013 22:13, Heikki Linnakangas wrote: >> On 22.10.2013 19:55, Heikki Linnakangas wrote: >>> I fixed the the same problem in GiST a few years back, by making it >>> tolerate missing downlinks, and inserting them lazily. The B-tree code >>> tolerates them already on scans, but gets confused on insertion, as seen >>> above. I propose that we use the same approach I used with GiST, and add >>> a flag to the page header to indicate "the downlink hasn't been inserted >>> yet". When insertion (or vacuum) bumps into a flagged page, it can >>> finish the incomplete action by inserting the downlink. >> >> This is what I came up with. >> >> One thing I'm not totally happy about is the way page deletions of >> incompletely split pages are handled. Basically, it just bails out and >> refuses to delete a page that is part of an incomplete split. That's >> probably OK in practice, as incomplete splits should be very rare >> anyway, but it's a bit dissatisfying to not handle the case because at >> first glance it seems like it should be even simpler than usual to >> delete a page that has no downlink. Nevertheless, I decided to just skip >> that for now. >> >> After this patch, deleting the only child of a parent and the parent >> itself is still a multi-WAL-record operation that needs to be tracked >> during recovery, and completed at the end of recovery. I'd like to >> eliminate that too, but that's another patch. > > Here's a new version of this, which uses a similar technique to handle > page deletions, eliminating the "incomplete action" tracking code > altogether (from btree). When an internal page is marked as half-dead, > its right sibling is atomically marked with a > "left-sibling-is-half-dead" flag. Whenever an insertion encounters a > page with that flag set, it will finish the deletion of the left sibling > before proceeding with the insertion. > > This needs a lot more testing, but I wanted to get this out for review, > in case someone sees a fundamental problem with this. Ok, here's a new version of the patch to handle incomplete B-tree splits. I rejected the scheme I outlined above about handling half-dead pages - instead, see the patch in the "Race condition in b-tree page deletion" thread: http://www.postgresql.org/message-id/5283A2FC.6040606@vmware.com. That approach to handling half-dead pages makes the incomplete-split stuff a lot easier, as half-dead pages don't need any special treatment wrt. splits- This patch should be after that patch. - Heikki
Вложения
В списке pgsql-hackers по дате отправления: