On Tue, Apr 10, 2018 at 1:37 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> _bt_mark_page_halfdead() looked like it had a problem, but it now
> looks like I was wrong.
I did find another problem, though. It looks like the idea of explicitly
representing the number of attributes has paid off already:
pg@~[3711]=# create table covering_bug (f1 int, f2 int, f3 text);
create unique index cov_idx on covering_bug (f1) include(f2);
insert into covering_bug select i, i * random() * 1000, i * random() * 100000
from generate_series(0,100000) i;
DEBUG: building index "pg_toast_16451_index" on table "pg_toast_16451" serially
CREATE TABLE
DEBUG: building index "cov_idx" on table "covering_bug" serially
CREATE INDEX
ERROR: tuple has wrong number of attributes in index "cov_idx"
Note that amcheck can detect the issue with the index after the fact, too:
pg@~[3711]=# select bt_index_check('cov_idx');
ERROR: wrong number of index tuple attributes for index "cov_idx"
DETAIL: Index tid=(3,2) natts=2 points to index tid=(2,92) page lsn=0/170DC88.
I don't think that the issue is complicated. It looks like we missed a
place within _bt_split() where we have to truncate, located directly
after this comment block:
/*
 * If the page we're splitting is not the rightmost page at its level in
 * the tree, then the first entry on the page is the high key for the
 * page. We need to copy that to the right half. Otherwise (meaning the
 * rightmost page case), all the items on the right half will be user
 * data.
 */
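To make the invariant concrete, here is a toy, self-contained sketch of what
"truncate" means in this context. It is not nbtree code: ToyTuple,
toy_truncate() and toy_high_key_ok() are hypothetical names made up for
illustration, and the real tuple layout is very different. The only point is
that a high key should carry just the key attributes (1 for cov_idx), not
key plus INCLUDE attributes (2 for cov_idx).

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_ATTS 8

/* Toy stand-in for an index tuple; NOT PostgreSQL's IndexTuple. */
typedef struct ToyTuple
{
    int natts;               /* number of attributes actually stored */
    int atts[TOY_MAX_ATTS];  /* attribute values, key columns first */
} ToyTuple;

/*
 * Return a copy of src that keeps only the first nkeyatts attributes.
 * This models dropping the INCLUDE columns from a tuple that is about
 * to be used as a high key.
 */
static ToyTuple
toy_truncate(const ToyTuple *src, int nkeyatts)
{
    ToyTuple dst;

    assert(src->natts <= TOY_MAX_ATTS);
    assert(nkeyatts >= 0 && nkeyatts <= src->natts);

    memset(&dst, 0, sizeof(dst));
    dst.natts = nkeyatts;
    memcpy(dst.atts, src->atts, sizeof(int) * (size_t) nkeyatts);
    return dst;
}

/*
 * Toy version of the sanity check: a high key is only acceptable if it
 * stores exactly the key attributes.
 */
static int
toy_high_key_ok(const ToyTuple *highkey, int nkeyatts)
{
    return highkey->natts == nkeyatts;
}
```

In this model, a tuple copied to the right half without going through
toy_truncate() would fail toy_high_key_ok(), which is the same kind of
mismatch that the "wrong number of attributes" errors above report.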
I believe that the reason that we didn't find this bug prior to commit
is that we only have a single index tuple with the wrong number of
attributes after an initial root page split through insertions, but
the next root page split masks the problem. I'm not yet 100% sure that
that's why we missed it, though.
This bug shouldn't be hard to fix. I'll take care of it as part of
that post-commit review patch I'm working on.
--
Peter Geoghegan