Re: [HACKERS] memory layouts for binary search in nbtree
От | Peter Geoghegan |
---|---|
Тема | Re: [HACKERS] memory layouts for binary search in nbtree |
Дата | |
Msg-id | CAH2-WzmW9LjTfzp7uY4CHsF3NH0YM-_pow3LXRh18ORLgeu48A@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: memory layouts for binary search in nbtree (Peter Geoghegan <pg@heroku.com>) |
Ответы |
Re: [HACKERS] memory layouts for binary search in nbtree
|
Список | pgsql-hackers |
On Thu, May 19, 2016 at 7:28 PM, Peter Geoghegan <pg@heroku.com> wrote: > Abbreviated keys in indexes are supposed to help with this. Basically, > the ItemId array is made to be interlaced with small abbreviated keys > (say one or two bytes), only in the typically less than 1% of pages > that are internal (leaf page binary searches don't change). I looked at this again recently. I wrote a patch to prove to myself that we can fairly easily reclaim 15 bits from every nbtree internal page ItemId, and put an abbreviated key there instead. The patch has nbtree tell PageAdditem() to store an abbreviated key within lp_len (actually, we call PageAddItemAbbrev() now). I didn't realize that stealing lp_len might be feasible until recently, but it appears to be -- this is a lot simpler than other approaches might be. I already implemented a rudimentary, POC encoding scheme for int4, but text is the datatype that I'd expect to see a real benefit for. While this POC patch of mine could easily have bugs, it is at least true that "make check-world" passes for me. For nbtree, reclaiming the lp_len space was just a matter of teaching a small number of existing places to go to the IndexTuple to get a tuple's length, rather than trusting an ItemId's lp_len field (that is, rather than calling ItemIdGetLength()). Most nbtree related code already gets the length from the index tuple header today, since it's pretty rare to care about the length of a tuple but not its contents. This did require updating some generic routines within bufpage.c, too, but that wasn't so bad. Can you suggest a workload/hardware to assess the benefits of an optimization like this, Andres? Is there a profile available somewhere in the archives that shows many cache misses within _bt_binsrch()? -- Peter Geoghegan
В списке pgsql-hackers по дате отправления: