Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
From | Peter Geoghegan |
---|---|
Subject | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. |
Date | |
Msg-id | CAH2-Wzm=9TnAFGCDfvsBVC5zYonQqeLMmYpnx=xZ3nyXOeHjNA@mail.gmail.com |
In response to | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. (Peter Geoghegan <pg@bowt.ie>) |
Responses | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. |
List | pgsql-hackers |
On Mon, Sep 2, 2019 at 6:53 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Attached is v10, which fixes the Valgrind issue.

Attached is v11, which makes the kill_prior_tuple optimization work with posting list tuples. The only catch is that it can only work when all "logical tuples" within a posting list are known-dead, since of course there is only one LP_DEAD bit available for each posting list.

The hardest part of this kill_prior_tuple work was writing the new _bt_killitems() code, which I'm still not 100% happy with. Still, it seems to work well -- new pageinspect LP_DEAD status info was added to the second patch to verify that we're setting LP_DEAD bits as needed for posting list tuples. I also had to add a new nbtree-specific, posting-list-aware version of index_compute_xid_horizon_for_tuples() -- _bt_compute_xid_horizon_for_tuples().

Finally, it was necessary to avoid splitting a posting list with the LP_DEAD bit set. I took a naive approach to avoiding that problem, adding code to _bt_findinsertloc() to prevent it. Posting list splits are generally assumed to be rare, so the fact that this is slightly inefficient should be fine IMV.

I also refactored deduplication itself in anticipation of making the WAL logging more efficient and incremental. So, the structure of the code within _bt_dedup_one_page() was simplified, without really changing it very much (I think).

I also fixed a bug in _bt_dedup_one_page(). The check for dead items was broken in previous versions, because the loop examined the high key tuple in every iteration.

Making _bt_dedup_one_page() more efficient and incremental is still the most important open item for the patch.

--
Peter Geoghegan
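
For readers following along, the all-or-nothing LP_DEAD rule and the split guard described above can be sketched roughly as follows. This is a minimal illustration in plain C, not code from the patch: the `SketchItem` struct and both `sketch_*` functions are hypothetical names invented here, and the struct is a simplified stand-in that does not match the real nbtree item layout.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, simplified stand-in for an index item. A plain tuple has
 * ntids == 1; a posting list tuple has ntids > 1. This is NOT the nbtree
 * on-disk format.
 */
typedef struct SketchItem
{
    size_t      ntids;      /* number of heap TIDs ("logical tuples") */
    const bool *tid_dead;   /* tid_dead[i]: logical tuple i is known dead */
    bool        lp_dead;    /* the single LP_DEAD bit for the whole item */
} SketchItem;

/*
 * Set the item's one LP_DEAD bit only when every logical tuple is known
 * dead. Returns true if the bit was set by this call.
 */
bool
sketch_try_mark_dead(SketchItem *item)
{
    if (item->ntids == 0)
        return false;

    for (size_t i = 0; i < item->ntids; i++)
    {
        if (!item->tid_dead[i])
            return false;   /* one live logical tuple blocks the hint */
    }

    item->lp_dead = true;
    return true;
}

/*
 * Assumed shape of the guard: a posting list whose LP_DEAD bit is already
 * set is not eligible for a posting list split.
 */
bool
sketch_can_split_posting(const SketchItem *item)
{
    return item->ntids > 1 && !item->lp_dead;
}
```

With three TIDs of which one is still live, `sketch_try_mark_dead()` leaves the bit unset and the item stays splittable; once all three are dead, the bit is set and `sketch_can_split_posting()` returns false.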