Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
From | Peter Geoghegan |
---|---|
Subject | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. |
Date | |
Msg-id | CAH2-Wzna4A3ZCdq2F+rqCZk9kX9ubPWbsPs2QYg-EtzQ7HU0Ng@mail.gmail.com |
In reply to | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. (Peter Geoghegan <pg@bowt.ie>) |
Responses | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. |
List | pgsql-hackers |
On Mon, Nov 4, 2019 at 11:52 AM Peter Geoghegan <pg@bowt.ie> wrote:
> Attached is v21, which fixes some bitrot -- v20 of the patch was made
> totally unusable by today's commit 8557a6f1. Other changes:

There is more bitrot, so I attach v22.

This also has some new changes centered around fixing particular issues with space utilization. These changes are:

* nbtsort.c now intelligently considers the contribution of suffix truncation of posting list tuples when considering whether or not a leaf page is "full". I mean "full" in the sense that it has exceeded the soft limit (fillfactor-wise limit) on space utilization for the page (no change in how the hard limit in _bt_buildadd() works).

We don't usually bother predicting the space saving from suffix truncation when considering split points, even in nbtsplitloc.c, but it's worth making an exception for posting lists (actually, this is the same exception that nbtsplitloc.c already had in much earlier versions of the patch). Posting lists are very often large enough to really make a big contribution to how balanced free space is.

I now observe that weird cases where CREATE INDEX packs leaf pages too empty (or too full) are now all but eliminated. CREATE INDEX now does a pretty good job of respecting leaf fillfactor, while also allowing deduplication to be very effective (CREATE INDEX did neither of these two things in earlier versions of the patch).

* Added "single value" strategy for retail insert deduplication -- this is closely related to nbtsplitloc.c's single value strategy.

The general idea is that _bt_dedup_one_page() anticipates that a future "single value" page split is likely to occur, and therefore limits deduplication after two "1/3 of a page"-wide posting lists at the start of the page. It arranges for deduplication to leave a neat split point for nbtsplitloc.c to use when the time comes.

In other words, the patch now allows "single value" page splits to leave leaf pages BTREE_SINGLEVAL_FILLFACTOR% full, just like v12/master. Leaving a small amount of free space on pages that are packed full of duplicates is always a good idea. Also, we no longer force page splits to leave pages 2/3 full (only two large posting lists plus a high key), which sometimes happened with v21. On balance, this change seems to slightly improve space utilization.

In general, it's now unusual for retail insertions to get better space utilization than CREATE INDEX -- in that sense normality/balance has been restored in v22. Actually, I wrote the v22 changes by working through a list of weird space utilization issues from my personal notes. I'm pretty sure I've fixed all of those -- only nbtsplitloc.c's single value strategy wants to split at a point that leaves a heap TID in the new high key for the page, so that's the only thing we need to worry about within nbtdedup.c.

* "deduplication" storage parameter now has psql completion.

I intend to push the datum_image_eq() preparatory patch soon. I will also push a commit that makes _bt_keep_natts_fast() use datum_image_eq() separately. Anybody have an opinion on that?

--
Peter Geoghegan
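
For readers following along, here is a rough toy model of the "single value" deduplication idea described in the message above. It is not taken from the patch: the function name `dedup_single_value` and the parameters `max_tids_per_list` and `max_large_lists` are hypothetical, pages are modelled as plain Python lists, and byte-level sizing is deliberately ignored. It only illustrates the shape of the behaviour: build at most two capped posting lists at the start of the page, then stop merging so that a later split has a clean boundary.

```python
from typing import List


def dedup_single_value(
    tids: List[int],
    max_tids_per_list: int,
    max_large_lists: int = 2,
) -> List[List[int]]:
    """Toy model: group heap TIDs that all share one key value.

    Each inner list stands for one index tuple. A list with several TIDs
    models a posting list tuple; a list with one TID models a plain tuple.

    Assumption (not from the patch source): a posting list is "large"
    once it holds max_tids_per_list TIDs, standing in for the
    "1/3 of a page" width mentioned in the message.
    """
    if max_tids_per_list < 1:
        raise ValueError("max_tids_per_list must be at least 1")

    tuples: List[List[int]] = []
    pos = 0
    large_lists = 0

    # Merge TIDs into capped posting lists until the limit on large
    # posting lists is reached or the input runs out.
    while pos < len(tids) and large_lists < max_large_lists:
        chunk = tids[pos:pos + max_tids_per_list]
        tuples.append(chunk)
        pos += len(chunk)
        if len(chunk) == max_tids_per_list:
            large_lists += 1

    # Everything after the large posting lists is left unmerged,
    # one TID per tuple.
    tuples.extend([tid] for tid in tids[pos:])
    return tuples


if __name__ == "__main__":
    for t in dedup_single_value(list(range(10)), max_tids_per_list=3):
        print(t)
```

In this model the boundary after the second capped list plays the role of the "neat split point": everything before it is two full posting lists, and everything after it is left alone for a future page split to deal with.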