Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
From: Peter Geoghegan
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Msg-id: CAH2-Wz=Tr6mxMsKRmv_=9-05_O9QWqOzQ8GweRV2DXS6+Y38QQ@mail.gmail.com
In reply to: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. (Peter Geoghegan <pg@bowt.ie>)
Responses: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
List: pgsql-hackers
On Fri, Jan 10, 2020 at 1:36 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Still, v29 doesn't resolve the following points you've raised, where I
> haven't reached a final opinion on what to do myself. These items are
> as follows (I'm quoting your modified patch file sent on January 8th
> here):

Still no progress on these items, but I am now posting v30. A new
version seems warranted, because I now want to revive a patch from a
couple of years back as part of the deduplication project -- it would
be good to get feedback on that sooner rather than later.

This is a patch that you [Heikki] are already familiar with -- the
patch to speed up compactify_tuples() [1]. Sokolov Yura is CC'd here,
since he is the original author. The deduplication patch is much
faster with this in place. For example, with v30:

pg@regression:5432 [25216]=# create unlogged table foo(bar int4);
CREATE TABLE
pg@regression:5432 [25216]=# create index unlogged_foo_idx on foo(bar);
CREATE INDEX
pg@regression:5432 [25216]=# insert into foo select g from generate_series(1, 1000000) g, generate_series(1,10) i;
INSERT 0 10000000
Time: 17842.455 ms (00:17.842)

If I revert the "Bucket sort for compactify_tuples" commit locally,
then the same insert statement takes 31.614 seconds! In other words,
the insert statement is made ~77% faster by that commit alone. The
improvement is stable and reproducible. Clearly there is a big
compactify_tuples() bottleneck that comes from PageIndexMultiDelete().
The hot spot is quite visible with "perf top -e branch-misses".

The compactify_tuples() patch stalled because it wasn't clear if it
was worth the trouble at the time. It was originally written to
address a much smaller PageRepairFragmentation() bottleneck in heap
pruning. ISTM that deduplication alone is a good enough reason to
commit this patch. I haven't really changed anything about the
2017/2018 patch -- I need to do more review of that. We probably don't
need the qsort() inlining stuff (the bucket sort thing is the real
win), but I included it in v30 all the same. (There is a rough sketch
of the bucket sort idea at the end of this message.)

Other changes in v30:

* We now avoid extra _bt_compare() calls within _bt_check_unique() --
no need to call _bt_compare() once per TID (once per equal tuple is
quite enough). This is a noticeable performance win, even though the
change was originally intended to make the logic in _bt_check_unique()
clearer. (Also sketched at the end of this message.)

* Reduced the limit on the size of a posting list tuple to 1/6 of a
page -- down from 1/3. This seems like a good idea on the grounds that
it keeps our options open if we split a page full of duplicates due to
UPDATEs rather than INSERTs (i.e. we split a page full of duplicates
that isn't also the rightmost page among pages that store only those
duplicates). A lower limit is more conservative, and yet doesn't cost
us that much space.

* Refined nbtsort.c/CREATE INDEX to work sensibly with non-standard
fillfactor settings.

This last item is a minor bugfix, really.

[1] https://commitfest.postgresql.org/14/1138/

--
Peter Geoghegan
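P.S. To make the bucket sort idea a little more concrete, here is a
rough standalone sketch of the general technique. This is *not* the
actual compactify_tuples() patch -- every name in it (ItemSortEntry,
bucket_sort_items, PAGE_SIZE, N_BUCKETS) is invented for illustration,
and the real code works on the page's line pointer array:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE  8192     /* assume an 8KB page */
#define N_BUCKETS  32       /* offsets binned into 32 equal-sized ranges */

typedef struct
{
    unsigned short offset;  /* tuple's start offset within the page */
    unsigned short len;     /* tuple length (not used by the sort itself) */
} ItemSortEntry;

/*
 * Sort items in descending offset order: distribute them into buckets
 * keyed on the high bits of the offset, then finish with one insertion
 * sort pass.  Items only ever shift within their own bucket in the
 * final pass, and there are no indirect comparator calls.
 */
static void
bucket_sort_items(ItemSortEntry *items, int nitems)
{
    int         counts[N_BUCKETS] = {0};
    int         starts[N_BUCKETS];
    int         pos = 0;
    ItemSortEntry *scratch = malloc(nitems * sizeof(ItemSortEntry));

    /* count items per bucket */
    for (int i = 0; i < nitems; i++)
        counts[items[i].offset / (PAGE_SIZE / N_BUCKETS)]++;

    /* lay buckets out back to front, so higher offsets come first */
    for (int b = N_BUCKETS - 1; b >= 0; b--)
    {
        starts[b] = pos;
        pos += counts[b];
    }

    /* scatter items into their buckets */
    for (int i = 0; i < nitems; i++)
        scratch[starts[items[i].offset / (PAGE_SIZE / N_BUCKETS)]++] = items[i];

    /* finish with an insertion sort over the almost-sorted scratch array */
    for (int i = 1; i < nitems; i++)
    {
        ItemSortEntry tmp = scratch[i];
        int         j = i - 1;

        while (j >= 0 && scratch[j].offset < tmp.offset)
        {
            scratch[j + 1] = scratch[j];
            j--;
        }
        scratch[j + 1] = tmp;
    }

    memcpy(items, scratch, nitems * sizeof(ItemSortEntry));
    free(scratch);
}

int
main(void)
{
    ItemSortEntry items[] = {{128, 32}, {7936, 64}, {4096, 16}, {512, 48}};
    int         nitems = sizeof(items) / sizeof(items[0]);

    bucket_sort_items(items, nitems);
    for (int i = 0; i < nitems; i++)
        printf("offset %u, len %u\n", items[i].offset, items[i].len);
    return 0;
}

The attraction of something along these lines is that item offsets are
bounded by the page size, so distributing them into buckets is cheap,
and the final pass only touches an almost-sorted array. The indirect
comparator calls that make the generic qsort() path so branch-miss
heavy disappear from the hot spot.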
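Similarly, the "one _bt_compare() call per equal tuple, not per TID"
point can be sketched like this. Again, purely illustrative -- the
types and the visibility helper are made-up stand-ins, not the real
_bt_check_unique() or posting list code:

#include <stdbool.h>
#include <stdio.h>

typedef struct
{
    int     key;        /* stand-in for the indexed key value */
    int     ntids;      /* > 1 means a posting list (deduplicated) tuple */
    long    tids[16];   /* heap TIDs, represented as opaque values here */
} DedupTuple;

/* hypothetical stand-in for checking whether a TID points to a live row */
static bool
tid_points_to_live_row(long tid)
{
    return tid % 2 == 0;    /* arbitrary stub for the example */
}

/*
 * Return true if any tuple with the same key covers a live heap row.
 * The key comparison happens once per index tuple; the inner loop walks
 * that tuple's TIDs without comparing the key again.
 */
static bool
page_has_live_duplicate(const DedupTuple *tuples, int ntuples, int insert_key)
{
    for (int i = 0; i < ntuples; i++)
    {
        if (tuples[i].key != insert_key)
            continue;       /* one comparison per tuple, not per TID */

        for (int t = 0; t < tuples[i].ntids; t++)
        {
            if (tid_points_to_live_row(tuples[i].tids[t]))
                return true;
        }
    }
    return false;
}

int
main(void)
{
    DedupTuple page[] = {
        {41, 1, {3}},
        {42, 3, {1, 4, 7}},     /* posting list tuple: three TIDs, one compare */
    };

    printf("%s\n", page_has_live_duplicate(page, 2, 42) ? "duplicate" : "ok");
    return 0;
}

With posting list tuples covering many heap TIDs, doing the comparison
once per tuple rather than once per TID saves a lot of comparator work
on pages full of duplicates, quite apart from making the control flow
easier to follow.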