Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
From: Peter Geoghegan
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Msg-id: CAH2-Wz=Tr6mxMsKRmv_=9-05_O9QWqOzQ8GweRV2DXS6+Y38QQ@mail.gmail.com
In reply to: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. (Peter Geoghegan <pg@bowt.ie>)
Responses: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
List: pgsql-hackers
On Fri, Jan 10, 2020 at 1:36 PM Peter Geoghegan <pg@bowt.ie> wrote:
> Still, v29 doesn't resolve the following points you've raised, where I
> haven't reached a final opinion on what to do myself. These items are
> as follows (I'm quoting your modified patch file sent on January 8th
> here):

Still no progress on these items, but I am now posting v30. A new
version seems warranted, because I now want to revive a patch from a
couple of years back as part of the deduplication project -- it would
be good to get feedback on that sooner rather than later.

This is a patch that you [Heikki] are already familiar with -- the
patch to speed up compactify_tuples() [1]. Sokolov Yura is CC'd here,
since he is the original author. The deduplication patch is much
faster with this in place. For example, with v30:

pg@regression:5432 [25216]=# create unlogged table foo(bar int4);
CREATE TABLE
pg@regression:5432 [25216]=# create index unlogged_foo_idx on foo(bar);
CREATE INDEX
pg@regression:5432 [25216]=# insert into foo select g from generate_series(1, 1000000) g, generate_series(1,10) i;
INSERT 0 10000000
Time: 17842.455 ms (00:17.842)

If I revert the "Bucket sort for compactify_tuples" commit locally,
then the same insert statement takes 31.614 seconds! In other words,
the insert statement is made ~77% faster by that commit alone. The
improvement is stable and reproducible. Clearly there is a big
compactify_tuples() bottleneck that comes from PageIndexMultiDelete().
The hot spot is quite visible with "perf top -e branch-misses".

The compactify_tuples() patch stalled because it wasn't clear if it
was worth the trouble at the time. It was originally written to
address a much smaller PageRepairFragmentation() bottleneck in heap
pruning. ISTM that deduplication alone is a good enough reason to
commit this patch. I haven't really changed anything about the
2017/2018 patch -- I need to do more review of that. We probably don't
need the qsort() inlining stuff (the bucket sort thing is the real
win), but I included it in v30 all the same. (There is a rough sketch
of the bucket sort idea at the end of this message.)

Other changes in v30:

* We now avoid extra _bt_compare() calls within _bt_check_unique() --
no need to call _bt_compare() once per TID (once per equal tuple is
quite enough). This is a noticeable performance win, even though the
change was originally intended to make the logic in _bt_check_unique()
clearer. (Also sketched at the end of this message.)

* Reduced the limit on the size of a posting list tuple to 1/6 of a
page -- down from 1/3. This seems like a good idea on the grounds that
it keeps our options open if we split a page full of duplicates due to
UPDATEs rather than INSERTs (i.e. we split a page full of duplicates
that isn't also the rightmost page among pages that store only those
duplicates). A lower limit is more conservative, and yet doesn't cost
us that much space.

* Refined nbtsort.c/CREATE INDEX to work sensibly with non-standard
fillfactor settings.

This last item is a minor bugfix, really.

[1] https://commitfest.postgresql.org/14/1138/

--
Peter Geoghegan
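P.S. To make the bucket sort idea a little more concrete, here is a
rough standalone sketch of the general technique. This is *not* the
actual compactify_tuples() patch -- every name in it (ItemSortEntry,
bucket_sort_items, PAGE_SIZE, N_BUCKETS) is invented for illustration,
and the real code works on the page's line pointer array:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE  8192     /* assume an 8KB page */
#define N_BUCKETS  32       /* offsets binned into 32 equal-sized ranges */

typedef struct
{
    unsigned short offset;  /* tuple's start offset within the page */
    unsigned short len;     /* tuple length (not used by the sort itself) */
} ItemSortEntry;

/*
 * Sort items in descending offset order: distribute them into buckets
 * keyed on the high bits of the offset, then finish with one insertion
 * sort pass.  Items only ever shift within their own bucket in the
 * final pass, and there are no indirect comparator calls.
 */
static void
bucket_sort_items(ItemSortEntry *items, int nitems)
{
    int         counts[N_BUCKETS] = {0};
    int         starts[N_BUCKETS];
    int         pos = 0;
    ItemSortEntry *scratch = malloc(nitems * sizeof(ItemSortEntry));

    /* count items per bucket */
    for (int i = 0; i < nitems; i++)
        counts[items[i].offset / (PAGE_SIZE / N_BUCKETS)]++;

    /* lay buckets out back to front, so higher offsets come first */
    for (int b = N_BUCKETS - 1; b >= 0; b--)
    {
        starts[b] = pos;
        pos += counts[b];
    }

    /* scatter items into their buckets */
    for (int i = 0; i < nitems; i++)
        scratch[starts[items[i].offset / (PAGE_SIZE / N_BUCKETS)]++] = items[i];

    /* finish with an insertion sort over the almost-sorted scratch array */
    for (int i = 1; i < nitems; i++)
    {
        ItemSortEntry tmp = scratch[i];
        int         j = i - 1;

        while (j >= 0 && scratch[j].offset < tmp.offset)
        {
            scratch[j + 1] = scratch[j];
            j--;
        }
        scratch[j + 1] = tmp;
    }

    memcpy(items, scratch, nitems * sizeof(ItemSortEntry));
    free(scratch);
}

int
main(void)
{
    ItemSortEntry items[] = {{128, 32}, {7936, 64}, {4096, 16}, {512, 48}};
    int         nitems = sizeof(items) / sizeof(items[0]);

    bucket_sort_items(items, nitems);
    for (int i = 0; i < nitems; i++)
        printf("offset %u, len %u\n", items[i].offset, items[i].len);
    return 0;
}

The attraction of something along these lines is that item offsets are
bounded by the page size, so distributing them into buckets is cheap,
and the final pass only touches an almost-sorted array. The indirect
comparator calls that make the generic qsort() path so branch-miss
heavy disappear from the hot spot.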
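Similarly, the "one _bt_compare() call per equal tuple, not per TID"
point can be sketched like this. Again, purely illustrative -- the
types and the visibility helper are made-up stand-ins, not the real
_bt_check_unique() or posting list code:

#include <stdbool.h>
#include <stdio.h>

typedef struct
{
    int     key;        /* stand-in for the indexed key value */
    int     ntids;      /* > 1 means a posting list (deduplicated) tuple */
    long    tids[16];   /* heap TIDs, represented as opaque values here */
} DedupTuple;

/* hypothetical stand-in for checking whether a TID points to a live row */
static bool
tid_points_to_live_row(long tid)
{
    return tid % 2 == 0;    /* arbitrary stub for the example */
}

/*
 * Return true if any tuple with the same key covers a live heap row.
 * The key comparison happens once per index tuple; the inner loop walks
 * that tuple's TIDs without comparing the key again.
 */
static bool
page_has_live_duplicate(const DedupTuple *tuples, int ntuples, int insert_key)
{
    for (int i = 0; i < ntuples; i++)
    {
        if (tuples[i].key != insert_key)
            continue;       /* one comparison per tuple, not per TID */

        for (int t = 0; t < tuples[i].ntids; t++)
        {
            if (tid_points_to_live_row(tuples[i].tids[t]))
                return true;
        }
    }
    return false;
}

int
main(void)
{
    DedupTuple page[] = {
        {41, 1, {3}},
        {42, 3, {1, 4, 7}},     /* posting list tuple: three TIDs, one compare */
    };

    printf("%s\n", page_has_live_duplicate(page, 2, 42) ? "duplicate" : "ok");
    return 0;
}

With posting list tuples covering many heap TIDs, doing the comparison
once per tuple rather than once per TID saves a lot of comparator work
on pages full of duplicates, quite apart from making the control flow
easier to follow.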