Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)

Поиск
Список
Период
Сортировка
От Dilip Kumar
Тема Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)
Дата
Msg-id CAFiTN-v2=4b8MOs1rfWU0Y9cuptEPs7oekkhH4iSUUKXnzCBGg@mail.gmail.com
обсуждение исходный текст
Ответ на btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Ответы Re: btree: implement dynamic prefix truncation (was: Improving btree performance through specializing by key shape, take 2)  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Список pgsql-hackers
On Wed, Nov 1, 2023 at 2:42 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:
>
> Hi,
>
> Currently, nbtree code compares each and every column of an index
> tuple during the binary search on the index page. With large indexes
> that have many duplicate prefix column values (e.g. an index on (bool,
> bool, uuid) ) that means a lot of wasted time getting to the right
> column.
>
> The attached patch improves on that by doing per-page dynamic prefix
> truncation: If we know that on both the right and left side there are
> index tuples where the first two attributes are equal to the scan key,
> we skip comparing those attributes at the current index tuple and
> start with comparing attribute 3, saving two attribute compares. We
> gain performance whenever comparing prefixing attributes is expensive
> and when there are many tuples with a shared prefix - in unique
> indexes this doesn't gain much, but we also don't lose much in this
> case.
>
> This patch was originally suggested at [0], but it was mentioned that
> they could be pulled out into it's own thread. Earlier, the
> performance gains were not clearly there for just this patch, but
> after further benchmarking this patch stands on its own for
> performance: it sees no obvious degradation of performance, while
> gaining 0-5% across various normal indexes on the cc-complete sample
> dataset, with the current worst-case index shape getting a 60%+
> improved performance on INSERTs in the tests at [0].

+1 for the idea, I have some initial comments while reading through the patch.

1.
Commit message refers to a non-existing reference '(see [0])'.


2.
+When we do a binary search on a sorted set (such as a BTree), we know that a
+tuple will be smaller than its left neighbour, and larger than its right
+neighbour.

I think this should be 'larger than left neighbour and smaller than
right neighbour' instead of the other way around.

3.
+With the above optimizations, dynamic prefix truncation improves the worst
+case complexity of indexing from O(tree_height * natts * log(tups_per_page))
+to O(tree_height * (3*natts + log(tups_per_page)))

Where do the 3*natts come from?  Is it related to setting up the
dynamic prefix at each level?

4.
+ /*
+ * All tuple attributes are equal to the scan key, only later attributes
+ * could potentially not equal the scan key.
+ */
+ *compareattr = ntupatts + 1;

Can you elaborate on this more? If all tuple attributes are equal to
the scan key then what do those 'later attributes' mean?


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: vignesh C
Дата:
Сообщение: Re: Commitfest manager January 2024
Следующее
От: Masahiko Sawada
Дата:
Сообщение: Re: Synchronizing slots from primary to standby