Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
From: Anastasia Lubennikova
Subject: Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
Date:
Msg-id: 4ab6e2db-bcee-f4cf-0916-3a06e6ccbb55@postgrespro.ru
In reply to: Re: [WIP] Effective storage of duplicates in B-tree index. (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
Responses:
  Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
  Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
List: pgsql-hackers
The new version of the patch is attached. This version is even simpler than the previous one, thanks to the recent btree design changes and all the feedback I received. I consider it ready for review and testing.

[feature overview]

This patch implements deduplication of btree non-pivot tuples on leaf pages, in a manner similar to GIN index "posting lists".

A non-pivot posting tuple has the following format:

t_tid | t_info | key values | posting_list[]

where the t_tid and t_info fields store meta-information about the tuple's posting list. The posting list itself is an array of ItemPointerData.

Currently, compression is applied to all indexes except system indexes, unique indexes, and indexes with included columns.

On insertion, compression is applied not to each individual tuple, but to the whole page before a split: if the target page is full, we try to compress it.

[benchmark results]

idx ON tbl(c1); the index contains 10000000 integer values.
i is the number of distinct values in the index, so i=1 means that all rows have the same key, and i=10000000 means that all keys are different.

       i | old size (MB) | new size (MB)
---------+---------------+--------------
       1 |           215 |            88
    1000 |           215 |            90
  100000 |           215 |            71
10000000 |           214 |           214

For more, see the attached diagram with the test results.

[future work]

Many things can be improved in this feature. Personally, I'd prefer to keep this patch as small as possible and work on other improvements after the basic part is committed, though I understand that some of them may be considered essential for this patch to be approved.

1. Implement splitting of posting tuples on a page split.
2. Implement microvacuum of posting tuples.
3. Add a flag to pg_index that allows enabling/disabling compression for a particular index.
4. Implement posting list compression.
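To make the space arithmetic behind the benchmark numbers above easier to follow, here is a small, self-contained C sketch. It is not taken from the patch: the type names, macros, and sizes are simplified stand-ins for the on-disk structures, chosen only to show why folding n duplicates into one posting tuple of the format t_tid | t_info | key values | posting_list[] saves space.

/*
 * Illustrative sketch only -- this is not the patch's actual code.
 * Simplified stand-in types model how n duplicates stored as one
 * posting tuple (t_tid | t_info | key values | posting_list[])
 * compare in size with n ordinary leaf index tuples.
 */
#include <stdio.h>
#include <stdint.h>

/* Simplified stand-in for ItemPointerData: 6 bytes on disk. */
typedef struct
{
    uint16_t    bi_hi;      /* block number, high bits */
    uint16_t    bi_lo;      /* block number, low bits */
    uint16_t    offset;     /* offset within the page */
} ItemPointerData;

#define INDEX_TUPLE_HEADER  8   /* t_tid (6 bytes) + t_info (2 bytes) */
#define LINE_POINTER        4   /* per-tuple line pointer in the page header */
#define MAXALIGN(x)         (((x) + 7) & ~((size_t) 7))

/* Space used by n duplicates stored as n ordinary leaf tuples. */
static size_t
plain_size(size_t key_size, size_t n)
{
    return n * (MAXALIGN(INDEX_TUPLE_HEADER + key_size) + LINE_POINTER);
}

/*
 * Space used by the same n duplicates stored as one posting tuple:
 * the key is stored once, followed by n item pointers.
 */
static size_t
posting_size(size_t key_size, size_t n)
{
    return MAXALIGN(INDEX_TUPLE_HEADER + key_size +
                    n * sizeof(ItemPointerData)) + LINE_POINTER;
}

int
main(void)
{
    size_t      key_size = 8;   /* e.g. one integer key, padded to 8-byte alignment */

    for (size_t n = 1; n <= 256; n *= 4)
        printf("n = %3zu: plain = %5zu bytes, posting = %5zu bytes\n",
               n, plain_size(key_size, n), posting_size(key_size, n));
    return 0;
}

With these rough assumptions, a posting tuple holding a single TID is slightly larger than an ordinary tuple (which is why compression only pays off for duplicates), while for many duplicates the cost approaches 6 bytes per TID instead of roughly 20 bytes per separate tuple.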
-- 
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments