Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
От | Anastasia Lubennikova |
---|---|
Тема | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. |
Дата | |
Msg-id | 2702638f-3120-8a06-4075-357533a8b198@postgrespro.ru обсуждение исходный текст |
Ответ на | Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index. (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>) |
Ответы |
Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.
|
Список | pgsql-hackers |
17.07.2019 19:36, Anastasia Lubennikova: > > There is one major issue left - preserving TID order in posting lists. > For a start, I added a sort into BTreeFormPostingTuple function. > It turned out to be not very helpful, because we cannot check this > invariant lazily. > > Now I work on patching _bt_binsrch_insert() and _bt_insertonpg() to > implement > insertion into the middle of the posting list. I'll send a new version > this week. Patch 0002 (must be applied on top of 0001) implements preserving of correct TID order inside posting list when inserting new tuples. This version passes all regression tests including amcheck test. I also used following script to test insertion into the posting list: set client_min_messages to debug4; drop table tbl; create table tbl (i1 int, i2 int); insert into tbl select 1, i from generate_series(0,1000) as i; insert into tbl select 1, i from generate_series(0,1000) as i; create index idx on tbl (i1); delete from tbl where i2 <500; vacuum tbl ; insert into tbl select 1, i from generate_series(1001, 1500) as i; The last insert triggers several insertions that can be seen in debug messages. I suppose it is not the final version of the patch yet, so I left some debug messages and TODO comments to ease review. Please, in your review, pay particular attention to usage of BTreeTupleGetHeapTID. For posting tuples it returns the first tid from posting list like BTreeTupleGetMinTID, but maybe some callers are not ready for that and want BTreeTupleGetMaxTID instead. Incorrect usage of these macros may cause some subtle bugs, which are probably not covered by tests. So, please double-check it. Next week I'm going to check performance and try to find specific scenarios where this feature can lead to degradation and measure it, to understand if we need to make this deduplication optional. -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Вложения
В списке pgsql-hackers по дате отправления: