Re: [WIP] Effective storage of duplicates in B-tree index.

From	Alexandr Popov
Subject	Re: [WIP] Effective storage of duplicates in B-tree index.
Date	March 23, 2016, 15:30:19
Msg-id	56F2B686.9070602@postgrespro.ru
In reply to	Re: [WIP] Effective storage of duplicates in B-tree index. (Anastasia Lubennikova <a.lubennikova@postgrespro.ru>)
List	pgsql-hackers

On 18.03.2016 20:19, Anastasia Lubennikova wrote:
> Please find the new version of the patch attached. Now it has WAL
> functionality.
>
> A detailed description of the feature can be found in the README draft:
> https://goo.gl/50O8Q0
>
> This patch is pretty complicated, so I ask everyone who is interested in
> this feature to help with reviewing and testing it. I will be grateful
> for any feedback.
> But please, don't complain about code style, it is still work in progress.
>
> Next things I'm going to do:
> 1. More debugging and testing. I'm going to attach a couple of SQL
>    scripts for testing in the next message.
> 2. Fix NULLs processing.
> 3. Add a flag to pg_index that allows enabling/disabling compression
>    for each particular index.
> 4. Recheck locking considerations. I tried to write the code to be as
>    non-invasive as possible, but we need to make sure that the algorithm
>    is still correct.
> 5. Change BTMaxItemSize.
> 6. Bring back microvacuum functionality.

Hi, hackers.

This is my first review, so please don't be too strict with me.

I tested this patch on the following table:

    create table message
    (
        id        serial,
        usr_id    integer,
        text      text
    );
    CREATE INDEX message_usr_id ON message (usr_id);

The table has 10000000 records.

I found the following:
The fewer unique keys there are, the smaller the index.

The following two tables demonstrate this.

New B-tree
Unique keys (usr_id) | Index size | Creation time
---------------------+------------+-----------------
10000000             | 214 MB     | 00:00:34.193441
3333333              | 214 MB     | 00:00:45.731173
2000000              | 129 MB     | 00:00:41.445876
1000000              | 129 MB     | 00:00:38.455616
100000               | 86 MB      | 00:00:40.887626
10000                | 79 MB      | 00:00:47.199774

Old B-tree
Unique keys (usr_id) | Index size | Creation time
---------------------+------------+-----------------
10000000             | 214 MB     | 00:00:35.043677
3333333              | 286 MB     | 00:00:40.922845
2000000              | 300 MB     | 00:00:46.454846
1000000              | 278 MB     | 00:00:42.323525
100000               | 287 MB     | 00:00:47.438132
10000                | 280 MB     | 00:01:00.307873

I inserted the data both randomly and sequentially; the insertion order did
not affect the index size.
The time to select, insert, and update random rows did not change. This is
great, but it certainly needs more detailed study.

Alexander Popov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
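
For anyone who wants to repeat a measurement like this, here is a minimal sketch of how such a table could be populated and its index size read back. The review does not include the actual load script, so the use of generate_series and random(), the row contents, the choice of 10000 distinct keys, and building the index after the load are all assumptions made for illustration. pg_relation_size and pg_size_pretty are standard PostgreSQL functions.

```sql
-- Assumed reproduction script; NOT the script used in the review above.
DROP TABLE IF EXISTS message;

CREATE TABLE message
(
    id        serial,
    usr_id    integer,
    text      text
);

-- Load 10000000 rows with at most 10000 distinct usr_id values (0..9999).
-- Change the multiplier to vary the number of unique keys.
INSERT INTO message (usr_id, text)
SELECT floor(random() * 10000)::integer,
       'message ' || g
FROM generate_series(1, 10000000) AS g;

-- Assumption: the index is built after the load. With \timing enabled
-- in psql, the elapsed time of this statement is the creation time.
CREATE INDEX message_usr_id ON message (usr_id);

-- Report the number of distinct keys and the on-disk size of the index.
SELECT count(DISTINCT usr_id) AS unique_keys,
       pg_size_pretty(pg_relation_size('message_usr_id')) AS index_size
FROM message;
```

Running the same script against a patched and an unpatched server should give a pair of rows comparable to one line of each table, though the exact sizes will depend on the build and its settings.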
