Table clustering idea
From | Dawid Kuroczko |
---|---|
Subject | Table clustering idea |
Date | |
Msg-id | 758d5e7f0606251648h4d518ca6k7e1c511ba316bb8b@mail.gmail.com |
List | pgsql-hackers |
There is a well-known command called CLUSTER, which reorganizes a table in the order of a specified index. Its drawback is that tuples added afterwards are not kept in this order. Last night I had an idea which could be interesting, I hope.

The idea is to make use of the 'histogram_bounds' statistics that are already collected. Instead of inserting a row into the first suitable spot in a table, the table would be "divided" into sections, one for each of the histogram_bounds ranges. When inserting, the database would use histogram_bounds to find the most suitable section for the new row and, if there were free spots there, would insert it there. If not, it would either look for a free spot in nearby sections or fall back to the first suitable place.

What would it do? It would try to keep the table somewhat organized, keeping rows with similar values close together (within the SET STATISTICS resolution, so a common scenario would be 50 or 100 "sections"). It would make it a bit harder for a table to shrink (since new rows would be added throughout the table, not at the beginning).

Another idea, instead of using histogram_bounds, would be to use the position of the key inside the index to determine the "ideal" place for the row inside the table and find the closest free spot to it. This would of course be much more precise and would not rely on statistics.

Regards,
Dawid
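
For illustration, the histogram-based placement could be sketched roughly as below. This is only a toy model in Python, not PostgreSQL code: the names `section_for`, `choose_section`, `free_slots` and `max_distance` are hypothetical, `free_slots` stands in for whatever free-space lookup the server would really use, and trying nearby sections first and only then falling back to the first section with room is one assumed way to combine the two fallbacks mentioned above.

```python
from bisect import bisect_right


def section_for(key, histogram_bounds):
    """Return the index of the histogram bucket ("section") that key falls into.

    histogram_bounds holds n + 1 sorted boundaries for n sections.
    Keys outside the known range are clamped to the first or last section.
    """
    n_sections = len(histogram_bounds) - 1
    idx = bisect_right(histogram_bounds, key) - 1
    return min(max(idx, 0), n_sections - 1)


def choose_section(key, histogram_bounds, free_slots, max_distance=2):
    """Pick the section a new row with this key should be inserted into.

    free_slots[i] is the number of free tuple slots in section i
    (hypothetical stand-in for a free-space lookup).
    Returns a section index, or None if no section has room.
    """
    target = section_for(key, histogram_bounds)
    if free_slots[target] > 0:
        return target

    # Look in nearby sections, nearest first (lower neighbour before upper).
    for distance in range(1, max_distance + 1):
        for candidate in (target - distance, target + distance):
            if 0 <= candidate < len(free_slots) and free_slots[candidate] > 0:
                return candidate

    # Fall back to the first section that has any free space at all.
    for i, free in enumerate(free_slots):
        if free > 0:
            return i
    return None
```

With bounds `[0, 10, 20, 30, 40]` there are four sections, and a key of 15 belongs to section 1. If that section is full, the sketch tries sections 0 and 2 before giving up on locality.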