Re: LSM tree for Postgres
От | Konstantin Knizhnik |
---|---|
Тема | Re: LSM tree for Postgres |
Дата | |
Msg-id | 7c0f264f-8e13-321b-8035-9c930682966e@postgrespro.ru обсуждение исходный текст |
Ответ на | Re: LSM tree for Postgres (Alexander Korotkov <aekorotkov@gmail.com>) |
Ответы |
Re: LSM tree for Postgres
|
Список | pgsql-hackers |
On 07.08.2020 15:31, Alexander Korotkov wrote: > ср, 5 авг. 2020 г., 09:13 Konstantin Knizhnik <k.knizhnik@postgrespro.ru>: >> Concerning degrade of basic index - B-Tree itself is balanced tree. Yes, >> insertion of random keys can cause split of B-Tree page. >> In the worst case half of B-Tree page will be empty. So B-Tree size will >> be two times larger than ideal tree. >> It may cause degrade up to two times. But that is all. There should not >> be infinite degrade of speed tending to zero. > My concerns are not just about space utilization. My main concern is > about the order of the pages. After the first merge the base index > will be filled in key order. So physical page ordering perfectly > matches their logical ordering. After the second merge some pages of > base index splits, and new pages are added to the end of the index. > Splits also happen in key order. So, now physical and logical > orderings match within two extents corresponding to first and second > merges, but not within the whole tree. While there are only few such > extents, disk page reads may in fact be mostly sequential, thanks to > OS cache and readahead. But finally, after many merges, we can end up > with mostly random page reads. For instance, leveldb doesn't have a > problem of ordering degradation, because it stores levels in sorted > files. > I agree with your that loosing sequential order of B-Tree pages may have negative impact on performance. But it first of all critical for order-by and range queries, when we should traverse several subsequent leave pages. It is less critical for exact-search or delete/insert operations. Efficiency of merge operations mostly depends on how much keys will be stored at the same B-Tree page. And it is first of all determined by size of top index and key distribution.
В списке pgsql-hackers по дате отправления: