Re: WIP: Fast GiST index build
От | Heikki Linnakangas |
---|---|
Тема | Re: WIP: Fast GiST index build |
Дата | |
Msg-id | 4E4A5CD5.2040601@enterprisedb.com обсуждение исходный текст |
Ответ на | Re: WIP: Fast GiST index build (Alexander Korotkov <aekorotkov@gmail.com>) |
Ответы |
Re: WIP: Fast GiST index build
|
Список | pgsql-hackers |
Looking at the calculation of levelStep: > + /* > + * Calculate levelStep by available amount of memory. We should be able to > + * load into main memory one page of each underlying node buffer (which > + * are in levelStep below). That give constraint over > + * maintenance_work_mem. Also we should be able to have subtree of > + * levelStep level in cache. That give constraint over > + * effective_cache_size. > + * > + * i'th underlying level of sub-tree can consists of > + * i^maxIndexTuplesPerPage pages at maximum. So, subtree of levelStep > + * levels can't be greater then 2 * maxIndexTuplesPerPage ^ levelStep > + * pages. We use some more reserve due to we probably can't take whole > + * effective cache and use formula 4 * maxIndexTuplesPerPage ^ levelStep = > + * effectiveCache. We use similar logic with maintenance_work_mem. We > + * should be able to store at least last pages of all buffers where we are > + * emptying current buffer to. > + */ > + effectiveMemory = Min(maintenance_work_mem * 1024 / BLCKSZ, > + effective_cache_size); > + levelStep = (int) log((double) effectiveMemory / 4.0) / > + log((double) maxIndexTuplesPerPage); > + I can see that that's equal to the formula given in the paper, log_B(M/4B), but I couldn't see any explanation for that formula in the paper. Your explanation makes sense, but where did it come from? It seems a bit pessimistic: while it's true that the a subtree can't be larger than 2 * maxIndexTuplesPerPage ^ levelStep, you can put a tighter upper bound on it. The max size of a subtree of depth n can be calculated as the geometric series: r^0 + r^1 + r^2 + ... + r^n = (1 - r^(n + 1)) / (1 - r) where r = maxIndexTuplesPerPage. For r=2 those formulas are equal, but for a large r and small n (which is typical), the 2 * maxIndexTuplesPerPage^levelStep formula gives a value that's almost twice as large as the real max size of a subtree. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
В списке pgsql-hackers по дате отправления: