Re: Hash partitioning.
От | Markus Wanner |
---|---|
Тема | Re: Hash partitioning. |
Дата | |
Msg-id | 51CC9532.2010807@bluegap.ch обсуждение исходный текст |
Ответ на | Re: Hash partitioning. (Nicolas Barbier <nicolas.barbier@gmail.com>) |
Список | pgsql-hackers |
On 06/27/2013 06:35 PM, Nicolas Barbier wrote: > I am assuming that this (comparatively very small and super-hot) index > is cached all the time, while for the other indexes (that are > supposedly super-huge) only the top part stays cached. > > I am mostly just trying to find out where Yuri’s “partitioning is > needed for super-huge tables” experience might come from, and noting > that Heikki’s argument might not be 100% valid. I think the OP made that clear by stating that his index has relatively low selectivity. That seems to be a case that Postgres doesn't handle very well. > I think that the > “PostgreSQL-answer” to this problem is to somehow cluster the data on > the “hotness column” (so that all hot rows are close to one another, > thereby improving the efficiency of the caching of relation blocks) + > partial indexes for the hot rows (as first mentioned by Claudio; to > improve the efficiency of the caching of index blocks). Agreed, sounds like a sane strategy. > My reasoning was: To determine which index block to update (typically > one in both the partitioned and non-partitioned cases), one needs to > walk the index first, which supposedly causes one additional (read) > I/O in the non-partitioned case on average, because there is one extra > level and the lower part of the index is not cached (because of the > size of the index). I think that pokes a hole in Heikki’s argument of > “it really doesn’t matter, partitioning == using one big table with > big non-partial indexes.” Heikki's argument holds for the general case, where you cannot assume a well defined hot partition. In that case, the lowest levels of all the b-trees of the partitions don't fit in the cache, either. A single index performs better in that case, because it has lower overhead. I take your point that in case you *can* define a hot partition and apply partitioning, the hot(test) index(es) are more likely to be cached and thus require less disk I/O. Regards Markus Wanner
В списке pgsql-hackers по дате отправления: