Re: Hash partitioning.

Поиск

Список

Период

Сортировка

От	Markus Wanner
Тема	Re: Hash partitioning.
Дата	27 июня 2013 г. 19:42:19
Msg-id	51CC9532.2010807@bluegap.ch обсуждение исходный текст
Ответ на	Re: Hash partitioning. (Nicolas Barbier <nicolas.barbier@gmail.com>)
Список	pgsql-hackers

Дерево обсуждения

On 06/27/2013 06:35 PM, Nicolas Barbier wrote:
> I am assuming that this (comparatively very small and super-hot) index
> is cached all the time, while for the other indexes (that are
> supposedly super-huge) only the top part stays cached.
> 
> I am mostly just trying to find out where Yuri’s “partitioning is
> needed for super-huge tables” experience might come from, and noting
> that Heikki’s argument might not be 100% valid.

I think the OP made that clear by stating that his index has relatively
low selectivity. That seems to be a case that Postgres doesn't handle
very well.

> I think that the
> “PostgreSQL-answer” to this problem is to somehow cluster the data on
> the “hotness column” (so that all hot rows are close to one another,
> thereby improving the efficiency of the caching of relation blocks) +
> partial indexes for the hot rows (as first mentioned by Claudio; to
> improve the efficiency of the caching of index blocks).

Agreed, sounds like a sane strategy.

> My reasoning was: To determine which index block to update (typically
> one in both the partitioned and non-partitioned cases), one needs to
> walk the index first, which supposedly causes one additional (read)
> I/O in the non-partitioned case on average, because there is one extra
> level and the lower part of the index is not cached (because of the
> size of the index). I think that pokes a hole in Heikki’s argument of
> “it really doesn’t matter, partitioning == using one big table with
> big non-partial indexes.”

Heikki's argument holds for the general case, where you cannot assume a
well defined hot partition. In that case, the lowest levels of all the
b-trees of the partitions don't fit in the cache, either. A single index
performs better in that case, because it has lower overhead.

I take your point that in case you *can* define a hot partition and
apply partitioning, the hot(test) index(es) are more likely to be cached
and thus require less disk I/O.

Regards

Markus Wanner

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: Hash partitioning.