Re: _mdfd_getseg can be expensive
From | Andres Freund
---|---
Subject | Re: _mdfd_getseg can be expensive
Date |
Msg-id | 20141031223210.GN13584@awork2.anarazel.de
In response to | _mdfd_getseg can be expensive (Andres Freund <andres@2ndquadrant.com>)
Responses | Re: _mdfd_getseg can be expensive
List | pgsql-hackers
Hi,

On 2014-03-31 12:10:01 +0200, Andres Freund wrote:
> I recently have seen some perf profiles in which _mdfd_getseg() was in
> the top #3 when VACUUMing large (~200GB) relations. Called by mdread(),
> mdwrite(). Looking at its implementation, I am not surprised. It
> iterates over all segment entries a relation has, for every read or
> write. That's not painful for smaller relations, but at a couple of
> hundred GB it starts to be annoying. Especially if kernel readahead has
> already read in all data from disk.
>
> I don't have a good idea what to do about this yet, but it seems like
> something that should be fixed mid-term.
>
> The best I can come up with is caching the last mdvec used, but that's
> fairly ugly. Alternatively it might be a good idea to not store MdfdVec
> as a linked list, but as a densely allocated array.

I've seen this a couple of times more since. On larger relations it gets even more pronounced: when sequentially scanning a 2TB relation, _mdfd_getseg() gets up to 80% proportionate CPU time towards the end of the scan.

I wrote the attached patch that gets rid of that essentially quadratic behaviour by replacing the mdfd chain/singly linked list with an array. Since we seldom grow files by a whole segment, I can't see the slightly bigger memory reallocations mattering significantly. In pretty much every other case the array is bound to be a winner.

Does anybody have fundamental arguments against that idea?

With some additional work we could save a bit more memory by getting rid of mdfd_segno, as it's essentially redundant, but that's not entirely trivial and I'm unsure if it's worth it.

I've also attached a second patch that makes PageIsVerified() noticeably faster when the page is new. That's helpful and related, because it makes it easier to test the correctness of the md.c rewrite by faking full 1GB segments. It's also pretty simple, so there's imo little reason not to do it.
Greetings,

Andres Freund

PS: The research leading to these results has received funding from the European Union's Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 318633

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachments