Poorly thought out code in vacuum

Поиск

Список

Период

Сортировка

От	Tom Lane
Тема	Poorly thought out code in vacuum
Дата	5 января 2012 г. 20:38:12
Msg-id	25422.1325810273@sss.pgh.pa.us обсуждение исходный текст
Ответы	Re: Poorly thought out code in vacuum Re: Poorly thought out code in vacuum
Список	pgsql-hackers

Дерево обсуждения

Lately I have noticed buildfarm member jaguar occasionally failing
regression tests with
ERROR:  invalid memory alloc request size 1073741824
during a VACUUM, as for example at
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jaguar&dt=2012-01-04%2023%3A05%3A02

Naturally I supposed it to be a consequence of the CLOBBER_CACHE_ALWAYS
option, ie, somebody accessing an already-freed cache entry.  I managed
to duplicate and debug it here after an afternoon of trying, and it's
not actually related to that at all, except perhaps to the extent that
CLOBBER_CACHE_ALWAYS slows down some things enormously more than others.
The failure comes when a ResourceOwner tries to enlarge its
pinned-buffers array to more than 1GB, and that's not a typo: there are
umpteen entries for the same buffer, whose PrivateRefCount is 134217727
and trying to go higher.  A stack trace soon revealed the culprit, which
is this change in lazy_vacuum_heap():

@@ -932,7 +965,8 @@ lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats)        tblk =
ItemPointerGetBlockNumber(&vacrelstats->dead_tuples[tupindex]);       buf = ReadBufferExtended(onerel, MAIN_FORKNUM,
tblk,RBM_NORMAL,                                 vac_strategy);
 
-        LockBufferForCleanup(buf);
+        if (!ConditionalLockBufferForCleanup(buf))
+            continue;        tupindex = lazy_vacuum_page(onerel, tblk, buf, tupindex, vacrelstats);        /* Now that
we'vecompacted the page, record its available space */
 

introduced in
http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=bbb6e559c4ea0fb4c346beda76736451dc24eb4e

The "continue" just results in another iteration of the containing tight
loop, and of course that immediately re-pins the same buffer without
having un-pinned it.  So if someone else sits on their pin of the buffer
for long enough, kaboom.

We could fix the direct symptom by inserting UnlockReleaseBuffer()
in front of the continue, but AFAICS that just makes this into a
busy-waiting equivalent of waiting unconditionally, so I don't really
see why we should not revert the above fragment altogether.  However,
I suppose Robert had something more intelligent in mind than a tight
loop when the buffer can't be exclusively locked, so maybe there is
some other change that should be made here instead.
        regards, tom lane

В списке pgsql-hackers по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Poorly thought out code in vacuum