Re: bug in fast-path locking

From	Boszormenyi Zoltan
Subject	Re: bug in fast-path locking
Date	10 April 2012, 03:56:30
Msg-id	4F83D93E.10606@cybertec.at
In reply to	Re: bug in fast-path locking (Robert Haas <robertmhaas@gmail.com>)
List	pgsql-hackers

On 2012-04-09 19:32, Robert Haas wrote:
> On Sun, Apr 8, 2012 at 9:37 PM, Robert Haas<robertmhaas@gmail.com>  wrote:
>>> Robert, the Assert triggering with the above procedure
>>> is in your "fast path" locking code with current GIT.
>> Yes, that sure looks like a bug.  It seems that if the top-level
>> transaction is aborting, then LockReleaseAll() is called and
>> everything gets cleaned up properly; or if a subtransaction is
>> aborting after the lock is fully granted, then the locks held by the
>> subtransaction are released one at a time using LockRelease(), but if
>> the subtransaction is aborted *during the lock wait* then we only do
>> LockWaitCancel(), which doesn't clean up the LOCALLOCK.  Before the
>> fast-lock patch, that didn't really matter, but now it does, because
>> that LOCALLOCK is tracking the fact that we're holding onto a shared
>> resource - the strong lock count.  So I think that LockWaitCancel()
>> needs some kind of adjustment, but I haven't figured out exactly what
>> it is yet.
> I looked at this more.  The above analysis is basically correct, but
> the problem goes a bit beyond an error in LockWaitCancel().  We could
> also crap out with an error before getting as far as LockWaitCancel()
> and have the same problem.  I think that a correct statement of the
> problem is this: from the time we bump the strong lock count, up until
> the time we're done acquiring the lock (or give up on acquiring it),
> we need to have an error-cleanup hook in place that will unbump the
> strong lock count if we error out.   Once we're done updating the
> shared and local lock tables, the special handling ceases to be
> needed, because any subsequent lock release will go through
> LockRelease() or LockReleaseAll(), which will do the appropriate
> cleanup.
>
> The attached patch is an attempt at implementing that; any reviews appreciated.

This patch indeed fixes the scenario discovered by Cousin Marc.

Reading this patch also made me realize that my lock_timeout
patch needs adjusting, i.e. needs an AbortStrongLockAcquire()
call if waiting for a lock timed out.
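
The bump/unbump pattern discussed above can be sketched as a small
standalone toy. This is not the patch and not PostgreSQL code: the
counter array, the partition numbers, and every function name below are
hypothetical stand-ins, and only the idea (undo the strong lock count
bump on any error or timeout before the acquisition completes) comes
from the discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the shared strong lock counters (hypothetical layout). */
#define NUM_PARTITIONS 16
static int strong_lock_counts[NUM_PARTITIONS];

/* Partition whose bump still needs undoing on error; -1 means none pending. */
static int pending_partition = -1;

/* Bump the count and remember that cleanup is owed until we finish. */
static void begin_strong_lock_acquire(int partition)
{
    assert(pending_partition == -1);
    strong_lock_counts[partition]++;
    pending_partition = partition;
}

/* Acquisition completed: normal lock release now owns the cleanup. */
static void finish_strong_lock_acquire(void)
{
    pending_partition = -1;
}

/* Error or timeout path: unbump the count if a bump is still pending. */
static void abort_strong_lock_acquire(void)
{
    if (pending_partition < 0)
        return;                 /* nothing pending, safe to call anyway */
    strong_lock_counts[pending_partition]--;
    pending_partition = -1;
}

/* Simulated acquire; wait_fails models a cancelled or timed-out lock wait. */
static bool acquire_lock(int partition, bool wait_fails)
{
    begin_strong_lock_acquire(partition);
    if (wait_fails)
    {
        abort_strong_lock_acquire();
        return false;
    }
    finish_strong_lock_acquire();
    return true;
}
```

In this toy, a failed wait leaves the counter where it started, and
calling the abort routine after a completed acquisition is a no-op,
which is the property a timeout path such as lock_timeout would rely on.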

Best regards,
Zoltán Böszörményi

--
----------------------------------
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de     http://www.postgresql.at/
