Re: pgsql: Avoid improbable PANIC during heap_update.

Поиск
Список
Период
Сортировка
От Jeff Davis
Тема Re: pgsql: Avoid improbable PANIC during heap_update.
Дата
Msg-id baa25a6fc7c725d28ec5d0b7787230180b45a275.camel@j-davis.com
обсуждение исходный текст
Ответ на Re: pgsql: Avoid improbable PANIC during heap_update.  (Peter Geoghegan <pg@bowt.ie>)
Ответы Re: pgsql: Avoid improbable PANIC during heap_update.  (Peter Geoghegan <pg@bowt.ie>)
Список pgsql-committers
On Fri, 2022-09-30 at 21:23 -0700, Peter Geoghegan wrote:
> 2490     page = BufferGetPage(buffer);
> 2491
> 2492     /*
> 2493      * Before locking the buffer, pin the visibility map page if
> it appears to
> 2494      * be necessary.  Since we haven't got the lock yet, someone
> else might be
> 2495      * in the middle of changing this, so we'll need to recheck
> after we have
> 2496      * the lock.
> 2497      */
> 2498     if (PageIsAllVisible(page))
> 2499         visibilitymap_pin(relation, block, &vmbuffer);
>
> So we're calling visibilitymap_pin() having just acquired a buffer
> pin
> on a heap page buffer for the first time, and without acquiring a
> buffer lock on the same heap page (we don't hold one now, and we've
> never held one at some earlier point).
>
> Without Jeff's bugfix, nothing stops heap_delete() from getting a pin
> on a heap page that happens to have already been cleanup locked by
> another session running VACUUM. The same session could then
> (correctly) observe that the page does not have PD_ALL_VISIBLE set --
> not yet. 

With you so far; I had considered this code path and was still unable
to repro.

> VACUUM might then set PD_ALL_VISIBLE, without having to
> acquire any kind of interlock that heap_delete() will reliably
> notice.
> After all, VACUUM had a cleanup lock before the other session's call
> to heap_delete() even began -- so the responsibility has to lie with
> heap_delete().

Directly after the code you reference above, there is (in 5f9dda4c06,
right before my patch):

 2501     LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 2502
 2503     /*
 2504      * If we didn't pin the visibility map page and the page has
become all
 2505      * visible while we were busy locking the buffer, we'll have
to unlock and
 2506      * re-lock, to avoid holding the buffer lock across an I/O.
That's a bit
 2507      * unfortunate, but hopefully shouldn't happen often.
 2508      */
 2509     if (vmbuffer == InvalidBuffer && PageIsAllVisible(page))
 2510     {
 2511         LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
 2512         visibilitymap_pin(relation, block, &vmbuffer);
 2513         LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 2514     }

Doesn't that deal with the case you brought up directly? heap_delete()
can't get the exclusive lock until VACUUM releases its cleanup lock, at
which point all-visible will be set. Then heap_delete() should notice
in line 2509 and then pin the VM buffer. Right?

And I don't think a similar issue exists when the locks are briefly
released on lines 2563 or 2606. The pin is held until after the VM bit
is cleared (aside from an error path and an early return):

 2489     buffer = ReadBuffer(relation, block);
 ...
 2717     if (PageIsAllVisible(page))
 2718     {
 2719         all_visible_cleared = true;
 2720         PageClearAllVisible(page);
 ...
 2835     ReleaseBuffer(buffer);

If VACCUM hasn't acquired the cleanup lock before 2489, it can't get
one until heap_delete() is done looking at the all-visible bit anyway.
So I don't see how my patch would have fixed it even if that was the
problem.

Of course, I must be wrong somewhere, because the bug seems to exist.

--
Jeff Davis
PostgreSQL Contributor Team - AWS





В списке pgsql-committers по дате отправления:

Предыдущее
От: Peter Eisentraut
Дата:
Сообщение: pgsql: Fix tiny memory leaks
Следующее
От: Peter Geoghegan
Дата:
Сообщение: Re: pgsql: Avoid improbable PANIC during heap_update.