Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica
Дата
Msg-id YmDVWJt4WjRKFvAV@paquier.xyz
обсуждение исходный текст
Ответ на Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica  (Andrey Borodin <x4mmm@yandex-team.ru>)
Ответы Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica  (Andres Freund <andres@anarazel.de>)
Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica  (Andrey Borodin <x4mmm@yandex-team.ru>)
Список pgsql-bugs
On Wed, Apr 20, 2022 at 11:23:43PM +0500, Andrey Borodin wrote:
> Cool! Can you please share it?

Sure.  I am finishing for this part with the attached, that would be
backpatchable.  Based on the analysis of upthread, I have stuck with
logging standby snapshots once we are done in WaitForOlderSnapshots()
so as we make sure that a standby does not attempt to look at the data
of any running transactions it should wait for and the log of AELs at
the end of WaitForLockersMultiple() to avoid the access of the lockers
too early.  I have to admit that a direct use of
LogAccessExclusiveLock() feels a bit funky but we cannot use
LockAcquire() AFAIK, and that should be fine with all the lock tags at
hand.

I have stressed quite a lot a primary/standby setup with flavors of
the tests from Ben upthread, and I am not seeing the cache problems
anymore on the standby, with concurrent CREATE/DROP INDEX or REINDEX
sequences.

> I did not succeed with my tests on this way. But probably I was
> doing something wrong. Or, perhaps, these tests were encountering
> second problem.

There is a second thing going on with amcheck here.  REINDEX
CONCURRENTLY in isolation is working correctly when you run
bt_index_check() in parallel of it (aka attach gdb to the backend
process running REINDEX CONCURRENTLY and run bt_index_check()
repeatedly in parallel at the various steps with a \watch or a new
connection each time), but depending on your transaction order you may
finish with an invalid index given as input of
bt_index_check_internal(), roughly:
- REINDEX CONCURRENTLY begins, goes up to the transaction doing the
swap where the old index is marked as indisvalid.
- Execute bt_index_check(), that refers to the old index OID, still
valid at the moment where amcheck begins checking this index, before
the Relation of the parent table is opened.
- The transaction doing the swap commits, making visible that
indisvalid is false for the old index.
- bt_index_check() continues, opens the parent Relation, and complains
about an invalid index.

amcheck could be made more robust here, by calling
RelationGetIndexList() after opening the Relation of the parent to
check if it is still listed in the returned list as it would look at
the relcache and discard any invalid indexes on the way.  Returning an
error for an invalid index seems less helpful to me here than doing
nothing, particularly if you begin doing a full check of the indexes
based on the contents of pg_index.
--
Michael

Вложения

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: Query generates infinite loop
Следующее
От: Andres Freund
Дата:
Сообщение: Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica