Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica
От | Michael Paquier |
---|---|
Тема | Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica |
Дата | |
Msg-id | YmDVWJt4WjRKFvAV@paquier.xyz обсуждение исходный текст |
Ответ на | Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica (Andrey Borodin <x4mmm@yandex-team.ru>) |
Ответы |
Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica
Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica |
Список | pgsql-bugs |
On Wed, Apr 20, 2022 at 11:23:43PM +0500, Andrey Borodin wrote: > Cool! Can you please share it? Sure. I am finishing for this part with the attached, that would be backpatchable. Based on the analysis of upthread, I have stuck with logging standby snapshots once we are done in WaitForOlderSnapshots() so as we make sure that a standby does not attempt to look at the data of any running transactions it should wait for and the log of AELs at the end of WaitForLockersMultiple() to avoid the access of the lockers too early. I have to admit that a direct use of LogAccessExclusiveLock() feels a bit funky but we cannot use LockAcquire() AFAIK, and that should be fine with all the lock tags at hand. I have stressed quite a lot a primary/standby setup with flavors of the tests from Ben upthread, and I am not seeing the cache problems anymore on the standby, with concurrent CREATE/DROP INDEX or REINDEX sequences. > I did not succeed with my tests on this way. But probably I was > doing something wrong. Or, perhaps, these tests were encountering > second problem. There is a second thing going on with amcheck here. REINDEX CONCURRENTLY in isolation is working correctly when you run bt_index_check() in parallel of it (aka attach gdb to the backend process running REINDEX CONCURRENTLY and run bt_index_check() repeatedly in parallel at the various steps with a \watch or a new connection each time), but depending on your transaction order you may finish with an invalid index given as input of bt_index_check_internal(), roughly: - REINDEX CONCURRENTLY begins, goes up to the transaction doing the swap where the old index is marked as indisvalid. - Execute bt_index_check(), that refers to the old index OID, still valid at the moment where amcheck begins checking this index, before the Relation of the parent table is opened. - The transaction doing the swap commits, making visible that indisvalid is false for the old index. - bt_index_check() continues, opens the parent Relation, and complains about an invalid index. amcheck could be made more robust here, by calling RelationGetIndexList() after opening the Relation of the parent to check if it is still listed in the returned list as it would look at the relcache and discard any invalid indexes on the way. Returning an error for an invalid index seems less helpful to me here than doing nothing, particularly if you begin doing a full check of the indexes based on the contents of pg_index. -- Michael
Вложения
В списке pgsql-bugs по дате отправления: