Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica
From | Michael Paquier |
---|---|
Subject | Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica |
Date | |
Msg-id | YmYn46HoX4DeWtds@paquier.xyz |
In response to | Re: BUG #17401: REINDEX TABLE CONCURRENTLY creates a race condition on a streaming replica (Andres Freund <andres@anarazel.de>) |
List | pgsql-bugs |
On Thu, Apr 21, 2022 at 09:17:10AM -0700, Andres Freund wrote:
> On 2022-04-21 12:54:00 +0900, Michael Paquier wrote:
>> Based on the analysis of upthread, I have stuck with logging standby
>> snapshots once we are done in WaitForOlderSnapshots() so as we make sure
>> that a standby does not attempt to look at the data of any running
>> transactions it should wait for and the log of AELs at the end of
>> WaitForLockersMultiple() to avoid the access of the lockers too early.
>
> Why is this necessary? The comments in WaitForOlderSnapshots() don't really
> explain it. This is pretty darn expensive.

I think that my brain here thought about the standby attempting to
access tuples that could have been deleted before the reference
snapshot was taken for the index validation in phase 3, and that
logging the current snapshot would help in detecting conflicts for
that. But that won't help much unless the conflicts themselves are
logged in some way. Is that right? It looks like this comes down to
logging more information when we wait for some of the snapshots in
VirtualXactLock(). That would be new.

>> amcheck could be made more robust here, by calling
>> RelationGetIndexList() after opening the Relation of the parent to
>> check if it is still listed in the returned list as it would look at
>> the relcache and discard any invalid indexes on the way.
>
> That seems like a weird approach. Why can't the check just be done on the
> relcache entry of the index itself? If that doesn't work, something is still
> broken in cache invalidation.

As far as I have looked (that is, checking step by step the concurrent
REINDEX and DROP INDEX flows), the cache looks fine with the
invalidations when the index's indisvalid is switched on/off (I am
taking into account the end of index_concurrently_build() that should
not need one).

FWIW, I would be tempted to do something like the attached for
amcheck, where we skip invalid indexes. Thoughts welcome.
--
Michael