Re: pgsql: Avoid race in RelationBuildDesc() affecting CREATE INDEX CONCURR
От | Tomas Vondra |
---|---|
Тема | Re: pgsql: Avoid race in RelationBuildDesc() affecting CREATE INDEX CONCURR |
Дата | |
Msg-id | df7b4c0b-7d92-f03f-75c4-9e08b269a716@enterprisedb.com обсуждение исходный текст |
Ответ на | pgsql: Avoid race in RelationBuildDesc() affecting CREATE INDEX CONCURR (Noah Misch <noah@leadboat.com>) |
Ответы |
Re: pgsql: Avoid race in RelationBuildDesc() affecting CREATE INDEX CONCURR
|
Список | pgsql-committers |
On 10/24/21 03:40, Noah Misch wrote: > Avoid race in RelationBuildDesc() affecting CREATE INDEX CONCURRENTLY. > > CIC and REINDEX CONCURRENTLY assume backends see their catalog changes > no later than each backend's next transaction start. That failed to > hold when a backend absorbed a relevant invalidation in the middle of > running RelationBuildDesc() on the CIC index. Queries that use the > resulting index can silently fail to find rows. Fix this for future > index builds by making RelationBuildDesc() loop until it finishes > without accepting a relevant invalidation. It may be necessary to > reindex to recover from past occurrences; REINDEX CONCURRENTLY suffices. > Back-patch to 9.6 (all supported versions). > > Noah Misch and Andrey Borodin, reviewed (in earlier versions) by Andres > Freund. > > Discussion: https://postgr.es/m/20210730022548.GA1940096@gust.leadboat.com > Unfortunately, this seems to have broken CLOBBER_CACHE_ALWAYS builds. Since this commit, initdb never completes due to infinite retrying over and over (on the first RelationBuildDesc call). We have a CLOBBER_CACHE_ALWAYS buildfarm machine "avocet", and that currently looks like this (top): PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2626 buildfa+ 20 0 202888 21416 20084 R 98.34 0.531 151507:16 /home/buildfarm/avocet/buildroot/REL9_6_STABLE/pgsql.build/tmp_install/home/buildfarm/avocet/buildroot/REL9_6_STABLE/inst/bin/postgres --boot -x1 -F Yep, that's 151507 minutes, i.e. 104 days in initdb :-/ I haven't looked at this very closely yet, but it seems the whole problem is we do this at the very beginning: in_progress_list[in_progress_offset].invalidated = false; /* * find the tuple in pg_class corresponding to the given relation id */ pg_class_tuple = ScanPgRelation(targetRelId, true, false); which seems entirely self-defeating, because ScanPgRelation acquires a lock (on pg_class), which accepts invalidations, which invalidates system caches (in clobber_cache_always), which sets promptly sets in_progress_list[in_progress_offset].invalidated = false; guaranteeing an infinite loop. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
В списке pgsql-committers по дате отправления: