Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements

Поиск
Список
Период
Сортировка
От Michail Nikolaev
Тема Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements
Дата
Msg-id CANtu0oizNtPUrPB0Mh+2vyjdijTX=LZvO5_dZN3+NqvE-CFPtw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Ответы Re: Revisiting {CREATE INDEX, REINDEX} CONCURRENTLY improvements  (Matthias van de Meent <boekewurm+postgres@gmail.com>)
Список pgsql-hackers
Hello!

> I've thought about alternative solutions, too: how about getting a new snapshot every so often?
> We don't really care about the liveness of the already-scanned data; the snapshots used for RIC
> are used only during the scan. C/RIC's relation's lock level means vacuum can't run to clean up
> dead line items, so as long as we only swap the backend's reported snapshot (thus xmin) while
> the scan is between pages we should be able to reduce the time C/RIC is the one backend
> holding back cleanup of old tuples.

Hm, it looks like an interesting idea! It may be more dangerous, but
at least it feels much more elegant than an LP_DEAD-related way.
Also, feels like we may apply this to both phases (first and the second scans).
The original patch (1) was helping only to the second one (after call
to set_indexsafe_procflags).

But for the first scan we allowed to do so only for non-unique indexes
because of:

> * The reason for doing that is to avoid
> * bogus unique-index failures due to concurrent UPDATEs (we might see
> * different versions of the same row as being valid when we pass over them,
> * if we used HeapTupleSatisfiesVacuum).  This leaves us with an index that
> * does not contain any tuples added to the table while we built the index.

Also, (1) was limited to indexes without expressions and predicates
(2) because such may execute queries to other tables (sic!).
One possible solution is to add some checks to make sure no
user-defined functions are used.
But as far as I understand, it affects only CIC for now and does not
affect the ability to use the proposed technique (updating snapshot
time to time).

However, I think we need some more-less formal proof it is safe - it
is really challenging to keep all the possible cases in the head. I’ll
try to do something here.
Another possible issue may be caused by the new locking pattern - we
will be required to wait for all transaction started before the ending
of the phase to exit.

[1]: https://postgr.es/m/20210115133858.GA18931@alvherre.pgsql
[2]:
https://www.postgresql.org/message-id/flat/CAAaqYe_tq_Mtd9tdeGDsgQh%2BwMvouithAmcOXvCbLaH2PPGHvA%40mail.gmail.com#cbe3997b75c189c3713f243e25121c20



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "David E. Wheeler"
Дата:
Сообщение: GIN-Indexable JSON Patterns
Следующее
От: Tom Lane
Дата:
Сообщение: Re: JSON Path and GIN Questions