Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()
From | Peter Geoghegan |
---|---|
Subject | Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune() |
Date | |
Msg-id | CAH2-Wzm0DXvLxzCqdiuN7=BwrXWRcm_KTU2VK2aNuo0PqCLNaA@mail.gmail.com |
In reply to | Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune() (Matthias van de Meent <boekewurm+postgres@gmail.com>) |
Responses | Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune() |
List | pgsql-bugs |
On Wed, Nov 3, 2021 at 8:46 AM Matthias van de Meent <boekewurm+postgres@gmail.com> wrote:

> I seem to repeatedly get backends of which the xmin is set from
> InvalidTransactionId to some value < min(ProcGlobal->xids), which then
> result in shared_oldest_nonremovable (and others) being less than the
> value of their previous iteration. This leads to the infinite loop in
> lazy_scan_prune (it stores and uses one value of
> *_oldest_nonremovable, whereas heap_page_prune uses a more up-to-date
> variant).

> I noticed that when this happens, generally a parallel vacuum worker
> is involved.

Hmm. That is plausible. The way that VACUUM (and concurrent index builds) avoid being seen via the PROC_IN_VACUUM thing is pretty delicate. Wouldn't surprise me if the parallel VACUUM issue subtly broke lazy_scan_prune in the way that we see here.

What about testing? Can we find a simple way of reducing this complicated repro to a less complicated repro with a failing assertion? Maybe an assertion that we get to keep after the bug is fixed?

--
Peter Geoghegan
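To make the failure mode discussed above concrete, here is a toy model in C of the two ideas in this exchange: a horizon that moves backwards between computations, and a retry loop in which the caller judges a tuple against a cached horizon while pruning uses a fresher one. This is only a sketch and not PostgreSQL code. `ToyXid`, `HorizonTracker`, `horizon_advance`, `tuple_dead` and `scan_page` are hypothetical names invented for illustration, transaction ID wraparound is ignored, and the retry cap exists only so that the model terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a transaction ID; wraparound is ignored. */
typedef uint32_t ToyXid;

/* Hypothetical tracker that remembers the last horizon that was computed. */
typedef struct
{
    ToyXid last;
    bool   valid;
} HorizonTracker;

/*
 * Record a newly computed horizon. Returns false (and keeps the old value)
 * if the horizon moved backwards, which is the condition an assertion of
 * the kind asked about in the message could check.
 */
static bool
horizon_advance(HorizonTracker *t, ToyXid computed)
{
    if (t->valid && computed < t->last)
        return false;
    t->last = computed;
    t->valid = true;
    return true;
}

/* Toy rule: a deleted tuple is removable once its deleter precedes the horizon. */
static bool
tuple_dead(ToyXid xmax, ToyXid horizon)
{
    return xmax < horizon;
}

/*
 * Toy retry loop. Pruning decides with the fresh horizon, the caller
 * re-checks with its cached horizon and retries when it sees a dead tuple
 * that pruning left behind. Returns the number of retries, or -1 if the
 * cap was hit (the real loop would spin forever).
 */
static int
scan_page(ToyXid xmax, ToyXid cached, ToyXid fresh, int max_retries)
{
    for (int tries = 0; tries < max_retries; tries++)
    {
        if (tuple_dead(xmax, fresh))
            return tries;       /* pruning removed the tuple */
        if (!tuple_dead(xmax, cached))
            return tries;       /* both sides agree the tuple must stay */
        /* caller considers the tuple dead but pruning kept it: retry */
    }
    return -1;
}
```

In this model, `scan_page(95, 100, 90, 1000)` hits the cap because the fresh horizon (90) is older than the cached one (100), whereas any fresh horizon at or beyond the cached one ends the loop on the first pass. A check in the style of `horizon_advance` would flag the backwards step before the loop is ever entered.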