Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune()
From | Peter Geoghegan |
---|---|
Subject | Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune() |
Date | |
Msg-id | CAH2-Wz=zLcnZO8MqPXQLqOLY=CAwQhdvs5Ncg6qMb5nMAam0EA@mail.gmail.com |
In response to | Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune() (Noah Misch <noah@leadboat.com>) |
Responses | Re: BUG #17257: (auto)vacuum hangs within lazy_scan_prune() |
List | pgsql-bugs |
On Wed, Jan 10, 2024 at 2:38 PM Noah Misch <noah@leadboat.com> wrote:
> I don't know. That particular system experienced the infinite loop only once.

While I certainly can't recreate the problem on demand, it has been
seen on this same application far more than once.

> > I'm referring to calls such as the
> > "GetOldestNonRemovableTransactionId(NULL)" and
> > "GlobalVisCheckRemovableFullXid()" calls that take place inside
> > _bt_pendingfsm_finalize(). It's not like we do stuff like that in very
> > many other places.
>
> I see what you mean about the rarity and potential importance of
> "GetOldestNonRemovableTransactionId(NULL)". There's just one other caller,
> vac_update_datfrozenxid(), which calls it for an unrelated cause.

I just noticed another detail that adds significant weight to this
theory: it looks like the problem is hit on the first tuple located on
the first heap page that VACUUM scans *after* it completes its first
round of index vacuuming (I'm inferring this from vacrel state,
particularly its lpdead_items instrumentation counter).

The dead_items array is as large as possible here (just under 1 GiB),
and lpdead_items is 178956692 (which uses up all of our dead_items
space). VACUUM scans tens of gigabytes of heap pages before it begins
this initial round of index vacuuming (according to
vacrel->scanned_pages).

What are the chances that all of this is just a coincidence? Low, I'd say.

--
Peter Geoghegan
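
As a rough sanity check on the dead_items figures quoted in the message, the arithmetic can be sketched as follows. This is only an illustration, not code from PostgreSQL: it assumes each dead_items entry is a 6-byte heap TID (ItemPointerData), as in the releases that keep dead items in a flat array, and the constant names are made up for the example.

```python
# Back-of-the-envelope check of the dead_items figures quoted in the message.
# Assumption: one dead_items entry == one heap TID (ItemPointerData) == 6 bytes.
TID_BYTES = 6
GIB = 1024 ** 3

lpdead_items = 178_956_692  # value reported in the message

# Space taken by the TIDs themselves, ignoring any small array header.
used_bytes = lpdead_items * TID_BYTES

# How far below 1 GiB that payload sits, in bytes and in TID-sized slots.
headroom_bytes = GIB - used_bytes
headroom_tids = headroom_bytes // TID_BYTES

print(f"dead_items payload: {used_bytes} bytes")
print(f"headroom below 1 GiB: {headroom_bytes} bytes (~{headroom_tids} TIDs)")
```

Under that assumption the payload sits only a few hundred TID slots short of 1 GiB, which is consistent with the statement that lpdead_items had used up essentially all of the dead_items space when the first round of index vacuuming began.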