Re: GiST VACUUM
От | Andrey Borodin |
---|---|
Тема | Re: GiST VACUUM |
Дата | |
Msg-id | 5F278E38-FD10-45EB-88D0-5B1D7108F984@yandex-team.ru обсуждение исходный текст |
Ответ на | Re: GiST VACUUM (Heikki Linnakangas <hlinnaka@iki.fi>) |
Ответы |
Re: GiST VACUUM
|
Список | pgsql-hackers |
> 13 июля 2018 г., в 18:10, Heikki Linnakangas <hlinnaka@iki.fi> написал(а): > >>>> But the situation in gistdoinsert(), where you encounter a deleted leaf page, could happen during normal operation,if vacuum runs concurrently with an insert. Insertion locks only one page at a time, as it descends the tree, soafter it has released the lock on the parent, but before it has locked the child, vacuum might have deleted the page. Inthe latest patch, you're checking for that just before swapping the shared lock for an exclusive one, but I think that'swrong; you need to check for that after swapping the lock, because otherwise vacuum might delete the page while you'renot holding the lock. >>> Looks like a valid concern, I'll move that code again. >> Done. > > Ok, the comment now says: > >> + /* >> + * Leaf pages can be left deleted but still referenced in case of >> + * crash during VACUUM's gistbulkdelete() >> + */ > > But that's not accurate, right? You should never see deleted pages after a crash, because the parent is updated in thesame WAL record as the child page, right? Fixed the comment. > > I'm still a bit scared about using pd_prune_xid to store the XID that prevents recycling the page too early. Can we usesome field in GISTPageOpaqueData for that, similar to how the B-tree stores it in BTPageOpaqueData? There is no room in opaque data, but, technically all page is just a tombstone until reused, so we can pick arbitrary place.PFA v7 where xid after delete is stored in opaque data, but we can use other places, like line pointer array or opaque-1 > 13 июля 2018 г., в 18:25, Heikki Linnakangas <hlinnaka@iki.fi> написал(а): > > Looking at the second patch, to scan the GiST index in physical order, that seems totally unsafe, if there are any concurrentpage splits. In the logical scan, pushStackIfSplited() deals with that, by comparing the page's NSN with the parent'sLSN. But I don't see anything like that in the physical scan code. Leaf page can be pointed by internal page and rightlink simultaneously. The purpose of NSN is to visit this page exactlyonce by following only on of two links in a scan. This is achieved naturally if we read everything from the beginningto the end. (That is how I understand, I can be wrong) > I think we can do something similar in the physical scan: remember the current LSN position at the beginning of the vacuum,and compare with that. The B-tree code uses the "cycle ID" for similar purposes. > > Do we still need the separate gistvacuumcleanup() pass, if we scan the index in physical order in the bulkdelete pass already? We do not need to gather stats there, but we are doing RecordFreeIndexPage() and IndexFreeSpaceMapVacuum(). Is it correctto move this to first scan? Best regards, Andrey Borodin.
Вложения
В списке pgsql-hackers по дате отправления: