Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin
From | Melanie Plageman |
---|---|
Subject | Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin |
Date | |
Msg-id | CAAKRu_Yh1aawR0RuRnzczGJfkUXE1oVV-+qsjL66srziV4vc-w@mail.gmail.com |
In response to | Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin (Heikki Linnakangas <hlinnaka@iki.fi>) |
Responses | Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin |
List | pgsql-hackers |
On Mon, Jun 24, 2024 at 4:27 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>
> On 21/06/2024 03:02, Peter Geoghegan wrote:
> > On Thu, Jun 20, 2024 at 7:42 PM Melanie Plageman
> > <melanieplageman@gmail.com> wrote:
> >
> >> The repro forces a round of index vacuuming after the standby
> >> reconnects and before pruning a dead tuple whose xmax is older than
> >> OldestXmin.
> >>
> >> At the end of the round of index vacuuming, _bt_pendingfsm_finalize()
> >> calls GetOldestNonRemovableTransactionId(), thereby updating the
> >> backend's GlobalVisState and moving maybe_needed backwards.
> >
> > Right. I saw details exactly consistent with this when I used GDB
> > against a production instance.
> >
> > I'm glad that you were able to come up with a repro that involves
> > exactly the same basic elements, including index page deletion.
>
> Would it be possible to make it robust so that we could always run it
> with "make check"? This seems like an important corner case to
> regression test.

I'd have to look into how to ensure I can stabilize some of the parts
that seem prone to flaking. I can probably stabilize the vacuum bit
with a query of pg_stat_activity making sure it is waiting to acquire
the cleanup lock.

I don't, however, see a good way around the large amount of data
required to trigger more than one round of index vacuuming. I could
generate the data more efficiently than I am doing here
(generate_series() in the from clause). Perhaps with a copy? I know it
is too slow now to go in an ongoing test, but I don't have an
intuition around how fast it would have to be to be acceptable. Is
there a set of additional tests that are slower that we don't always
run? I didn't follow how the wraparound test ended up, but that seems
like one that would have been slow.

- Melanie