Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin

From:        Tomas Vondra
Subject:     Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin
Date:
Msg-id:      881304a6-6b8b-42a7-ad7c-261c3eace4a4@enterprisedb.com
In reply to: Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin  (Melanie Plageman <melanieplageman@gmail.com>)
List:        pgsql-hackers
On 6/24/24 16:53, Melanie Plageman wrote:
> On Mon, Jun 24, 2024 at 4:27 AM Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>
>> On 21/06/2024 03:02, Peter Geoghegan wrote:
>>> On Thu, Jun 20, 2024 at 7:42 PM Melanie Plageman
>>> <melanieplageman@gmail.com> wrote:
>>>
>>>> The repro forces a round of index vacuuming after the standby
>>>> reconnects and before pruning a dead tuple whose xmax is older than
>>>> OldestXmin.
>>>>
>>>> At the end of the round of index vacuuming, _bt_pendingfsm_finalize()
>>>> calls GetOldestNonRemovableTransactionId(), thereby updating the
>>>> backend's GlobalVisState and moving maybe_needed backwards.
>>>
>>> Right. I saw details exactly consistent with this when I used GDB
>>> against a production instance.
>>>
>>> I'm glad that you were able to come up with a repro that involves
>>> exactly the same basic elements, including index page deletion.
>>
>> Would it be possible to make it robust so that we could always run it
>> with "make check"? This seems like an important corner case to
>> regression test.
>
> I'd have to look into how to ensure I can stabilize some of the parts
> that seem prone to flaking. I can probably stabilize the vacuum bit
> with a query of pg_stat_activity making sure it is waiting to acquire
> the cleanup lock.
>
> I don't, however, see a good way around the large amount of data
> required to trigger more than one round of index vacuuming. I could
> generate the data more efficiently than I am doing here
> (generate_series() in the FROM clause). Perhaps with a COPY? I know it
> is too slow now to go in an ongoing test, but I don't have an
> intuition around how fast it would have to be to be acceptable. Is
> there a set of additional tests that are slower that we don't always
> run? I didn't follow how the wraparound test ended up, but that seems
> like one that would have been slow.

I think it depends on the impact on the 'make check' duration.
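The pg_stat_activity stabilization Melanie mentions could look roughly like the sketch below (a hypothetical test helper, not from the patch): a backend blocked on a buffer cleanup lock reports wait_event_type 'BufferPin' in pg_stat_activity, so a test can poll for that before proceeding.

```sql
-- Hypothetical polling query for the test: returns true once the VACUUM
-- backend is stuck waiting for the buffer cleanup lock (reported as a
-- 'BufferPin' wait in pg_stat_activity).
SELECT count(*) > 0 AS vacuum_is_waiting
  FROM pg_stat_activity
 WHERE query LIKE 'VACUUM%'
   AND wait_event_type = 'BufferPin';
```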
If it could be added to one of the existing test groups, then it depends
on how long the slowest test in that group is. If the new test needs to
be in a separate group, it probably needs to be very fast.

But I was wondering how much time we are actually talking about, so I
tried:

  creating a table, filling it with 300k rows  => 250ms
  creating an index                            => 180ms
  deleting 90% of the data                     => 200ms
  vacuuming t                                  => 130ms

which with m_w_m=1MB does two rounds of index cleanup. That's ~760ms.
I'm not sure how much more the test needs to do, but this would be
pretty reasonable, if we could add it to an existing group.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
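[For reference, the timed steps above correspond roughly to this sketch; the table name and delete predicate are invented here, and the timings are Tomas's, not re-measured. With maintenance_work_mem = 1MB the dead-TID storage overflows once on ~270k dead tuples, which is what forces the second index-vacuum round.]

```sql
-- Rough reconstruction of the four timed steps; names are illustrative.
SET maintenance_work_mem = '1MB';

CREATE TABLE t (a int);
INSERT INTO t SELECT i FROM generate_series(1, 300000) AS i;  -- ~250ms
CREATE INDEX ON t (a);                                        -- ~180ms
DELETE FROM t WHERE a % 10 <> 0;                              -- ~90% of rows, ~200ms
VACUUM t;                                                     -- ~130ms, two index passes
```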