Re: [GENERAL] Autovacuum Improvements
От | Steve Atkins |
---|---|
Тема | Re: [GENERAL] Autovacuum Improvements |
Дата | |
Msg-id | 46EC7F08-54DE-49AD-A546-CB2824E8092D@blighty.com обсуждение исходный текст |
Ответ на | Re: [GENERAL] Autovacuum Improvements (Richard Huxton <dev@archonet.com>) |
Список | pgsql-hackers |
On Jan 22, 2007, at 11:16 AM, Richard Huxton wrote: > Bruce Momjian wrote: >> Yep, agreed on the random I/O issue. The larger question is if >> you have >> a huge table, do you care to reclaim 3% of the table size, rather >> than >> just vacuum it when it gets to 10% dirty? I realize the vacuum is >> going >> to take a lot of time, but vacuuming to relaim 3% three times >> seems like >> it is going to be more expensive than just vacuuming the 10% >> once. And >> vacuuming to reclaim 1% ten times seems even more expensive. The >> partial vacuum idea is starting to look like a loser to me again. > > Buying a house with a 25-year mortgage is much more expensive than > just paying cash too, but you don't always have a choice. > > Surely the key benefit of the partial vacuuming thing is that you > can at least do something useful with a large table if a full > vacuum takes 24 hours and you only have 4 hours of idle I/O. > > It's also occurred to me that all the discussion of scheduling way > back when isn't directly addressing the issue. What most people > want (I'm guessing) is to vacuum *when the user-workload allows* > and the time-tabling is just a sysadmin first-approximation at that. Yup. I'd really like for my app to be able to say "Hmm. No interactive users at the moment, no critical background tasks. Now would be a really good time for the DB to do some maintenance." but also to be able to interrupt the maintenance process if some new users or other system load show up. > With partial vacuuming possible, we can arrange things with just > three thresholds and two measurements: > Measurement 1 = system workload > Measurement 2 = a per-table "requires vacuuming" value > Threshold 1 = workload at which we do more vacuuming > Threshold 2 = workload at which we do less vacuuming > Threshold 3 = point at which a table is considered worth vacuuming. > Once every 10 seconds, the manager compares the current workload to > the thresholds and starts a new vacuum, kills one or does nothing. > New vacuum processes keep getting started as long as there is > workload spare and tables that need vacuuming. > > Now the trick of course is how you measure system workload in a > meaningful manner. I'd settle for a "start maintenance", "stop maintenance" API. Anything else (for instance the heuristics you suggest above) would definitely be gravy. It's not going to be simple to do, though, I don't think. Cheers, Steve
В списке pgsql-hackers по дате отправления: