Re: another autovacuum scheduling thread
| From | David Rowley |
|---|---|
| Subject | Re: another autovacuum scheduling thread |
| Date | |
| Msg-id | CAApHDvqe7ee=vobWe4GVAt2gm_H6eiGNZeo_dEMptvHYAkibBA@mail.gmail.com |
| In reply to | Re: another autovacuum scheduling thread (Nathan Bossart <nathandbossart@gmail.com>) |
| Responses | Re: another autovacuum scheduling thread |
| List | pgsql-hackers |
On Sat, 1 Nov 2025 at 09:12, Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Thu, Oct 30, 2025 at 07:38:15PM -0500, Sami Imseih wrote:
> > The results above show what I expected: the batch tables receive higher
> > priority, as seen from the averages of autovacuum and autoanalyze runs.
> > This behavior is expected, but it may catch some users by surprise after
> > an upgrade, since certain tables will now receive more attention than
> > others. Longer tests might also show more bloat accumulating on heavily
> > updated tables. In such cases, a user may need to adjust autovacuum
> > settings on a per-table basis to restore the previous behavior.
>
> Interesting.  From these results, it almost sounds as if we're further
> amplifying the intended effect of commit 06eae9e.  That could be a good
> thing.  Something else I'm curious about is datfrozenxid, i.e., whether
> prioritization keeps the database (M)XID ages lower.

I wonder if it would be more realistic to throttle the work simulation to a
certain speed with pgbench -R rather than having it go flat out. The results
show quite a bit higher "rows_inserted" for the batch_tables with the patched
version. Sami didn't mention any changes to vacuum_cost_limit, so I suspect
that autovacuum would be getting quite far behind on this run, which isn't
ideal. Rate limiting to something that the given vacuum_cost_limit could keep
up with seems more realistic.

The fact that the patched version did more insert work in the batch tables
does seem a bit unfair, as that gave autovacuum more work to do in the patched
test run, which would result in the lower-scoring tables being neglected more
in the patched run.

This makes me wonder if we should log the table's score when autovacuum starts
on it. We do calculate the score again in recheck_relation_needs_vacanalyze()
just before doing the vacuum/analyze, so maybe the score could be stored in
the autovac_table struct and displayed somewhere. Including it in the
log_autovacuum_min_duration / log_autoanalyze_min_duration output might be
useful (rough sketch in the PS below). Having it there could help DBA analysis
by giving some visibility into how bad things got before autovacuum got around
to working on a given table.

If we logged the score, we could do the "unpatched" test with the patched
code, just by commenting out the list_sort(tables_to_process,
TableToProcessComparator); call. It'd then be interesting to zero the
log_auto*_min_duration settings and review the order differences and how high
the scores got.

Would the average score be higher or lower with the patched version? I'd guess
lower, since the higher-scoring tables would tend to get vacuumed later with
the unpatched version, and their score would be even higher by the time
autovacuum got to them. I think if the average score at the point the vacuum
starts has gone down, then that's a very good thing. To see that, maybe we'd
need to write a patch to recalculate the "tables_to_process" List after a
table is vacuumed and autovacuum_naptime has elapsed, else the priorities
might have become too outdated. I'd expect that to be even more true when
vacuum_cost_limit is configured too low.

David
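PS: roughly what I had in mind for stashing and logging the score. This is
only a sketch and assumes recheck_relation_needs_vacanalyze() can hand the
score back; "at_score" and the message wording are placeholders of mine, not
anything from the actual patch:

    /* sketch only: "at_score" is a made-up field name */
    typedef struct autovac_table
    {
        Oid         at_relid;
        /* ... existing fields ... */
        double      at_score;       /* score as computed in
                                     * recheck_relation_needs_vacanalyze() */
        char       *at_relname;
        char       *at_nspname;
        char       *at_datname;
    } autovac_table;

    /*
     * Then in do_autovacuum(), once recheck_relation_needs_vacanalyze() has
     * filled in "tab", something along the lines of:
     */
    ereport(LOG,
            (errmsg("automatic vacuum of table \"%s.%s.%s\": priority score %.2f",
                    tab->at_datname, tab->at_nspname, tab->at_relname,
                    tab->at_score)));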