Matthew T. O'Connor wrote:
> Alvaro Herrera wrote:
> >Matthew T. O'Connor wrote:
> >>How can you determine what tables can be vacuumed within
> >>autovacuum_naptime?
> >
> >My assumption is that
> >pg_class.relpages * vacuum_cost_page_miss * vacuum_cost_delay = time to
> >vacuum
> >
> >This is of course not the reality, because the delay is not how long
> >it takes to fetch the pages. But it lets us have a value with which we
> >can do something. With the default values, vacuum_cost_delay=10,
> >vacuum_cost_page_miss=10, autovacuum_naptime=60s, we'll consider tables
> >of under 600 pages, 4800 kB (should we include indexes here in the
> >relpages count? My guess is no).
>
> I'm not sure how pg_class.relpages is maintained but what happens to a
> bloated table? For example, a 100 row table that is constantly updated
> and hasn't been vacuumed in a while (say the admin disabled autovacuum
> for a while), now that small 100 row table has 1000 pages in it most of
> which are just bloat, will we miss this table? Perhaps basing this on
> reltuples would be better?
Well, this would only happen the first time, until the plain worker
processed the table; next time it would be picked up by the hot table
worker. But yeah, we can build a better estimate using the same trick
the planner uses: estimate tuple density as reltuples/relpages times the
actual number of blocks on disk.
> >A table over 600 pages does not sound like a good candidate for hot, so
> >this seems more or less reasonable to me. On the other hand, maybe we
> >shouldn't tie this to the vacuum cost delay stuff.
>
> I'm not sure it's a good idea to tie this to the vacuum cost delay
> settings either, so let me as you this, how is this better than just
> allowing the admin to set a new GUC variable like
> autovacuum_hot_table_size_threshold (or something shorter) which we can
> assign a decent default of say 8MB.
Yeah, maybe that's better -- it's certainly simpler.
--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.