Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
From: Tom Lane
Subject: Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
Date:
Msg-id: 22926.1306343067@sss.pgh.pa.us
In reply to: Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
List: pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, May 25, 2011 at 12:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, I had been thinking about the latter point.  We could be
>> conservative and just keep the reported tuple density the same (ie,
>> update relpages to the new correct value, while setting reltuples so
>> that the density ratio doesn't change).  But that has its own problems
>> when the table contents *do* change.  What I'm currently imagining is
>> to do a smoothed moving average, where we factor in the new density
>> estimate with a weight dependent on the percentage of the table we did
>> scan.  That is, the calculation goes something like
>>
>> old_density = old_reltuples / old_relpages
>> new_density = counted_tuples / scanned_pages
>> reliability = scanned_pages / new_relpages
>> updated_density = old_density + (new_density - old_density) * reliability
>> new_reltuples = updated_density * new_relpages
>>
>> We could slow the moving-average convergence even further when
>> reliability is small; for instance if you were really paranoid you might
>> want to use the square of reliability in line 4.  That might be
>> overdoing it, though.

> I don't know.  That's maybe better, but I'd be willing to wager that
> in some cases it will just slow down the rate at which we converge to
> a completely incorrect value, while in other cases it'll fail to
> update the data when it really has changed.

[ shrug... ]  When you don't have complete information, it's *always*
the case that you will sometimes make a mistake.  That's not
justification for paralysis, especially not when the existing code is
demonstrably broken.

What occurs to me after thinking a bit more is that the existing tuple
density is likely to be only an estimate, too (one coming from the last
ANALYZE, which could very well have scanned even less of the table than
VACUUM did).
So what I now think is that both VACUUM and ANALYZE ought to use a
calculation of the above form to arrive at a new value for
pg_class.reltuples.  In both cases it would be pretty easy to track the
number of pages we looked at while counting tuples, so the same raw
information is available.

> I am wondering, though, why we're not just inserting a special-purpose
> hack for TOAST tables.

Because the problem is not specific to TOAST tables.  As things
currently stand, we will accept the word of an ANALYZE as gospel even if
it scanned 1% of the table, and completely ignore the results from a
VACUUM even if it scanned 99% of the table.  This is not sane.

			regards, tom lane
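[Editor's note: the moving-average update quoted above can be sketched directly from the five-line formula. This is a hypothetical Python illustration for clarity, not PostgreSQL's actual C implementation; the function name and signature are invented, but the variable names follow the formula exactly.]

```python
def updated_reltuples(old_relpages, old_reltuples,
                      scanned_pages, counted_tuples, new_relpages):
    """Smoothed moving-average estimate for pg_class.reltuples.

    The new density observation is blended with the old estimate,
    weighted by the fraction of the table actually scanned, so a
    scan of a few pages barely moves the estimate while a full
    scan replaces it outright.
    """
    old_density = old_reltuples / old_relpages
    new_density = counted_tuples / scanned_pages
    reliability = scanned_pages / new_relpages
    # A more paranoid variant would use reliability ** 2 here,
    # as suggested in the message above.
    updated_density = old_density + (new_density - old_density) * reliability
    return updated_density * new_relpages

# Example: old estimate says 10000 tuples in 100 pages (density 100).
# A VACUUM scans 50 of 100 pages and counts 2500 tuples (density 50).
# reliability = 0.5, so the blended density is 75 and the new
# reltuples estimate is 7500 -- halfway between old and observed.
print(updated_reltuples(100, 10000, 50, 2500, 100))   # 7500.0
# A full scan (reliability = 1.0) adopts the observed density outright.
print(updated_reltuples(100, 10000, 100, 5000, 100))  # 5000.0
```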