Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
From: Tom Lane
Subject: Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
Date:
Msg-id: 22926.1306343067@sss.pgh.pa.us
In reply to: Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: [ADMIN] pg_class reltuples/relpages not updated by autovacuum/vacuum
List: pgsql-hackers
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, May 25, 2011 at 12:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, I had been thinking about the latter point.  We could be
>> conservative and just keep the reported tuple density the same (ie,
>> update relpages to the new correct value, while setting reltuples so
>> that the density ratio doesn't change).  But that has its own problems
>> when the table contents *do* change.  What I'm currently imagining is
>> to do a smoothed moving average, where we factor in the new density
>> estimate with a weight dependent on the percentage of the table we did
>> scan.  That is, the calculation goes something like
>>
>> old_density = old_reltuples / old_relpages
>> new_density = counted_tuples / scanned_pages
>> reliability = scanned_pages / new_relpages
>> updated_density = old_density + (new_density - old_density) * reliability
>> new_reltuples = updated_density * new_relpages
>>
>> We could slow the moving-average convergence even further when
>> reliability is small; for instance if you were really paranoid you might
>> want to use the square of reliability in line 4.  That might be
>> overdoing it, though.

> I don't know.  That's maybe better, but I'd be willing to wager that
> in some cases it will just slow down the rate at which we converge to
> a completely incorrect value, while in other cases it'll fail to
> update the data when it really has changed.

[ shrug... ]  When you don't have complete information, it's *always*
the case that you will sometimes make a mistake.  That's not
justification for paralysis, especially not when the existing code is
demonstrably broken.

What occurs to me after thinking a bit more is that the existing tuple
density is likely to be only an estimate, too (one coming from the last
ANALYZE, which could very well have scanned even less of the table than
VACUUM did).
So what I now think is that both VACUUM and ANALYZE ought to use a
calculation of the above form to arrive at a new value for
pg_class.reltuples.  In both cases it would be pretty easy to track the
number of pages we looked at while counting tuples, so the same raw
information is available.

> I am wondering, though, why we're not just inserting a special-purpose
> hack for TOAST tables.

Because the problem is not specific to TOAST tables.  As things
currently stand, we will accept the word of an ANALYZE as gospel even if
it scanned 1% of the table, and completely ignore the results from a
VACUUM even if it scanned 99% of the table.  This is not sane.

			regards, tom lane
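[Editor's note: the moving-average update quoted above can be sketched directly from the five-line formula. This is a hypothetical Python illustration for clarity, not PostgreSQL's actual C implementation; the function name and signature are invented, but the variable names follow the formula exactly.]

```python
def updated_reltuples(old_relpages, old_reltuples,
                      scanned_pages, counted_tuples, new_relpages):
    """Smoothed moving-average estimate for pg_class.reltuples.

    The new density observation is blended with the old estimate,
    weighted by the fraction of the table actually scanned, so a
    scan of a few pages barely moves the estimate while a full
    scan replaces it outright.
    """
    old_density = old_reltuples / old_relpages
    new_density = counted_tuples / scanned_pages
    reliability = scanned_pages / new_relpages
    # A more paranoid variant would use reliability ** 2 here,
    # as suggested in the message above.
    updated_density = old_density + (new_density - old_density) * reliability
    return updated_density * new_relpages

# Example: old estimate says 10000 tuples in 100 pages (density 100).
# A VACUUM scans 50 of 100 pages and counts 2500 tuples (density 50).
# reliability = 0.5, so the blended density is 75 and the new
# reltuples estimate is 7500 -- halfway between old and observed.
print(updated_reltuples(100, 10000, 50, 2500, 100))   # 7500.0
# A full scan (reliability = 1.0) adopts the observed density outright.
print(updated_reltuples(100, 10000, 100, 5000, 100))  # 5000.0
```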