Re: Stopgap solution for table-size-estimate updating problem

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Stopgap solution for table-size-estimate updating problem
Дата
Msg-id 19266.1101667944@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Stopgap solution for table-size-estimate updating  (Simon Riggs <simon@2ndquadrant.com>)
Ответы Re: Stopgap solution for table-size-estimate updating  (Simon Riggs <simon@2ndquadrant.com>)
Список pgsql-hackers
Simon Riggs <simon@2ndquadrant.com> writes:
> On the topic of accuracy of the estimate: Updates cause additional data
> to be written to the table, so tables get bigger until vacuumed. Tables
> with many Inserts are also regularly trimmed with Deletes. With a
> relatively static workload and a regular vacuum cycle, the table size
> for many major tables eventually levels off, remaining roughly constant
> but the number of non-zero pages will vary over time in a saw-tooth
> curve. Estimating the cardinality by using the number of blocks would
> ignore the fact that many of them are empty for much of the time. That
> would then lead to a systematic over-estimate of the cardinality of the
> regularly updated tables.

You mean underestimate.  After a VACUUM, the tuples-per-page figure
would be set to a relatively low value, and then subsequent inserts
would fill in the free space, causing the actual density to rise
while the physical number of blocks stays more or less constant.
So the proposed method would always give an accurate number of blocks,
but it would tend to underestimate the number of tuples in a dynamic
situation.

Still, it's better than the current method, which is likely to
underestimate both parameters.  I believe that having an accurate block
count and an underestimated tuple count would tend to favor choosing
indexscans over seqscans, which is probably a good thing --- when was
the last time you saw someone complaining that the planner had
improperly chosen an indexscan over a seqscan?

> How easy would it be to take into account the length of the FSM for the
> relation also?

Don't think this would help; the FSM doesn't really track number of
tuples.  Free space isn't a good guide to number of tuples because you
can't distinguish inserts from updates at that level.  (I'm also a bit
concerned about turning the FSM into a source of contention.)
        regards, tom lane


В списке pgsql-hackers по дате отправления:

Предыдущее
От: Joe Conway
Дата:
Сообщение: Re: Status of server side Large Object support?
Следующее
От: Tom Lane
Дата:
Сообщение: Re: Fix for "q" with psql display paging dumps out of psql