Re: Partial vacuum versus pg_class.reltuples
От | Tom Lane |
---|---|
Тема | Re: Partial vacuum versus pg_class.reltuples |
Дата | |
Msg-id | 17119.1244402651@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: Partial vacuum versus pg_class.reltuples (Robert Haas <robertmhaas@gmail.com>) |
Ответы |
Re: Partial vacuum versus pg_class.reltuples
|
Список | pgsql-hackers |
Robert Haas <robertmhaas@gmail.com> writes: > Am I wrong to be frightened by the implications of updating this value > only once in a blue moon? It's not great, but I think it's probably not catastrophic either. Keep in mind that all we need from reltuples is that the ratio reltuples/relpages be a reasonable estimate of the density of live tuples, because what the planner actually uses is GetRelationNumberOfBlocks() * reltuples / relpages. So for example an append-only table isn't a big problem, even if it's been quite a while since we updated reltuples and relpages. There was some mention of having a partial vacuum extrapolate a value of reltuples and update pg_class with that. I'm afraid that that could be a seriously bad idea; because there is no very good reason to suppose that the subset of recently-modified pages forms a good sample of the whole table as far as live-tuple density goes. [ thinks a bit and reads the code some more ... ] There is a considerably safer alternative, which is to let ANALYZE update the reltuples estimate based on the pages it sampled; which should be a considerably less biased sample than the pages a partial vacuum would have looked at. And we have already got the code doing that, either in a standalone ANALYZE or an ANALYZE attached to a vacuum that turned out to be partial. So actually I think we are in better shape than I realized, so far as the heap reltuples numbers go. The case that's a bit nasty is where we are propagating the heap reltuples number to the index reltuples number for a GIN index. (Remember this only matters for a partial index.) As the code stands today, what we'll be propagating is the reltuples estimate from the most recent ANALYZE, not the ANALYZE that we might be about to conduct. This is not great; but considering that we are completely ignoring the first-order problem of the partial index predicate's selectivity, quibbling about a second-order effect like the estimate being out of date is pretty pointless. > Do we have any reasonable manual way of forcing > VACUUM to scan the entire heap? You could use VACUUM FREEZE, for instance. regards, tom lane
В списке pgsql-hackers по дате отправления: