Re: [HACKERS] lazy_update_relstats considered harmful (was Re:
От | Matthew T. O'Connor |
---|---|
Тема | Re: [HACKERS] lazy_update_relstats considered harmful (was Re: |
Дата | |
Msg-id | 42447EBB.20905@zeut.net обсуждение исходный текст |
Ответ на | lazy_update_relstats considered harmful (was Re: pg_autovacuum not having enough suction ?) (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-performance |
Tom Lane wrote: >I wrote: > > >>One thing that is possibly relevant here is that in 8.0 a plain VACUUM >>doesn't set reltuples to the exactly correct number, but to an >>interpolated value that reflects our estimate of the "steady state" >>average between vacuums. I wonder if that code is wrong, or if it's >>operating as designed but is confusing autovac. >> >> > >Now that I think it over, I'm thinking that I must have been suffering >severe brain fade the day I wrote lazy_update_relstats() (see >vacuumlazy.c). The numbers that that routine is averaging are the pre- >and post-vacuum physical tuple counts. But the difference between them >consists of known-dead tuples, and we shouldn't be factoring dead tuples >into reltuples. The planner has always considered reltuples to count >only live tuples, and I think this is correct on two grounds: > >1. The numbers of tuples estimated to be returned by scans certainly >shouldn't count dead ones. > >2. Dead tuples don't have that much influence on scan costs either, at >least not once they are marked as known-dead. Certainly they shouldn't >be charged at full freight. > >It's possible that there'd be some value in adding a column to pg_class >to record dead tuple count, but given what we have now, the calculation >in lazy_update_relstats is totally wrong. > >The idea I was trying to capture is that the tuple density is at a >minimum right after VACUUM, and will increase as free space is filled >in until the next VACUUM, so that recording the exact tuple count >underestimates the number of tuples that will be seen on-the-average. >But I'm not sure that idea really holds water. The only way that a >table can be at "steady state" over a long period is if the number of >live tuples remains roughly constant (ie, inserts balance deletes). >What actually increases and decreases over a VACUUM cycle is the density >of *dead* tuples ... but per the above arguments this isn't something >we should adjust reltuples for. > >So I'm thinking lazy_update_relstats should be ripped out and we should >go back to recording just the actual stats. > >Sound reasonable? Or was I right the first time and suffering brain >fade today? >
В списке pgsql-performance по дате отправления: