On Saturday, June 1, 2013, Robert Haas wrote:
> I agree with all that. I don't have any data either, but I agree that
> AFAICT it seems to mostly be a problem for large (terabyte-scale)
> databases, or ones that are dreadfully short of I/O bandwidth. AWS,
> I'm looking at you.
>
> It would be interesting to make a list of what other issues people
> have seen using PostgreSQL on very large data sets. Complaints I've
> heard include:
>
> 1. Inexplicable failure of the planner to use indexes on very large
> tables, preferring an obviously-stupid sequential scan. This might be
> fixed by the latest index-size fudge factor work.
>
> 2. Lack of concurrent DDL.
>
> On VACUUM and ANALYZE specifically, I'd have to say that the most
> common problems I encounter are (a) anti-wraparound vacuums kicking in
> at inconvenient times and eating up too many resources and (b) users
> making ridiculous settings changes to avoid the problems caused by
> anti-wraparound vacuums kicking in at inconvenient times and eating up
> too many resources.
Do we know why anti-wraparound vacuum uses so many resources in the first place? The default throttling settings seem quite conservative to me, even for a system that has only a single 5400 RPM HDD (and even more so for any real production system that would be used for a many-GB database).
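For reference, the cost-based throttling defaults I'm thinking of look roughly like this in postgresql.conf (9.2-era values; worth double-checking against the documentation for your version):

```
# Cost-based autovacuum throttling defaults (PostgreSQL 9.2-era; verify for your version)
autovacuum_vacuum_cost_delay = 20ms    # sleep this long after each cost_limit worth of work
autovacuum_vacuum_cost_limit = -1      # -1 means fall back to vacuum_cost_limit (default 200)
autovacuum_max_workers = 3
autovacuum_freeze_max_age = 200000000  # XID age beyond which an anti-wraparound vacuum is forced
```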
I wonder if there is something simple but currently unknown going on which is causing it to damage performance out of all proportion to the resources it ought to be using.
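As a diagnostic sketch (not something measured for this thread), you can at least watch how close each database is getting to the forced anti-wraparound threshold:

```
-- How close each database is to autovacuum_freeze_max_age (default 200M);
-- once age(datfrozenxid) exceeds it, an anti-wraparound vacuum is forced
-- regardless of the usual autovacuum triggers.
SELECT datname, age(datfrozenxid) AS xid_age
FROM pg_database
ORDER BY xid_age DESC;
```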
Cheers,
Jeff