Re: Performance query about large tables, lots of concurrent access
| From | Karl Wright |
|---|---|
| Subject | Re: Performance query about large tables, lots of concurrent access |
| Date | |
| Msg-id | 4677E563.5070900@metacarta.com |
| In reply to | Re: Performance query about large tables, lots of concurrent access (Bill Moran <wmoran@collaborativefusion.com>) |
| List | pgsql-performance |
Bill Moran wrote:
> In response to Karl Wright <kwright@metacarta.com>:
>
>> Alvaro Herrera wrote:
>>> Karl Wright wrote:
>>>> Alvaro Herrera wrote:
>>>>> Karl Wright wrote:
>>>>>
>>>>>> This particular run lasted four days before a VACUUM became essential.
>>>>>> The symptom that indicates that VACUUM is needed seems to be that the
>>>>>> CPU usage of any given PostgreSQL query skyrockets. Is this essentially
>>>>>> correct?
>>>>> Are you saying you weren't running VACUUM all the time? If so,
>>>>> that's where the problem lies.
>>>> PostgreSQL 7.4 VACUUM runs for so long that starting it with a cron job
>>>> even every 24 hours caused multiple instances of VACUUM to eventually be
>>>> running in my case. So I tried to find a VACUUM schedule that permitted
>>>> each individual vacuum to finish before the next one started. A vacuum
>>>> seemed to require 4-5 days with this particular database - or at least
>>>> it did for 7.4. So I had the VACUUM schedule set to run every six days.
>>> How large is the database? I must admit I have never seen a database
>>> that took 4 days to vacuum. This could mean that your database is
>>> humongous, or that the vacuum strategy is wrong for some reason.
>> The database is humongous, and the machine is under intense load. On the
>> instance where this long vacuum occurred, there were several large
>> tables - one with 7,000,000 rows, one with 14,000,000, one with
>> 140,000,000, and one with 250,000,000.
>
> Don't rule out the possibility that the only way to fix this _might_ be to
> throw more hardware at it. Proper configuration can buy you a lot, but if
> your usage is exceeding the available bandwidth of the IO subsystem, the
> only way you're going to get better performance is to put in a faster IO
> subsystem.
>
>>> You know that you can run vacuum on particular tables, right? It would
>>> probably be a good idea to run vacuum on the most updated tables, and
>>> leave alone those that see few or no updates (hopefully the biggest;
>>> this would mean that an almost-complete vacuum run would take much less
>>> than a whole day).
>> Yeah, sorry, that doesn't apply here.
>
> Why not? I see no reason why an appropriate autovacuum schedule would not
> apply to your scenario. I'm not saying it does, only that your response
> does not indicate that it doesn't, and thus I'm concerned that you're
> writing autovacuum off without proper research.

I'm not writing off autovacuum - just the concept that the large tables
aren't the ones that are changing. Unfortunately, they *are* the most
dynamically updated.

Karl
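For readers arriving from the archives, the per-table approach Alvaro suggests looks roughly like the sketch below. The table names are hypothetical, the tuning values are illustrative, and the `ALTER TABLE ... SET (autovacuum_...)` form only exists in PostgreSQL 8.4 and later (it is not available on the 7.4 installation discussed here), so treat this as a sketch of the idea rather than a recipe for this particular system.

```sql
-- Vacuum only the heavily-updated tables instead of the whole database.
-- (big_dynamic_table / medium_dynamic_table are hypothetical names.)
VACUUM ANALYZE big_dynamic_table;
VACUUM ANALYZE medium_dynamic_table;

-- Throttle vacuum I/O so it can coexist with the production load
-- (cost-based vacuum delay is available since PostgreSQL 8.0).
SET vacuum_cost_delay = 20;          -- sleep 20 ms each time the cost limit is hit
SET maintenance_work_mem = '512MB';  -- more memory means fewer index passes per vacuum

-- In PostgreSQL 8.4+ the same idea can be pushed into autovacuum with
-- per-table thresholds, so the busy tables are vacuumed more aggressively:
ALTER TABLE big_dynamic_table
  SET (autovacuum_vacuum_scale_factor = 0.02,
       autovacuum_vacuum_threshold    = 10000);
```

On a 250,000,000-row table, even an aggressive scale factor like the one above only triggers autovacuum after several million row updates, which is why per-table settings matter more as tables grow.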