Re: hanging for 30sec when checkpointing
From | gjm@caledoncard.com (Greg Mennie) |
---|---|
Subject | Re: hanging for 30sec when checkpointing |
Date | |
Msg-id | a806dcd9.0402110625.3190f48c@posting.google.com |
In reply to | hanging for 30sec when checkpointing (Shane Wright <me@shanewright.co.uk>) |
Responses |
Re: hanging for 30sec when checkpointing
Re: hanging for 30sec when checkpointing |
List | pgsql-admin |
me@shanewright.co.uk (Shane Wright) wrote in message news:<40202216.4010608@shanewright.co.uk>...
> Hi,
>
> I'm running a reasonable sized (~30Gb) 7.3.4 database on Linux and I'm
> getting some weird performance at times.
>
> When the db is under medium-heavy load, it periodically spawns a
> 'checkpoint subprocess' which runs for between 15 seconds and a minute.
> Ok, fair enough, the only problem is the whole box becomes pretty much
> unresponsive during this time - from what I can gather it's because it
> writes out roughly 1Mb (vmstat says ~1034 blocks) per second until its done.
>
> Other processes can continue to run (e.g. vmstat) but other things do
> not (other queries, mostly running 'ps fax', etc). So everything gets
> stacked up till the checkpoint finishes and all is well again, untill
> the next time...

I am having a similar problem and this is what I've found so far:

During the checkpoint the volume of data that's written isn't very high and it goes on for a fairly long time (up to 20 seconds) at a rate that appears to be well below our disk array's potential. The volume of data written is usually 1-5 MB/sec on an array that we've tested to sustain over 50 MB/sec (sequential writes, of course).

It turns out that what's going on is that the command queue for the RAID array (3Ware RAID card) is filling up during the checkpoint and is staying at the max (254 commands) for most of the checkpoint. The odd lucky insert appears to work, but is extremely slow. In our case, the WAL files are on the same array as the data files, so everything grinds to a halt.

The machine we're running it on is a dual processor box with 2GB RAM. Since most database read operations are being satisfied from the cache, reading processes don't seem to be affected during the pauses.

I suspect that increasing the checkpoint frequency could help, since the burst of commands on the disk channel would be shorter (it's currently 300 seconds).

I have found that the checkpoint after a vacuum is the worst. This was the original problem which led to the investigation.

Besides more frequent checkpoints, I am at a loss as to what to do about this. Any help would be appreciated.

Thanks,
Greg
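To put the numbers above side by side, here is a rough back-of-the-envelope sketch in Python. It only uses the figures quoted in this message (1-5 MB/sec for up to 20 seconds, and the 50 MB/sec sequential rate from our tests); the function names are made up for illustration, and the "ideal" time assumes purely sequential writes, which a checkpoint almost certainly is not.

```python
# Rough arithmetic only: figures are the ones quoted in the message above.
# Function names are hypothetical helpers, not part of any PostgreSQL tool.

def checkpoint_volume_mb(rate_mb_s, duration_s):
    """Total data written during a checkpoint at a steady rate."""
    return rate_mb_s * duration_s

def ideal_flush_seconds(volume_mb, sequential_mb_s):
    """Time the same volume would take at the array's sequential rate."""
    return volume_mb / sequential_mb_s

observed_rates_mb_s = (1.0, 5.0)   # observed write rate during checkpoint
checkpoint_duration_s = 20.0       # "up to 20 seconds"
sequential_rate_mb_s = 50.0        # tested sequential throughput

for rate in observed_rates_mb_s:
    volume = checkpoint_volume_mb(rate, checkpoint_duration_s)
    ideal = ideal_flush_seconds(volume, sequential_rate_mb_s)
    print(f"{rate:.0f} MB/s for {checkpoint_duration_s:.0f} s = {volume:.0f} MB; "
          f"sequential equivalent: {ideal:.1f} s")
```

So even in the worst case the checkpoint moves on the order of 100 MB, which the array could stream in a couple of seconds if the writes were sequential. That is consistent with the bottleneck being the number of queued commands rather than raw bandwidth.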