Discussion: proper tuning for restoring from pg_dump in 8.3.7
I am restoring a fairly sizable database from a pg_dump file (COPY FROM
STDIN style of data) -- the pg_dump file is ~40G.

My system has 4 cores and 12G of RAM. I drop, then recreate the
database, and I do this restore via: cat dumpfile | psql db_name. The
trouble is that my system free memory (according to top) goes to about
60M, which causes all operations on the server to grind to a halt, and
this 40G restore will take a couple of hours to complete.

I noted that the restore file doesn't do anything inappropriate such as
creating indices BEFORE adding the data or anything - thus I can only
suspect that my trouble has to do with performance tuning ineptitude in
postgresql.conf.

My settings (the ones that I have changed):

shared_buffers = 512MB
temp_buffers = 512MB
work_mem = 256MB
maintenance_work_mem = 64MB
max_fsm_pages = 655360
vacuum_cost_page_hit = 3

Any insight would be most appreciated.
r.b.

Robert W. Burgholzer
Surface Water Modeler
Office of Water Supply and Planning
Virginia Department of Environmental Quality
rwburgholzer@deq.virginia.gov
804-698-4405
Open Source Modeling Tools:
http://sourceforge.net/projects/npsource/
On Wed, 2010-07-14 at 12:33 -0400, Burgholzer, Robert (DEQ) wrote:
> I am restoring a fairly sizable database from a pg_dump file (COPY FROM
> STDIN style of data) -- the pg_dump file is ~40G.
>
> My system has 4 cores, and 12G of RAM. I drop, then recreate the
> database, and I do this restore via a: cat dumpfile | psql db_name. The
> trouble is that my system free memory (according to top) goes to about
> 60M, which causes all operations on the server to grind to a halt, and
> this 40G restore will take a couple hours to complete.
>
> I noted that the restore file doesn't do anything inappropriate such as
> creating indices BEFORE adding the data or anything - thus I can only
> suspect that my trouble has to do with performance tuning ineptitude in
> postgresql.conf.

The best you will get is ~ 22G an hour. If this is a backup you can take
again in a different format, use -Fc and then use parallel restore. Even
if half of the database is one table, you will still knock the restore
time down by 50% or so.

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
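[Editor's note: Joshua's -Fc-plus-parallel-restore workflow can be sketched
roughly as follows. Database and file names and the job count are
illustrative, and the -j flag requires an 8.4-or-newer pg_restore client,
which the thread's 8.3.7 toolchain would not have by itself.]

```
pg_dump -Fc -f db_name.dump db_name      # custom-format dump instead of plain SQL

dropdb db_name && createdb db_name
pg_restore -j 4 -d db_name db_name.dump  # restore up to 4 objects in parallel
```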
That said, the time to restore is explainable, but is there something in
my tuning that is causing all of my memory to be eaten?

We seem to have some undiagnosed issue whereby opening and closing large
files on the system leaves a lot in the cache -- I am guessing that this
is my culprit and NOT my pg tuning?

Thanks again,
r.b.

Robert W. Burgholzer
Surface Water Modeler
Office of Water Supply and Planning
Virginia Department of Environmental Quality
rwburgholzer@deq.virginia.gov
804-698-4405
Open Source Modeling Tools:
http://sourceforge.net/projects/npsource/
On Wed, Jul 14, 2010 at 01:07:48PM -0400, Burgholzer, Robert (DEQ) wrote:
> That said, the time to restore is explainable, but is there something in
> my tuning that is causing all of my memory to be eaten?
>
> We seem to have some undiagnosed issue whereby opening and closing large
> files on the system leaves a lot in the cache -- I am guessing that this
> is my culprit and NOT my pg tuning?

cat dumpfile | psql db_name means you're going to load as much of
dumpfile into memory as you can, and then psql will read from memory and
write to the DB. If your file is 40GB, then it's going to use all of the
memory available to your process (based on your ulimit, which is
probably set to unlimited, minus the ~5% the OS saves for root by
default).

Try using pg_restore, psql -f (not sure if that would work, I don't do
my restores that way), or use a named pipe and write/read from that.

Dave
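[Editor's note: Dave's named-pipe idea works like this minimal, runnable
sketch. In the real restore the reader would be `psql db_name < "$FIFO"`;
here `wc -c` stands in as the reader so the sketch is self-contained.]

```shell
# Create a named pipe; data streams through it block by block, so no
# process ever has to buffer the whole 40G dump.
FIFO=$(mktemp -u)
mkfifo "$FIFO"

wc -c < "$FIFO" &                 # reader: consumes the pipe as data arrives
READER=$!

printf 'COPY data...\n' > "$FIFO" # writer: blocks until the reader drains it

wait "$READER"
rm "$FIFO"
```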
"Burgholzer, Robert (DEQ)" <Robert.Burgholzer@deq.virginia.gov> wrote:

> cat dumpfile | psql db_name

Try:

psql -1 -f dumpfile db_name

> The trouble is that my system free memory (according to top) goes
> to about 60M,

What do you get from free or vmstat? (Don't trust top too far on memory
usage reporting.)

> temp_buffers = 512MB

You don't need that for a restore (and probably not at all -- read the
docs on what it does). I don't think that one matters for a restore,
but I would reduce it back to the default, just to be safe.

> work_mem = 256MB

That's probably too high, depending on the number of connections and
your usage pattern. Again, probably not material for a restore.

> maintenance_work_mem = 64MB

This one matters -- it controls how much RAM is available to sort
entries during an index build. On a single-threaded restore I would
probably set that to 1GB to 2GB, and then reduce it later.

> vacuum_cost_page_hit = 3

You normally don't want to adjust this one. vacuum_cost_delay and maybe
vacuum_cost_limit are the useful knobs to turn in this area.

You might want to consider:

wal_buffers = 16MB

Just for the restore you might want some settings you probably don't
want in production. They can cause database corruption if there is a
crash, but you can probably live with that during a restore -- you just
reinitialize and try again.

fsync = off
synchronous_commit = off
full_page_writes = off
archive_mode = off

Depending on your hardware, you might get a benefit from setting
checkpoint_segments, checkpoint_timeout, checkpoint_completion_target,
bgwriter_lru_maxpages and/or bgwriter_lru_multiplier higher.

-Kevin
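[Editor's note: Kevin's restore-only advice, collected into a single
postgresql.conf sketch using 8.3-era parameter names. The checkpoint
values are illustrative guesses, not from the thread, and the unsafe
group must be reverted before normal operation.]

```
maintenance_work_mem = 1GB   # more RAM for index builds during the restore
wal_buffers = 16MB

# Unsafe outside a throwaway restore -- a crash can corrupt the cluster,
# tolerable only because a failed restore can simply be rerun:
fsync = off
synchronous_commit = off
full_page_writes = off
archive_mode = off

# Illustrative values (not given in the thread) to reduce checkpoint churn:
checkpoint_segments = 32
checkpoint_completion_target = 0.9
```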
On Wed, 2010-07-14 at 13:07 -0400, Burgholzer, Robert (DEQ) wrote:
> That said, the time to restore is explainable, but is there something in
> my tuning that is causing all of my memory to be eaten?

Please don't top post.

> We seem to have some undiagnosed issue whereby opening and closing large
> files on the system leaves a lot in the cache -- I am guessing that this
> is my culprit and NOT my pg tuning?

You mention the use of top. Assessing just the reference to "free"
memory isn't as accurate as you would think. I am going to assume you
are using Linux. Linux will suck all (well, not all, but close) out of
free and put it into things like buffers until free memory is actually
needed. For example, on my workstation:

Mem:   6126116k total,  5910328k used,   215788k free,   495684k buffers
Swap:  3919352k total,     5792k used,  3913560k free,  3178548k cached

Which means exactly nothing, unless that Swap number starts churning.

Sincerely,

Joshua D. Drake

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
"Burgholzer, Robert (DEQ)" <Robert.Burgholzer@deq.virginia.gov> wrote:

> We seem to have some undiagnosed issue whereby opening and closing
> large files on the system leaves a lot in the cache

Cached data is not a problem. Don't worry about that.

-Kevin
"Burgholzer, Robert (DEQ)" <Robert.Burgholzer@deq.virginia.gov> writes:
> That said, the time to restore is explainable, but is there something in
> my tuning that is causing all of my memory to be eaten?

That's normal behavior. A working Linux/Unix system *should* have near
zero free memory. If it doesn't, either the filesystem cache is failing
to do its job, or you recently booted the machine and it hasn't had a
chance to fill the cache, or you bought way more RAM than your workload
has any use for.

> We seem to have some undiagnosed issue whereby opening and closing large
> files on the system leaves a lot in the cache -- I am guessing that this
> is my culprit and NOT my pg tuning?

That's called "it's working correctly". If you want to get an accurate
picture of whether the system is under memory pressure, you need to
discount filesystem cache, and also pay attention to whether anything
much has been pushed out to swap.

			regards, tom lane
OK, thanks to multiple folks for letting me know that I was looking at
the wrong "top" metric. That said, my performance is most definitely
suffering -- does this "swap" number seem excessive (looks like ~100G
to me)?

Swap: 102399992k total

> Cached data is not a problem. Don't worry about that.

As for my concerns about the caching of files, we have found that we
can reclaim our server's performance by doing the following:

sync
echo 1 > /proc/sys/vm/drop_caches

But I am really squeamish about this - it just seems like something is
wrong with this approach. The grinding halt will occur when doing
either a large copy from PG, or from tests where we created and then
closed a huge number of largish files. CentOS (regularly updated) is
the Linux that we are running.

Thanks also for the settings pointers, and the notion that I can do a
restore with different settings than production. And also, I will give
the alternatives to "cat" a whirl.

Thanks to everyone,
r.b.
"Burgholzer, Robert (DEQ)" <Robert.Burgholzer@deq.virginia.gov> wrote:

> does this "swap" number seem excessive (looks like ~100 G to me):
> Swap: 102399992k total

That's probably how much disk space you have set aside for swapping.
What matters is how much of that is *used*. For example:

kgrittn@PLATO:/home/ccsa> free -m
             total       used       free     shared    buffers     cached
Mem:         64446      64144        301          0         46      59349
-/+ buffers/cache:       4748      59697
Swap:         1027         53        973

We've got 1GB available for swapping, and have 53MB swapped. That
consists of some OS libraries we just don't use, which eventually got
swapped out and never found their way back to RAM for want of usage.
(Which is fine with me, as it leaves 53MB more for caching.)

>> Cached data is not a problem. Don't worry about that.

> As for my concerns about the cache'ing of files, we have found
> that we can reclaim our servers performance by doing the
> following:
> sync
> echo 1 > /proc/sys/vm/drop_caches
>
> But I am really squeamish about this - it just seems like
> something is wrong with this approach.

Dropping caches is just going to cause unnecessary disk reads when you
eventually try to re-read what were cached sectors, hurting
performance. What, exactly, caused you to think it made something
better? Might it have coincided with the completion of sync?

By the way, what have you got for drives and controllers, and how are
they arranged?

One last tip: when your restore is complete, you might want to run
VACUUM FREEZE ANALYZE in the database. Otherwise you will be rewriting
rows to set hint bits as you read them, and at some point in the future
PostgreSQL will start a VACUUM of all still-existent tuples from the
load in order to FREEZE them before transaction ID wraparound.

-Kevin
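[Editor's note: Kevin's post-restore step as a concrete command; db_name
is a placeholder for the restored database.]

```
-- Sets hint bits and freezes the loaded tuples in one pass, avoiding
-- later rewrite churn and an eventual anti-wraparound vacuum:
VACUUM FREEZE ANALYZE;
```

Run it via psql once the restore finishes, e.g.
psql -d db_name -c "VACUUM FREEZE ANALYZE;"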
Burgholzer, Robert (DEQ) wrote:
> OK, thanks to multiple folks for letting me know that I was looking at
> the wrong "top" metric. That said, my performance is most definitely
> suffering -- does this "swap" number seem excessive (looks like ~100 G
> to me):
> Swap: 102399992k total

Total swap isn't an issue (except maybe in wasted disk - that *does*
seem *very* high). It's the next column over (swap used) that you need
to keep an eye on.

-kgd
"Burgholzer, Robert (DEQ)" <Robert.Burgholzer@deq.virginia.gov> wrote:

> my performance is most definitely suffering

Exactly what are you seeing which causes you to say that? (Not in terms
of what you think might be the *cause* of the performance problem, but
what the *symptoms* are.)

-Kevin
[Please be careful to keep the list copied.]

"Burgholzer, Robert (DEQ)" <Robert.Burgholzer@deq.virginia.gov> wrote:

>> Exactly what are you seeing which causes you to say that?
>
> An httpd request that is filled in fractions of a second now takes
> 10-30 seconds to complete - this being a page that calls a php
> script, and needs a database connection (even if it uses only a
> minimal query).

I think I'm starting to see the issue -- you have other databases in
use for production on this system while you are loading this database,
and *those* are the ones where you see the performance problems?

> Using the following (on recommendation of Kevin/David):
> psql -1 -f dumpfile db_name
>
> I now get the swap use; looks like there really is NOT much use:
>
>              total       used       free     shared    buffers     cached
> Mem:         11874      11809         65          0          5       3746
> -/+ buffers/cache:       8057       3817
> Swap:        99999         70      99929
>
> Just now, the psql approach of running the script has seemed to
> ameliorate much of the performance degradation -- except for
> scripts that employ multiple database reads on another database
> (that is not being reloaded) -- those are taking 2-3 times their
> normal time, but nothing like before. Maybe it WAS trying to load
> 40G of file into memory and then dumping into pg that was causing
> the hassle.
>
> Thanks a bunch for everyone leading me through this process. I am
> sure that I am far from knowing what is going on here, but at
> least I learned a few tricks.

If I'm understanding your real problem now, the psql -1 switch will
help because it will allow the COPY statements to run without
WAL-logging, which will cut not only total disk output, but the need to
sync the cached data to disk. The options I suggested be turned off
just for the load process will also help with that, although they are
not safe if the other databases are in the same PostgreSQL cluster
(i.e., different databases running in the same database service).
If the database is on its own PostgreSQL cluster, I would be very
tempted to restore the database on a separate (but compatible) machine
and then rsync it back with --bwlimit to limit the impact on the other
database(s).

If you still have problems, please post again, but you might get more
useful advice if you provide more details about your environment and
the actual problem you're trying to solve. The data in cache wasn't a
problem to solve, and I suspect that the time to restore the database
wasn't really the problem, either. It seems as though the actual
problem was the impact of the restore on other databases running on the
same server.

I don't remember seeing some information which would be useful, like
how many drives are configured in what RAID(s) using what
controller(s), and what other databases are running in what PostgreSQL
clusters.

-Kevin
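[Editor's note: Kevin's rsync idea, sketched under assumptions -- the
paths, the host name prod-host, and the ~10 MB/s cap are all
illustrative, and the source cluster must be cleanly shut down before
its data directory is copied.]

```shell
# Copy the restored data directory back to production with a bandwidth
# cap, so the transfer doesn't saturate the production disks.
# rsync's --bwlimit is in KB per second, so 10240 is roughly 10 MB/s.
rsync -a --bwlimit=10240 /var/lib/pgsql/data/ prod-host:/var/lib/pgsql/data_new/
```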
Kevin,

Thanks a ton for your responses (and for forwarding to the list). As
you gathered, I had two issues - long load time, and a performance hit
on other databases on the same machine. I simply didn't conceptualize
my problem properly, so I wasn't able to ask for the right help... so:

> and *those* are the ones where you see the performance problems?

Yes - most definitely (since I was told earlier that the load time was
appropriate for the size of the restore).

> The options I suggested be turned off just for the load process
> ...
> not safe if the other databases are in the same PostgreSQL cluster

They are, so I won't --- thanks for the heads up on that too.

Just for posterity, here are the system specs (from my hardware guy):

DEQ1 is on CentOS 5.3 w/ 1.5Tb striped system volumes using a pair of
standard 7600 rpm SATA drives. These are on system 3 Gbit/s ports.

Once again, thanks,
r.b.
On Wed, Jul 14, 2010 at 05:30:17PM -0400, Burgholzer, Robert (DEQ) wrote:
> > The options I suggested be turned off just for the load process
> > ...
> > not safe if the other databases are in the same PostgreSQL cluster
>
> They are, so I won't --- thanks for the heads up on that too.
>
> Just for posterity, here are the system specs (from my hardware guy):
> DEQ1 is on CentOS 5.3 w/ 1.5Tb striped system volumes using a pair of
> standard 7600 rpm SATA drives. These are on system 3 Gbit/s ports.

So that makes total sense. You're flooding your disks/controller doing
the massive amount of writing that's involved with the restore.

Dave