Discussion: How to do fast, reliable backups?
Hi all,

I am wondering how you guys back up your databases. Say, I have a 20 GB database, data and indexes. If I run pg_dump on this, it backs up the schema and the data. When I have to restore this, I would have to run it through psql, which would then re-build the indexes after it has inserted the records. That's not really a feasible option, as it takes too long.

If I back up the pgdata directory, I will get inconsistent data, as somebody might update data in 2 tables while my backup is running: one change in a table I have already backed up, one in a table I have not. The solution would be to shut the database server down, do the backup and then start everything back up. But that's not really an option in a 24/7 environment.

What I'd like to see is a transaction-aware backup which backs up indexes as well as data.

Any ideas?

Best regards,
Chris
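For reference, the dump-and-restore cycle described above looks roughly like the sketch below. The database name and dump path are made up for the example, and it assumes pg_dump and psql are on the PATH; it only illustrates the workflow being discussed, it is not a recommendation.

import subprocess

DB = "mydb"                     # hypothetical 20 GB database
DUMP_FILE = "/backup/mydb.sql"  # hypothetical dump location

def dump():
    # pg_dump writes the schema and the data as plain SQL statements.
    with open(DUMP_FILE, "w") as out:
        subprocess.check_call(["pg_dump", DB], stdout=out)

def restore():
    # psql replays the SQL; rebuilding the indexes after the data has been
    # inserted is what makes this slow on a large database.
    with open(DUMP_FILE) as src:
        subprocess.check_call(["psql", "-d", DB], stdin=src)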
At 21:37 on Friday, 5 March 2004, Chris Ruprecht wrote:
> I am wondering how you guys back up your databases. Say, I have a 20 GB
> database, data and indexes. If I run pg_dump on this, it backs up the
> schema and the data. When I have to restore this, I would have to run it
> through psql which would then re-build the indexes after it has inserted
> the records. That's not really a feasible option as it takes too long.

When you have large databases and you need consistent backups, an interesting solution is master-slave replication. Just use a second server on your LAN as a slave replica of your main database server. Should you have any problem with the main server, just switch the IP addresses of the master and slave servers and you are up and running again. The switch takes a few milliseconds and can even be performed automatically by a few specific programs (look for "fault-tolerant linux systems" with Google). While your backup server serves your users, you have all the time you need to install a new hard disk in the main server, restore your data (from the running slave server or from a tape) and come back into service.

When you need to put your data onto a tape or a CD-ROM, just stop the master-slave mechanism, make a "static" backup (a pg_dump or a file system copy) of your _slave_ server and restart the replication program. The master-slave replication mechanism will take care of re-aligning the two servers in a few minutes. This way, you never need to stop the main server.

> If I back up the pgdata directory, I will get inconsistent data, as
> somebody might update data in 2 tables while my backup is running, one
> change in a table I have already backed up, one in a table I have not. The
> solution would be to shut the database server down, do the backup and then
> start everything back up. But that's not really an option in a 24/7
> environment.

Making a copy of your data directory while your server is running is definitely a bad idea. Forget it. Backing up the file system used by an RDBMS usually requires taking the server off-line and flushing all the unwritten data to disk. This cannot be done in a 24/7 environment.

> What I'd like to see is a transaction-aware backup which backs up indexes
> as well as data.

Have a look at one of the master-slave replication systems available on the net. For this task you do not need anything as sophisticated as a multi-master replication system (like the Bettina Kemme "PG-Replicator" project). You could even write your own client program in Perl or Python. The program just has to connect to your running master server, read the relevant data and insert it into the database on the slave server. Using a SERIAL primary key, it is quite easy to spot which rows of a table have not been copied to the slave server yet (see the sketch after this message). Using a timer, you can choose the time granularity of the replication mechanism that best fits your needs. Using transactions, you can be sure to get only consistent data. By reading and saving the system tables, you can back up even the most hidden and subtle parts of your DB.

Beware that restoring your data is up to you if you use a custom-made system. Test the restore procedure before using your program in a production environment.

Good luck.
-----------------------------------------
Alessandro Bottoni and Silvana Di Martino
alessandrobottoni@interfree.it
silvanadimartino@tin.it
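A minimal sketch of the kind of Perl/Python replication client described above, assuming the psycopg2 driver and an invented table "orders" with a SERIAL column "id"; the connection strings are placeholders and error handling is omitted.

import time
import psycopg2   # assumes the psycopg2 DB-API driver; any driver would do

MASTER_DSN = "host=master dbname=mydb"   # placeholder connection strings
SLAVE_DSN = "host=slave dbname=mydb"
TABLE = "orders"                         # hypothetical table with SERIAL column "id"
INTERVAL = 60                            # replication granularity, in seconds

def replicate_once(master, slave):
    mcur, scur = master.cursor(), slave.cursor()
    # Find the highest id already present on the slave.
    scur.execute("SELECT COALESCE(MAX(id), 0) FROM %s" % TABLE)
    last_id = scur.fetchone()[0]
    # Read everything newer than that from the master inside one transaction,
    # so only committed, consistent data is ever seen.
    mcur.execute("SELECT * FROM %s WHERE id > %%s ORDER BY id" % TABLE,
                 (last_id,))
    for row in mcur.fetchall():
        placeholders = ", ".join(["%s"] * len(row))
        scur.execute("INSERT INTO %s VALUES (%s)" % (TABLE, placeholders), row)
    slave.commit()
    master.rollback()   # release the read transaction on the master

if __name__ == "__main__":
    master = psycopg2.connect(MASTER_DSN)
    slave = psycopg2.connect(SLAVE_DSN)
    while True:
        replicate_once(master, slave)
        time.sleep(INTERVAL)

Note that a sketch like this only captures newly inserted rows; updates, deletes and schema changes would need their own handling, which is part of why the author warns that the restore side is up to you.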
On Sat, Mar 06, 2004 at 01:39:28PM +0000, Silvana Di Martino wrote:
> a slave replica of your main database server. Should you have any problem
> with the main server, just switch the IP addresses of the master and slave
> servers and you are up and running again. The switching takes a few
> milliseconds and can even be performed automatically by a few specific
> programs (look for "fault-tolerant linux systems" with google). While your

And what do you do about the records which made it to the master database but which haven't been applied on the slave yet? Asynchronous replication with automatic failover is _dangerous_, because you can strand records. Many commercial systems do this with what they describe as conflict management; in truth, this usually amounts to marking _both_ records as suspect and letting the DBA decide what to do.

I'm not saying that master-slave systems are a bad idea, but you'd better be aware of what you're exposing yourself to before relying on such hot-failover setups.

A
--
Andrew Sullivan | ajs@crankycanuck.ca
At 16:30 on Saturday, 6 March 2004, Andrew Sullivan wrote:
> Asynchronous replication with automatic failover is _dangerous_,
> because you can strand records. Many commercial systems do this with
> what they describe as conflict management; in truth, this usually
> amounts to marking _both_ records as suspect and letting the DBA
> decide what to do.

Of course, that's right. Nevertheless, it is hard to imagine anything better than master-slave replication as a backup/recovery mechanism. For example, how much more data would you lose during a file system copy? Much of the overall reliability depends on the details of the replication and failover mechanisms:

- You need to copy the master DB to the slave at a good speed and with a high frequency, in order not to leave behind any unsaved data. This means having good hardware and a good network.
- The failover mechanism should be designed to flush all unsaved data before shutting down the master and replacing it with the slave, if possible. This is often just a matter of waiting for the next replication step.
- Some kind of database "journaling" can help: write down any data-change request sent to the master before actually performing it. This should allow you to spot any unsaved operation at restore time (a sketch of this idea follows this message).
- And so on...

Unfortunately, the only way to guarantee the system against any loss of data would be to use a federation of servers with a federation-wide concept of transactions. That is: a transaction does not commit until all federated servers have performed the required change to the data. Bettina Kemme's replication engine seems to evolve in this direction but, as far as I know, no PostgreSQL replication system actually implements this mechanism yet.

See you.
-----------------------------------------
Alessandro Bottoni and Silvana Di Martino
alessandrobottoni@interfree.it
silvanadimartino@tin.it
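A rough illustration of the client-side "journaling" idea from the list above: record every data-change request durably, on a separate disk, before sending it to the master, so that anything not yet replicated can be replayed after a failover. The journal path, the example statement and the psycopg2 driver are assumptions made for the sketch.

import os
import psycopg2

JOURNAL = "/var/log/db-journal/statements.log"   # should live on a different disk from the DB

def journaled_execute(conn, sql, params=()):
    # 1. Record the request durably before touching the master.
    log = open(JOURNAL, "a")
    log.write(repr((sql, params)) + "\n")
    log.flush()
    os.fsync(log.fileno())
    log.close()
    # 2. Only then perform the change on the master.
    cur = conn.cursor()
    cur.execute(sql, params)
    conn.commit()

# Hypothetical usage:
# conn = psycopg2.connect("host=master dbname=mydb")
# journaled_execute(conn,
#     "UPDATE accounts SET balance = balance - %s WHERE id = %s", (100, 42))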
On Sun, Mar 07, 2004 at 08:54:07AM +0000, Silvana Di Martino wrote:
>
> Of course, that's right. Nevertheless, it is hard to imagine anything better
> than master-slave replication as a backup/recovery mechanism. For example,
> how much more data would you lose during a file system copy?

This actually depends on how the PITR mechanism is implemented. A common approach is to do a filesystem backup followed by log shipping to multiple, redundant destinations. In that scenario, you only lose what isn't in the logs, which in the case of Postgres means you only lose uncommitted transactions.

A
--
Andrew Sullivan | ajs@crankycanuck.ca
This work was visionary and imaginative, and goes to show that visionary
and imaginative work need not end up well. --Dennis Ritchie
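As a sketch of the "log shipping to multiple, redundant destinations" idea, assuming some hook hands the archiver each finished write-ahead-log segment (the segment path and destination directories below are invented), the shipping step might look like this:

import shutil

DESTINATIONS = ["/mnt/backup1/wal", "/nfs/offsite/wal"]   # invented, redundant destinations

def ship_segment(segment_path):
    # A segment counts as safely archived only once every copy has succeeded;
    # if any copy raises, the segment must not be recycled yet.
    for dest in DESTINATIONS:
        shutil.copy(segment_path, dest)

Combined with a periodic filesystem-level base backup, replaying these segments is what lets a restore lose only uncommitted transactions, as described above.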
On Saturday, Mar 6, 2004, at 08:39 US/Eastern, Silvana Di Martino wrote:
>
> Making a copy of your data directory while your server is running is
> definitely a bad idea. Forget it. Backing up the file system used by an
> RDBMS usually requires taking the server off-line and flushing all the
> unwritten data to disk. This cannot be done in a 24/7 environment.
>
>> What I'd like to see is a transaction-aware backup which backs up
>> indexes as well as data.

Well, at work we're using a commercial database called Progress. It has a very nice backup utility which creates a backup file as follows: all current database blocks are backed up as they are at the time the backup starts. Each DB block has a unique identifier. If a running transaction wants to write to a block which has not yet been backed up, the transaction is put on hold, the block is backed up, and the transaction can then continue. If you restore a database which contains blocks of incomplete transactions, these transactions are backed out when you start the DB server (broker).

The DB can also (though not by default) maintain an after-image log which contains all transactions since the last backup. This log needs to be kept on a drive different from the DB's. In case the DB drive crashes, you can recover from your last backup and then roll forward the after-image log.

Is there any slight possibility that some mechanism like this could make it into PostgreSQL at some stage (maybe version 8)?

Best regards,
Chris
--
Chris Ruprecht
Network Grunt and Bit Pusher extraordinaire
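A toy model of the block-level mechanism described above, purely to illustrate the idea (hold a writer until the block it wants to modify has been saved); it is not how Progress, or PostgreSQL, is actually implemented.

import threading

class HotBackup:
    def __init__(self, blocks, backup_file):
        self.blocks = blocks          # block_id -> block contents
        self.backup_file = backup_file
        self.backed_up = set()        # ids of blocks already in the backup
        self.lock = threading.Lock()

    def _backup_block(self, block_id):
        self.backup_file.write(self.blocks[block_id])
        self.backed_up.add(block_id)

    def write_block(self, block_id, new_contents):
        # Called by a transaction that wants to modify a block. The writer is
        # held while the pre-image of a not-yet-saved block is backed up.
        with self.lock:
            if block_id not in self.backed_up:
                self._backup_block(block_id)
            self.blocks[block_id] = new_contents

    def run_backup(self):
        # Sweep through all blocks, skipping those already saved by writers;
        # the result reflects the state at the moment the backup started.
        for block_id in sorted(self.blocks):
            with self.lock:
                if block_id not in self.backed_up:
                    self._backup_block(block_id)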