Обсуждение: purpose/use of pg_start_backup() pg_stop_backup()
on the surface, the functions pg_start_backup() and pg_stop_backup() seem to indicate something very useful. as i understand it, i could do: psql -c "pg_start_backup('$(date +%Y%m%d%H%M%S)');" snapshot/dump $PGDATA psql -c "pg_stop_backup();" and this would then create a nice clean snapshot. however, i'm trying to work out what the restore process is. its not clear to me if one needs to do anything after restoring the dump. or, if there is some sort of incremental process that needs to be followed after the dump. can someone provide a walk-through of using these functions both for a backup and a restore? -- Jim Mercer Reptilian Research jim@reptiles.org +1 416 410-5633 "He who dies with the most toys is nonetheless dead"
On Apr 6, 2013, at 12:46 PM, Jim Mercer wrote:
can someone provide a walk-through of using these functions both for
a backup and a restore?
These functions aren't actually doing anything overly magical. They don't cause you to end up with a "nice clean snapshot", at least not right away. They're mostly just helping with the housekeeping tasks of making sure that in the future you have kept around the WAL segments that will be necessary to perform a replay of WAL segments to a consistent state. The pg_start_backup function is forcing a checkpoint in the database, while the pg_stop_backup function is executing an immediate WAL segment switch so that it can write the current one to the archive. The pg_start_backup function also writes a backup_label file to disk that can be used by a future system recovering from your file system archive to know from where to start replaying WAL segments to reach a consistent state. You want to wait for pg_start_backup to finish before you start making your archive and wait until it is finished before invoking pg_stop_backup (which itself will block until it is satisfied that WAL segments have been archived). Another piece to all this is putting a recovery.conf file in the unpacked directory that provides a restore_command that tells the system how to retrieve a WAL segment. Ordinarily when a PostgreSQL system experiences some kind of crash and has to regain a consistent state, it does so by replaying WAL segments from a safe "redo" location (relying on the properties of WAL segment application), and it retrieves these segments from pg_xlog. When you're recovering a system from an archive, however, you'll want to have created the archive without capturing the pg_xlog directory, and furthermore (obviously) time has marched on from the point that you did the archive, so necessary segments won't be there. Consequently you have to tell the system how to request WAL segments. And you'll need to have been previously putting these segments somewhere by configuring your master database to archive them by setting archive_mode and archive_command. The documentation is fairly complete, but there are a few places where it's critical to be careful, so give it a thorough read-through a few times. Also make sure that you regularly perform restore operations and have good tests to convince yourself that it all works. Meanwhile, I recently found myself in the position of wanting to perform base backups off of a PostgreSQL-8.4 server that was in warm standby so as to take that I/O strain off of the production server after having set up PITR. I was told "not possible", but with some deep dives into the documentation and some testing, I've ended up with procedures that work and are reasonably simple: * I invoke the pg_control shell command and extract the "redo" location * I copy a snapshot of ${PGDATA}/global/pg_control to ${PGDATA}/global/pg_control_snapshot_xxxxxx * I create a tarball of ${PGDATA} and name the archive in a way that includes the redo location's address * When I do an archive restore, I copy the snapshotted pg_control file back to where it should be; the recovery.conf fileis already conveniently there * When I am managing my WAL segment archive, I do so with a script that is invoked with an archive's name and that trimsonly WAL segments before the redo location I had to do this because you cannot run the pg_start_backup and pg_stop_backup commands on a warm standby. Hot standby is not an option until 9.x and I haven't yet had the luxury to make that jump. So, not magic... The two really crucial pieces of this are: * by capturing the redo location and snapshotting the pg_control file, at worst you make Postgres replay from WAL segmentsfurther back than was necessary * the pg_control file is 8K and written atomically; as I understand, this is necessary to ensure that PostgreSQL is recoverablefrom power loss or crashes For anyone else reading this thread, consider it a follow up for my query back in November... http://www.postgresql.org/message-id/20121129015217.GA9975@raptor.commandosoftware.com Repeated tests seem to indicate that the procedure is sound, but I'd love to hear if anyone thinks this is preposterous, since I've got a production system relying on it and I am aware that absence of proof does not constitute proof of absence. I initially got really confusing results when I tried to tell tar to capture pg_control first, and thought I had, but hadn't actually. The database _seemingly_ recovered to a consistent state, even to the point of accepting connections and having _some_ queries run correctly, but for certain SELECT queries the system would complain about missing pg_clog files, which at first had me scared because I found old posts about PostgreSQL having some bugs pertaining to the management of pg_clog with WAL, but I later realized my pg_control snapshotting failure, and I haven't since seen the pg_clog issue, which I hope means that I am in the clear. But, Jim, if you're on the 9.x series, or if you're taking your regular dumps from the master system, then you don't have to worry overly much about the latter part of this email, except in that it might help demystify some of what PostgreSQL is doing for you. -- AWG On Sat, Apr 06, 2013 at 12:46:05PM -0400, Jim Mercer wrote: > > on the surface, the functions pg_start_backup() and pg_stop_backup() seem to > indicate something very useful. > > as i understand it, i could do: > > psql -c "pg_start_backup('$(date +%Y%m%d%H%M%S)');" > snapshot/dump $PGDATA > psql -c "pg_stop_backup();" > > and this would then create a nice clean snapshot. > > however, i'm trying to work out what the restore process is. > > its not clear to me if one needs to do anything after restoring the dump. > > or, if there is some sort of incremental process that needs to be followed > after the dump. > > can someone provide a walk-through of using these functions both for > a backup and a restore? > > -- > Jim Mercer Reptilian Research jim@reptiles.org +1 416 410-5633 > "He who dies with the most toys is nonetheless dead" > > > -- > Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-admin