Discussion: Re: [HACKERS] Point in Time Recovery
PITR Patch v5_1 just posted has Point in Time Recovery working....

Still some rough edges....but we really need some testers now to give this a try and let me know what you think.

Klaus Naumann and Mark Wong are the only [non-committers] to have tried to run the code (and let me know about it), so please have a look at [PATCHES] and try it out.

Many thanks,

Simon Riggs
Can you give us some suggestions of what kind of stuff to test? Is there a way we can artificially kill the backend in all sorts of nasty spots to see if recovery works? Does kill -9 simulate a 'power off'?

Chris

Simon Riggs wrote:
> PITR Patch v5_1 just posted has Point in Time Recovery working....
>
> Still some rough edges....but we really need some testers now to give
> this a try and let me know what you think.
>
> Klaus Naumann and Mark Wong are the only [non-committers] to have tried
> to run the code (and let me know about it), so please have a look at
> [PATCHES] and try it out.
>
> Many thanks,
>
> Simon Riggs

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend
On Wed, 2004-07-14 at 03:31, Christopher Kings-Lynne wrote:
> Can you give us some suggestions of what kind of stuff to test? Is
> there a way we can artificially kill the backend in all sorts of nasty
> spots to see if recovery works? Does kill -9 simulate a 'power off'?

I was hoping some fiendish plans would be presented to me...

But please start with "this feels like typical usage" and we'll go from there...the important thing is to try the first one.

I've not done power off tests yet. They need to be done just to check...though actually you don't need to do that to test PITR...

We need to do exhaustive tests of...
- power off
- scp and cross network copies
- all the permuted recovery options
- archive_mode = off (i.e. current behaviour)
- deliberately incorrectly set options (idiot-proof testing)

I'd love some help assembling a test document with numbered tests...

Best regards, Simon Riggs
Simon Riggs <simon@2ndquadrant.com> writes:
> I've not done power off tests, yet. They need to be done just to
> check...actually you don't need to do this to test PITR...

I agree, power off is not really the point here. What we need to check into is (a) the mechanics of archiving WAL segments and (b) the process of restoring given a backup and a bunch of WAL segments.

			regards, tom lane
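The two mechanics Tom names can be exercised even without a running server, since archive_command and restore_command are just shell commands given a path (%p) and a file name (%f). A minimal sketch of the round trip, using plain files in scratch directories (all paths here are stand-ins, not real cluster paths):

```shell
#!/bin/sh
# Sketch: exercise the archive/restore round trip that an
# archive_command / restore_command pair must implement, using
# plain files in temporary directories (no running server assumed).
set -e

workdir=$(mktemp -d)
pg_xlog="$workdir/pg_xlog"        # stands in for $PGDATA/pg_xlog
archive="$workdir/archive"        # stands in for the WAL archive area
restore="$workdir/restore"        # where recovery would re-create the segment
mkdir -p "$pg_xlog" "$archive" "$restore"

# Fake a filled WAL segment (contents are arbitrary for this test).
seg=000000010000000000000003
echo "fake WAL contents" > "$pg_xlog/$seg"

# What the server does via archive_command: push %p into the archive as %f.
cp "$pg_xlog/$seg" "$archive/$seg"

# What recovery does via restore_command: fetch %f from the archive into %p.
cp "$archive/$seg" "$restore/$seg"

# Verify the segment survived the round trip intact.
cmp "$pg_xlog/$seg" "$restore/$seg" && echo "round trip OK"
```

Substituting scp, tape, or CD commands for the cp calls gives the other cases on Simon's test list.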
On 14 Jul, Simon Riggs wrote:
> PITR Patch v5_1 just posted has Point in Time Recovery working....
>
> Still some rough edges....but we really need some testers now to give
> this a try and let me know what you think.
> [...]

Simon,

I just tried applying the v5_1 patch against the cvs tip today and got a couple of rejections. I'll copy the patch output here. Let me know if you want to see the reject files or anything else:

$ patch -p0 < ../../../pitr-v5_1.diff
patching file backend/access/nbtree/nbtsort.c
Hunk #2 FAILED at 221.
1 out of 2 hunks FAILED -- saving rejects to file backend/access/nbtree/nbtsort.c.rej
patching file backend/access/transam/xlog.c
Hunk #11 FAILED at 1802.
Hunk #15 FAILED at 2152.
Hunk #16 FAILED at 2202.
Hunk #21 FAILED at 3450.
Hunk #23 FAILED at 3539.
Hunk #25 FAILED at 3582.
Hunk #26 FAILED at 3833.
Hunk #27 succeeded at 3883 with fuzz 2.
Hunk #28 FAILED at 4446.
Hunk #29 succeeded at 4470 with fuzz 2.
8 out of 29 hunks FAILED -- saving rejects to file backend/access/transam/xlog.c.rej
patching file backend/postmaster/Makefile
patching file backend/postmaster/postmaster.c
Hunk #3 succeeded at 1218 with fuzz 2 (offset 70 lines).
Hunk #4 succeeded at 1827 (offset 70 lines).
Hunk #5 succeeded at 1874 (offset 70 lines).
Hunk #6 succeeded at 1894 (offset 70 lines).
Hunk #7 FAILED at 1985.
Hunk #8 succeeded at 2039 (offset 70 lines).
Hunk #9 succeeded at 2236 (offset 70 lines).
Hunk #10 succeeded at 2996 with fuzz 2 (offset 70 lines).
1 out of 10 hunks FAILED -- saving rejects to file backend/postmaster/postmaster.c.rej
patching file backend/storage/smgr/md.c
Hunk #1 succeeded at 162 with fuzz 2.
patching file backend/utils/misc/guc.c
Hunk #1 succeeded at 342 (offset 9 lines).
Hunk #2 succeeded at 1387 (offset 9 lines).
patching file backend/utils/misc/postgresql.conf.sample
Hunk #1 succeeded at 113 (offset 10 lines).
patching file bin/initdb/initdb.c
patching file include/access/xlog.h
patching file include/storage/pmsignal.h
On Wed, 2004-07-14 at 16:55, markw@osdl.org wrote:
> I just tried applying the v5_1 patch against the cvs tip today and got a
> couple of rejections. I'll copy the patch output here. Let me know if
> you want to see the reject files or anything else:

I'm on it. Sorry 'bout that all - midnight fingers.
I noticed that compiling with the 5_1 patch applied fails due to XLOG_archive_dir being removed from xlog.c, but src/backend/commands/tablecmds.c still uses it.

I did the following to tablecmds.c:

5408c5408
< extern char XLOG_archive_dir[];
---
> extern char *XLogArchiveDest;
5410c5410
< use_wal = XLOG_archive_dir[0] && !rel->rd_istemp;
---
> use_wal = XLogArchiveDest[0] && !rel->rd_istemp;

Now I have to see if I have broken it with this change :-)

regards

Mark

Simon Riggs wrote:
> I'm on it. Sorry 'bout that all - midnight fingers.
On Thu, 2004-07-15 at 02:43, Mark Kirkwood wrote:
> I noticed that compiling with 5_1 patch applied fails due to
> XLOG_archive_dir being removed from xlog.c, but
> src/backend/commands/tablecmds.c still uses it.
> [...]

Yes, I discovered that myself. The fix is included in pitr_v5_2.patch...

Your patch follows the right thinking and looks like it would have worked...
- XLogArchiveMode carries the main bool value for mode on/off
- XLogArchiveDest might also be used, though best to use the mode

Thanks for looking through the code...

Best Regards, Simon Riggs
Simon Riggs wrote:
> We need to exhaustive tests of...
> - power off
> - scp and cross network copies
> - all the permuted recovery options
> - archive_mode = off (i.e. current behaviour)
> - deliberately incorrectly set options (idiot-proof testing)

It would also be good to write up how to perform these tests, since that would show which problems PITR addresses. I mean, I know that it addresses a power off, but how will I recover from one?

Regards
Gaetano Mendola
Here is one for the 'idiot proof' category:

1) initdb and set archive_command
2) shutdown
3) do a backup
4) startup and run some transactions
5) shutdown and remove PGDATA
6) restore backup
7) startup

Obviously this does not work, as the backup is performed with the database shut down. This got me wondering, for 2 reasons:

1) Some alternative database servers *require* a procedure like this to enable their version of PITR - so the potential foot-gun thing is there.
2) Is it possible to make the recovery kick in even though pg_control says the database state is shutdown?

Simon Riggs wrote:
> I was hoping some fiendish plans would be presented to me...
>
> But please start with "this feels like typical usage" and we'll go from
> there...the important thing is to try the first one.
> [...]
Mark Kirkwood <markir@coretech.co.nz> writes:
> Here is one for the 'idiot proof' category:
> 1) initdb and set archive_command
> 2) shutdown
> 3) do a backup
> 4) startup and run some transactions
> 5) shutdown and remove PGDATA
> 6) restore backup
> 7) startup
> Obviously this does not work as the backup is performed with the
> database shutdown.

Huh? It works fine.

The bit you may be missing is that if you blow away $PGDATA including pg_xlog/, you won't be able to recover past whatever you have in your WAL archive area. The archive is certainly not going to include the current partially-filled WAL segment, and it might be missing a few earlier segments if the archival process isn't speedy. So you need to keep those recent segments in pg_xlog/ if you want to recover to current time or near-current time.

I'm becoming more and more convinced that we should bite the bullet and move pg_xlog/ to someplace that is not under $PGDATA. It would just make things a whole lot more reliable, both for backup and to deal with scenarios like yours above. I tried to talk Bruce into this on the phone the other day, but he wouldn't bite. I still think it's a good idea though. It would

(1) eliminate the problem that a tar backup of $PGDATA would restore stale copies of xlog segments, because the tar wouldn't include pg_xlog in the first place.

(2) eliminate the problem that a naive "rm -rf $PGDATA" would blow away xlog segments that you still need.

A possible compromise is that we should strongly suggest that pg_xlog be pushed out to another place and symlinked if you are going to use WAL archiving. That's already considered good practice for performance if you have a separate disk spindle to put WAL on. It'll just have to be good practice for WAL archiving too.

			regards, tom lane
I think we should push the partially complete WAL file to the archive location before shutdown. I talked to you or Jan about it and you (or Jan) wouldn't bite either, but I think when someone shuts down, they assume they have things fully archived and can recover fully with a previous backup and the archive files.

When you are running and finally fill up the WAL file it would then overwrite the one in the archive, but I think that is OK. Maybe we would need to give it a special file extension so we only use it when we don't have a full version.

---------------------------------------------------------------------------

Tom Lane wrote:
> The bit you may be missing is that if you blow away $PGDATA including
> pg_xlog/, you won't be able to recover past whatever you have in your WAL
> archive area. The archive is certainly not going to include the current
> partially-filled WAL segment, and it might be missing a few earlier
> segments if the archival process isn't speedy.
> [...]
> A possible compromise is that we should strongly suggest that pg_xlog
> be pushed out to another place and symlinked if you are going to use
> WAL archiving.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Well that is interesting :-)

Here is what I am doing on the removal front (I am keeping pg_xlog *now*):

$ cd $PGDATA
$ pg_ctl stop
$ ls|grep -v pg_xlog|xargs rm -rf

The contents of the archive directory just before recovery starts:

$ ls -l $PGDATA/../7.5-archive
total 49212
-rw------- 1 postgres postgres 16777216 Jul 22 14:59 000000010000000000000000
-rw------- 1 postgres postgres 16777216 Jul 22 14:59 000000010000000000000001
-rw------- 1 postgres postgres 16777216 Jul 22 14:59 000000010000000000000002

But here is the recovery startup log:

LOG: database system was shut down at 2004-07-22 14:58:57 NZST
LOG: starting archive recovery
LOG: restore_command = "cp /data1/pgdata/7.5-archive/%f %p"
cp: cannot stat `/data1/pgdata/7.5-archive/00000001.history': No such file or directory
LOG: restored log file "000000010000000000000000" from archive
LOG: checkpoint record is at 0/A4D3E8
LOG: redo record is at 0/A4D3E8; undo record is at 0/0; shutdown TRUE
LOG: next transaction ID: 496; next OID: 17229
LOG: archive recovery complete
LOG: database system is ready

regards

Mark

Tom Lane wrote:
> Huh? It works fine.
>
> The bit you may be missing is that if you blow away $PGDATA including
> pg_xlog/, you won't be able to recover past whatever you have in your WAL
> archive area.
> [...]
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I think we should push the partially complete WAL file to the archive
> location before shutdown. ...
> When you are running and finally fill up the WAL file it would then
> overwrite the one in the archive but I think that is OK.

I don't think this can fly at all. Here are some off-the-top-of-the-head objections:

1. We don't have the luxury of spending indefinite amounts of time to do a database shutdown. Commonly we are under a twenty-second sentence of death from init. I don't want to spend the 20 seconds waiting to see if the archiver will manage to push 16MB onto a slow tape drive. Also, if the archiver does fail to push the data in time, it'll likely leave a broken (partial) xlog file in the archive, which would be really bad news if the user then relies on that.

2. What if the archiver process entirely fails to push the file? (Maybe there's not enough disk space, for instance.) In normal operation we'll just retry every so often. We definitely can't do that during shutdown.

3. You're blithely assuming that the archival process can easily provide overwrite semantics for multiple pushes of the same xlog filename. Stop thinking about "cp to some directory" and start thinking "dump to tape" or "burn onto CD" or something like that. We'll be raising the ante considerably if we require the archive_command to deal with this.

I think the last one is really the most significant issue. We have to keep the archiver API as simple as possible.

			regards, tom lane
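The flip side of point 3 is that the simple copy-to-directory case should itself refuse overwrites, so a second push of the same segment name fails loudly rather than clobbering the archive. A hypothetical wrapper (the archive_wal function name is made up for illustration; the logic is exercised inline on scratch directories):

```shell
#!/bin/sh
# Sketch: an overwrite-refusing archive step for the plain
# copy-to-directory case.  A hypothetical archive_command might invoke
# a script with this logic as:  archive_command = '... %p %f'
set -e

archive=$(mktemp -d)              # stands in for the archive directory
src=$(mktemp -d)                  # stands in for $PGDATA/pg_xlog
seg=000000010000000000000007
echo "segment contents" > "$src/$seg"

# $1 = full path to the segment (%p), $2 = its file name (%f).
archive_wal() {
    # Return non-zero if the name already exists, so the server
    # sees a failure instead of a silent overwrite.
    test ! -f "$archive/$2" && cp "$1" "$archive/$2"
}

archive_wal "$src/$seg" "$seg" && echo "first push OK"
if archive_wal "$src/$seg" "$seg"; then
    echo "second push unexpectedly succeeded"
else
    echo "second push refused"
fi
```

Tape or CD targets would need their own equivalents, which is exactly why the API itself should not demand overwrite semantics.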
Agreed, it might not be possible, but your report does point out a limitation in our implementation --- that a shutdown database contains more information than a backup and the archive logs. That is not intuitive. In fact, if you shut down your database and want to reproduce it on another machine, how do you do it? Seems you have to copy the pg_xlog directory over to the new machine. In fact, moving pg_xlog to a new location doesn't make that clear either. Seems documentation might be the only way to make this clear.

One idea would be to just push the partial WAL file to the archive on server shutdown, not reuse it, and start with a new WAL file on startup. At least for a normal system shutdown this will give us an archive that contains all the information that is in pg_xlog.

---------------------------------------------------------------------------

Tom Lane wrote:
> I don't think this can fly at all. Here are some off-the-top-of-the-head
> objections:
> [...]
> I think the last one is really the most significant issue. We have to
> keep the archiver API as simple as possible.
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Agreed, it might not be possible, but your report does point out a
> limitation in our implementation --- that a shutdown database contains
> more information than a backup and the archive logs. That is not
> intuitive.

That's only because you are clinging to the broken assumption that pg_xlog/ is part of the database, rather than part of the logs. Separate that out as a distinct entity, and all gets better.

			regards, tom lane
Tom Lane wrote:
> That's only because you are clinging to the broken assumption that
> pg_xlog/ is part of the database, rather than part of the logs.
> Separate that out as a distinct entity, and all gets better.

Imagine this. I stop the server. I have a tar backup and a copy of the archive. I should be able to take them to another machine and recover the system to the point I stopped. You are saying I need a copy of the pg_xlog directory too, and I need to remove pg_xlog after I untar the data directory and put the saved pg_xlog in there before I recover.

Should we create a server-side function that forces all WAL files to the archive, including partially written ones? Maybe that fixes the problem with people deleting pg_xlog before they untar. You tell them to run the function before recovery. If the system can't be started, then it is possible the WAL files are no good too, not sure.
On Thu, 2004-07-22 at 04:29, Tom Lane wrote:
> I don't think this can fly at all. Here are some off-the-top-of-the-head
> objections:
> [...]
> I think the last one is really the most significant issue. We have to
> keep the archiver API as simple as possible.

I've not read the whole chain of conversation...but this idea came up before and was rejected then. I agree with the 3 objections to that thought above.

There's already enough copies of full xlogs around to worry about. If you need more granularity, reduce the size of xlog files....

(Tom, SUID would be the correct timeline id in that situation?)

More later, Simon Riggs
Mark Kirkwood <markir@coretech.co.nz> writes:
> 2) Is is possible to make the recovery kick in even though pg_control
> says the database state is shutdown?

Yeah, I think you are right: presence of recovery.conf should force a WAL scan even if pg_control claims it's shut down. Fix committed.

			regards, tom lane
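With that fix, dropping a recovery.conf into the restored data directory is what arms archive recovery, even after a clean shutdown. A minimal file, reusing the archive path from Mark's test log (your path will differ), might look like:

```
# recovery.conf, placed in the restored $PGDATA before startup
restore_command = 'cp /data1/pgdata/7.5-archive/%f %p'
```

On startup the server sees the file, replays whatever the restore_command can fetch, and renames the file when recovery completes.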
Excellent - just updated and it is all good!

This change makes the whole "how do I do my backup" business nice and basic - which is the right way IMHO.

regards

Mark

Tom Lane wrote:
> Yeah, I think you are right: presence of recovery.conf should force a
> WAL scan even if pg_control claims it's shut down. Fix committed.
On Thu, 2004-07-22 at 21:19, Tom Lane wrote:
> Yeah, I think you are right: presence of recovery.conf should force a
> WAL scan even if pg_control claims it's shut down. Fix committed.

This *should* be possible but I haven't tested it.

There is a code path on secondary checkpoints that indicates that crash recovery can occur even when the database was shut down, since the code forces recovery whether it was or not. On that basis, this may work, but is as yet untested. I didn't mention this because it might interfere with getting hot backup to work...

Best Regards, Simon Riggs
I have tested the "cold" backup - and retested my previous scenarios using "hot" backup (just to be sure). They all work AFAICS!

cheers

Mark

Simon Riggs wrote:
> This *should* be possible but I haven't tested it.
>
> There is a code path on secondary checkpoints that indicates that crash
> recovery can occur even when the database was shutdown, since the code
> forces recovery whether it was or not. On that basis, this may work, but
> is yet untested.
> [...]
Simon Riggs <simon@2ndquadrant.com> writes:
> On Thu, 2004-07-22 at 21:19, Tom Lane wrote:
>> Yeah, I think you are right: presence of recovery.conf should force a
>> WAL scan even if pg_control claims it's shut down. Fix committed.

> This *should* be possible but I haven't tested it.

I did. It's really not risky. The fact that the code doesn't look beyond the checkpoint record when things seem to be kosher is just a speed optimization (and probably a rather pointless one...) We have got to be able to detect the end of WAL in any case, so we'd just find there are no more records and stop.

			regards, tom lane
On Fri, 2004-07-23 at 01:05, Mark Kirkwood wrote:
> I have tested the "cold" backup - and retested my previous scenarios
> using "hot" backup (just to be sure). They all work AFAICS!

Yes, I'll drink to that! Thanks for your help.

Best Regards, Simon Riggs
Here is another open PITR issue that I think will have to be addressed in 7.6. If you do a critical transaction, but do nothing else for eight hours, that critical transaction hasn't been archived yet. It is still sitting in pg_xlog until the WAL file fills.

I think we will need to document this behavior and address it in some way in 7.6. We can't assume that we can send multiple copies of a pg_xlog file to the archive (partial and full ones) because we might be going to a tape drive. However, this is a non-intuitive behavior of our archiver. We might need to tell people to archive the most recent WAL file every minute to some other location or something.

---------------------------------------------------------------------------

Tom Lane wrote:
> I don't think this can fly at all. Here are some off-the-top-of-the-head
> objections:
> [...]
> 3. You're blithely assuming that the archival process can easily provide
> overwrite semantics for multiple pushes of the same xlog filename. Stop
> thinking about "cp to some directory" and start thinking "dump to tape"
> or "burn onto CD" or something like that. We'll be raising the ante
> considerably if we require the archive_command to deal with this.
>
> I think the last one is really the most significant issue. We have to
> keep the archiver API as simple as possible.
On 2004-07-28, Bruce Momjian <pgman@candle.pha.pa.us> wrote:
> Here is another open PITR issue that I think will have to be addressed
> in 7.6. If you do a critical transaction, but do nothing else for eight
> hours, that critical transaction hasn't been archived yet. It is still
> sitting in pg_xlog until the WAL file fills.
>
> I think we will need to document this behavior and address it in some
> way in 7.6. We can't assume that we can send multiple copies of pg_xlog
> to the archive (partial and full ones) because we might be going to a

If a particular transaction is so important that it absolutely positively needs to be archived offline for PITR, then why not just mark it that way, or allow the application to trigger archival of this critical REDO?

> tape drive. However, this is a non-intuitive behavior of our archiver.
> We might need to tell people to archive the most recent WAL file every
> minute to some other location or something.

[deletia]

-- 
Negligence will never equal intent, no matter how you attempt to distort reality to do so. This is what separates the real butchers from average Joes (or Fritzes) caught up in events not in their control.