Re: Verified fix for Bug 4137
От | Simon Riggs |
---|---|
Тема | Re: Verified fix for Bug 4137 |
Дата | |
Msg-id | 1210072983.4435.263.camel@ebony.site обсуждение исходный текст |
Ответ на | Re: Verified fix for Bug 4137 ("Heikki Linnakangas" <heikki@enterprisedb.com>) |
Список | pgsql-patches |
On Tue, 2008-05-06 at 12:02 +0100, Heikki Linnakangas wrote: > Simon Riggs wrote: > > The problem was that at the very start of archive recovery the %r > > parameter in restore_command could be set to a filename later than the > > currently requested filename (%f). This could lead to early truncation > > of the archived WAL files and would cause warm standby replication to > > fail soon afterwards, in certain specific circumstances. > > > > Fix applied to both core server in generating correct %r filenames and > > also to pg_standby to prevent acceptance of out-of-sequence filenames. > > So the core problem is that we use ControlFile->checkPointCopy.redo in > RestoreArchivedFile to determine the safe truncation point, but when > there's a backup label file, that's still coming from pg_control file, > which is wrong. > > The patch fixes that by determining the safe truncation point as > Min(checkPointCopy.redo, xlogfname), where xlogfname is the xlog file > being restored. That depends on the assumption that everything before > the first file we (try to) restore is safe to truncate. IOW, we never > try to restore file B first, and then A, where A < B. > > I'm not totally convinced that's a safe assumption. As an example, > consider doing an archive recovery, but without a backup label, and the > latest checkpoint record is broken. We would try to read the latest > (broken) checkpoint record first, and call RestoreArchivedFile to get > the xlog file containing that. But because that record is broken, we > fall back to using the previous checkpoint, and will need the xlog file > where the previous checkpoint record is in. > > That's a pretty contrived example, but the point is that assumption > feels fragile. At the very least it should be noted explicitly in the > comments. A less fragile approach would be to use something dummy, like > "000000000000000000000000" as the truncation point until we've > successfully read the checkpoint/restartpoint record and started the replay. Your comments are interesting and I'll think through that some more to see if I can see if that leads to bugs. Will talk more later. The only valid starting place, in all cases, is the checkpoint written by pg_start_backup(). The user doesn't need to have been archiving files prior to that so could legitimately have had a null archive_command. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
В списке pgsql-patches по дате отправления: