Обсуждение: pg_start_backup without checkpoint patch (a part of Synch Rep)
Hi, Attached is the self-contained patch to skip checkpoint at pg_start_backup. This is a part of Synch Rep patches, and was discussed in the following thread. http://archives.postgresql.org/message-id/3f0b79eb0812240710j7e613f3atfd6b6fc27403546e@mail.gmail.com In Synch Rep, we basically have to get a fresh backup, to make the failed server catch up with the primary after failover. But getting an online-backup is expensive operation because of pg_start_backup's checkpoint and data copying. Especially since pg_start_backup is performed as one transaction, its checkpoint might not only increase I/O traffic but also cause long-transaction which prevents VACUUM and HOT. This patch makes pg_start_backup skip a checkpoint if possible, and works out the above problem. Specifically, pg_start_backup uses the last checkpoint instead of doing a new checkpoint if full_page_writes = on since the last checkpoint, which guarantees that all the full-pages required for PITR are written. There is one constraint: XLR_BKP_REMOVABLE flag cannot be marked correctly, because we cannot judge whether the full-pages (generated before pg_start_backup) are removal or not. AFAIK, this flag is only used by pglesslog. Should we add a new option to specify whether checkpoint is skippable, for the backward compatibility? Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Вложения
Fujii Masao wrote: > Attached is the self-contained patch to skip checkpoint at pg_start_backup. > This is a part of Synch Rep patches, and was discussed in the following > thread. > http://archives.postgresql.org/message-id/3f0b79eb0812240710j7e613f3atfd6b6fc27403546e@mail.gmail.com I'm not convinced that this is necessary for the replication patch. It is an orthogonal, new feature, and should be considered for 8.5, IMHO. > Specifically, pg_start_backup uses the last checkpoint instead of doing a > new checkpoint if full_page_writes = on since the last checkpoint, which > guarantees that all the full-pages required for PITR are written. That assumes that the DBA has kept all the WAL segments that have been archived since last checkpoint. So this would no longer be safe: 1. rm <archivedir>/* 2. SELECT pg_start_backup(); 3. tar cvzf backup.tar.gz <datadir> 4. SELECT pg_stop_backup(); -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Hi, On Mon, Dec 29, 2008 at 6:08 PM, Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> wrote: > Fujii Masao wrote: >> >> Attached is the self-contained patch to skip checkpoint at >> pg_start_backup. >> This is a part of Synch Rep patches, and was discussed in the following >> thread. >> >> http://archives.postgresql.org/message-id/3f0b79eb0812240710j7e613f3atfd6b6fc27403546e@mail.gmail.com > > I'm not convinced that this is necessary for the replication patch. It is an > orthogonal, new feature, and should be considered for 8.5, IMHO. Synch Rep forces online-backup in some scenes (e.g. catchup after failover), so I've argued from the beginning that the cost of it should be reduced. I think this patch is one of the solutions to that problem. > >> Specifically, pg_start_backup uses the last checkpoint instead of doing a >> new checkpoint if full_page_writes = on since the last checkpoint, which >> guarantees that all the full-pages required for PITR are written. > > That assumes that the DBA has kept all the WAL segments that have been > archived since last checkpoint. So this would no longer be safe: > > 1. rm <archivedir>/* > 2. SELECT pg_start_backup(); > 3. tar cvzf backup.tar.gz <datadir> > 4. SELECT pg_stop_backup(); Umm... I can't believe that there is the DBA to carry out such unsafe operation. If disk crash occurs between 1 and 2, a database might not recover *regardless of this patch* because of some missing xlogs. I don't think that the patch itself is unsafe. In this patch, the DBA can still judge safely which file can be removed, from a backup history file or the return value of pg_start_backup. Regards, -- Fujii Masao NIPPON TELEGRAPH AND TELEPHONE CORPORATION NTT Open Source Software Center
Fujii Masao wrote: > On Mon, Dec 29, 2008 at 6:08 PM, Heikki Linnakangas > <heikki.linnakangas@enterprisedb.com> wrote: >> Fujii Masao wrote: >>> Attached is the self-contained patch to skip checkpoint at >>> pg_start_backup. >>> This is a part of Synch Rep patches, and was discussed in the following >>> thread. >>> >>> http://archives.postgresql.org/message-id/3f0b79eb0812240710j7e613f3atfd6b6fc27403546e@mail.gmail.com >> I'm not convinced that this is necessary for the replication patch. It is an >> orthogonal, new feature, and should be considered for 8.5, IMHO. > > Synch Rep forces online-backup in some scenes (e.g. catchup after failover), > so I've argued from the beginning that the cost of it should be reduced. Sure, but it's just an optimization. Synch rep still works fine without it. >>> Specifically, pg_start_backup uses the last checkpoint instead of doing a >>> new checkpoint if full_page_writes = on since the last checkpoint, which >>> guarantees that all the full-pages required for PITR are written. >> That assumes that the DBA has kept all the WAL segments that have been >> archived since last checkpoint. So this would no longer be safe: >> >> 1. rm <archivedir>/* >> 2. SELECT pg_start_backup(); >> 3. tar cvzf backup.tar.gz <datadir> >> 4. SELECT pg_stop_backup(); > > Umm... I can't believe that there is the DBA to carry out such unsafe > operation. If disk crash occurs between 1 and 2, a database might > not recover *regardless of this patch* because of some missing xlogs. I use the above rm+start+tar+stop procedure all the time, when testing PITR. You might be taking a one-off online backup, in which case you don't care about the history. See section 24.3.5.1 on "Standalone hot backups": http://www.postgresql.org/docs/8.3/static/continuous-archiving.html#BACKUP-BASE-BACKUP > I don't think that the patch itself is unsafe. In this patch, the DBA > can still judge safely which file can be removed, from a backup history > file or the return value of pg_start_backup. Sure, it could be addressed with appropriate documentation. You'd effectively get the current behavior by doing a manual CHECKPOINT just before pg_start_backup(), for example. It does still make it harder to use the feature correctly, though. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com