Discussion: Improving Physical Backup/Restore within the Low Level API

Improving Physical Backup/Restore within the Low Level API

From
"David G. Johnston"
Date:
Hi!

This email is a first pass at a user-visible design for how our backup and restore process, as enabled by the Low Level API, can be modified to make it more mistake-proof.  In short, it requires pg_start_backup to further expand upon what it means for the system to be in the midst of a backup, pg_stop_backup to reverse those things, and modifying the startup process to deal with the server having crashed while the system is in that backup state.  Notes at the end extend the design to handle concurrent backups.

The core functional changes are:
1) pg_backup_start modifies a newly added "in backup" state flag in pg_control to on.
2) pg_backup_stop modifies that flag back to off.
3) postmaster will refuse to start if that flag is on, unless one of:
  a) crash.signal exists in the data directory
  b) recovery.signal exists in the data directory
  c) standby.signal exists in the data directory
4) Signal file processing causes the in-backup flag in pg_control to be set to off

The newly added crash.signal file is required to handle the case where the server crashes after pg_backup_start and before pg_backup_stop.  It initiates a crash recovery of the instance just as is done today but with the added change of flipping the flag to off when recovery is complete just before going live.

The error message for a failed startup while in backup will tell the DBA that one of the three signal files must exist.
When processing recovery.signal or standby.signal, the presence of the backup_label and tablespace_map files is mandatory, and the system will also fail to start should they be missing.
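
As a rough illustration of points 3 and 4 and the two rules above, the startup checks could be modeled as a small decision function. This is a sketch in Python, not PostgreSQL code: `startup_decision`, its arguments, and the returned tuples are hypothetical names. Only the file names and the "in backup" flag come from the proposal. The precedence among signal files when several are present is an assumption; the proposal does not state one.

```python
# Assumed precedence when more than one signal file is present (not specified in the proposal).
SIGNAL_PRECEDENCE = ("standby.signal", "recovery.signal", "crash.signal")
REQUIRED_FOR_RESTORE = ("backup_label", "tablespace_map")


def startup_decision(in_backup, files):
    """Model the proposed postmaster checks.

    in_backup: value of the proposed "in backup" flag in pg_control.
    files: set of file names present in the root of the data directory.
    Returns an (action, detail) tuple.
    """
    signal = next((s for s in SIGNAL_PRECEDENCE if s in files), None)
    if signal is None:
        if in_backup:
            # Point 3: refuse to start without one of the three signal files.
            return ("refuse", "one of crash.signal, recovery.signal or standby.signal must exist")
        return ("start", "normal startup")
    if signal != "crash.signal":
        # recovery.signal and standby.signal require both backup files.
        missing = [f for f in REQUIRED_FOR_RESTORE if f not in files]
        if missing:
            return ("refuse", "missing " + ", ".join(missing))
    # Point 4: signal file processing turns the flag off.
    return ("recover", signal + " processed; in_backup set to off")
```

For example, `startup_decision(True, set())` refuses to start, while `startup_decision(True, {"crash.signal"})` proceeds with recovery and clears the flag.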

For non-functional changes I would also suggest doing the following:
- pg_backup_start will create a "pg_backup_metadata" directory if there is not already one, or will empty it if there is.
- pg_backup_start will create a crash.signal file in that directory.
- pg_backup_stop will create files within pg_backup_metadata upon its completion:
    backup_label
    tablespace_map
    recovery.signal
    standby.signal

All of the instructions regarding what to place in those files should be removed and instead the system should write them - no copy-paste.
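
The on-disk side effects described above could look roughly like the following. This is a minimal Python model under stated assumptions, not server code: `backup_start` and `backup_stop` are hypothetical stand-ins for the SQL functions, the file contents are passed in as placeholders, and the sketch assumes crash.signal is left in place by the stop step, which the proposal does not spell out.

```python
import os
import shutil
import tempfile

METADATA_DIR = "pg_backup_metadata"


def backup_start(data_dir):
    """Create pg_backup_metadata (emptying any previous one) and add crash.signal."""
    meta = os.path.join(data_dir, METADATA_DIR)
    shutil.rmtree(meta, ignore_errors=True)  # empty it if it already exists
    os.makedirs(meta)
    open(os.path.join(meta, "crash.signal"), "w").close()
    return meta


def backup_stop(data_dir, backup_label, tablespace_map):
    """Write the files the system, rather than the user, would produce."""
    meta = os.path.join(data_dir, METADATA_DIR)
    contents = {
        "backup_label": backup_label,      # placeholder text, supplied by the caller
        "tablespace_map": tablespace_map,  # placeholder text, supplied by the caller
        "recovery.signal": "",
        "standby.signal": "",
    }
    for name, text in contents.items():
        with open(os.path.join(meta, name), "w") as f:
            f.write(text)
    return sorted(os.listdir(meta))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as data_dir:
        backup_start(data_dir)
        print(backup_stop(data_dir, "placeholder label", ""))
```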

The instructions would be modified to say "copy the backup_label and tablespace_map files to the root of the backup directory and the recovery and standby signal files to the pg_backup_metadata directory in the backup".  Additionally, we document crash recovery by saying "move crash.signal from pg_backup_metadata to the root of the data directory".  We should explicitly advise excluding or removing pg_backup_metadata/crash.signal from the backup as well.
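
The copy rules in that paragraph can be expressed as a path mapping that a backup script might apply to each file. This is only a sketch: `backup_destination` is a hypothetical helper, paths are relative to the data directory with "/" separators, and it assumes the single-backup layout (no per-process subdirectories).

```python
METADATA_DIR = "pg_backup_metadata"


def backup_destination(rel_path):
    """Return where rel_path should land in the backup, or None to leave it out."""
    parts = rel_path.split("/")
    if parts[0] != METADATA_DIR:
        return rel_path  # ordinary data directory content is copied unchanged
    name = parts[-1]
    if name == "crash.signal":
        return None  # must not end up in the backup
    if name in ("backup_label", "tablespace_map"):
        return name  # copied to the root of the backup
    return rel_path  # recovery.signal and standby.signal stay under pg_backup_metadata
```

With this mapping, `pg_backup_metadata/backup_label` lands at `backup_label`, while `pg_backup_metadata/crash.signal` is skipped.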

Extending the above to handle concurrent backups, for pg_control we'd still use the on/off flag, but we have to have a shared in-memory session lock on something so that only the last surviving process actually changes it to off, while also dealing with sessions that terminate without issuing pg_backup_stop and without the server itself crashing. (I'm unfamiliar with how this is handled today but I presume a mechanism exists already that just needs to be extended.)
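
The "last one out turns the flag off" bookkeeping might be modeled like this. It is a Python sketch of the idea only: `BackupState` and its methods are hypothetical, a `threading.Lock` stands in for whatever shared-memory locking the server would use, and `in_backup` stands in for the pg_control flag.

```python
import threading


class BackupState:
    """Tracks active backup sessions and the proposed in-backup flag."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = set()
        self.in_backup = False  # stand-in for the flag in pg_control

    def start(self, session_id):
        with self._lock:
            self._active.add(session_id)
            self.in_backup = True  # already on if another backup is running

    def stop(self, session_id):
        """Called on pg_backup_stop, and also when a session exits without calling it."""
        with self._lock:
            self._active.discard(session_id)
            if not self._active:
                self.in_backup = False  # only the last surviving backup clears the flag
```

Routing session termination through the same `stop` path is what keeps the flag from sticking on when a client disconnects without finishing its backup.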

For the non-functional stuff, pg_backup_start returns a process id, and subdirectories under pg_backup_metadata are created named with such.  Add a pg_backup_cleanup() function that executes while not in backup mode to clean up those subdirectories.  Any subdirectory in the backup that isn't the specified process id from pg_start_backup should be excluded/removed.
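
A small model of the per-backup subdirectories, assuming each subdirectory of pg_backup_metadata is named after the id returned by pg_backup_start. Both function names are hypothetical, and the cleanup function only reports what it would remove rather than touching the filesystem.

```python
def subdirs_to_exclude(metadata_subdirs, own_backup_id):
    """Subdirectories of pg_backup_metadata that do not belong to this backup."""
    own = str(own_backup_id)
    return sorted(name for name in metadata_subdirs if name != own)


def backup_cleanup(metadata_subdirs, in_backup):
    """Model of the proposed pg_backup_cleanup(): only allowed outside backup mode."""
    if in_backup:
        raise RuntimeError("cannot clean up while a backup is in progress")
    # With no backup running, every remaining subdirectory is stale.
    return sorted(metadata_subdirs)
```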

David J.

Re: Improving Physical Backup/Restore within the Low Level API

From
Laurenz Albe
Date:
On Mon, 2023-10-16 at 09:26 -0700, David G. Johnston wrote:
> This email is a first pass at a user-visible design for how our backup and restore
> process, as enabled by the Low Level API, can be modified to make it more mistake-proof.
> In short, it requires pg_start_backup to further expand upon what it means for the
> system to be in the midst of a backup, pg_stop_backup to reverse those things,
> and modifying the startup process to deal with the server having crashed while the
> system is in that backup state.  Notes at the end extend the design to handle concurrent backups.
>
> The core functional changes are:
> 1) pg_backup_start modifies a newly added "in backup" state flag in pg_control to on.
> 2) pg_backup_stop modifies that flag back to off.
> 3) postmaster will refuse to start if that flag is on, unless one of:
>   a) crash.signal exists in the data directory
>   b) recovery.signal exists in the data directory
>   c) standby.signal exists in the data directory
> 4) Signal file processing causes the in-backup flag in pg_control to be set to off
>
> The newly added crash.signal file is required to handle the case where the server
> crashes after pg_backup_start and before pg_backup_stop.  It initiates a crash recovery
> of the instance just as is done today but with the added change of flipping the flag
> to off when recovery is complete just before going live.

I see a couple of problems and/or things that need clarification with that idea:

- Two backups can run concurrently.  How do you reconcile that with the "in backup"
  flag and crash.signal?
- I guess crash.signal is created during pg_start_backup().  So that file will be
  included in the backup.  How do you handle that during recovery?  Ignore it if
  another signal file is present?  And if the user forgets to create a signal file
  for recovery, how do you prevent PostgreSQL from performing crash recovery?

Yours,
Laurenz Albe



Re: Improving Physical Backup/Restore within the Low Level API

From
"David G. Johnston"
Date:
On Mon, Oct 16, 2023 at 10:26 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> On Mon, 2023-10-16 at 09:26 -0700, David G. Johnston wrote:
> > This email is a first pass at a user-visible design for how our backup and restore
> > process, as enabled by the Low Level API, can be modified to make it more mistake-proof.
> > In short, it requires pg_start_backup to further expand upon what it means for the
> > system to be in the midst of a backup, pg_stop_backup to reverse those things,
> > and modifying the startup process to deal with the server having crashed while the
> > system is in that backup state.  Notes at the end extend the design to handle concurrent backups.
> >
> > The core functional changes are:
> > 1) pg_backup_start modifies a newly added "in backup" state flag in pg_control to on.
> > 2) pg_backup_stop modifies that flag back to off.
> > 3) postmaster will refuse to start if that flag is on, unless one of:
> >   a) crash.signal exists in the data directory
> >   b) recovery.signal exists in the data directory
> >   c) standby.signal exists in the data directory
> > 4) Signal file processing causes the in-backup flag in pg_control to be set to off
> >
> > The newly added crash.signal file is required to handle the case where the server
> > crashes after pg_backup_start and before pg_backup_stop.  It initiates a crash recovery
> > of the instance just as is done today but with the added change of flipping the flag
> > to off when recovery is complete just before going live.
>
> I see a couple of problems and/or things that need clarification with that idea:
>
> - Two backups can run concurrently.  How do you reconcile that with the "in backup"
>   flag and crash.signal?
> - I guess crash.signal is created during pg_start_backup().  So that file will be
>   included in the backup.  How do you handle that during recovery?  Ignore it if
>   another signal file is present?  And if the user forgets to create a signal file
>   for recovery, how do you prevent PostgreSQL from performing crash recovery?


crash.signal is created in the pg_backup_metadata directory, not the root directory.  Should the server crash while any backup is in progress pg_control would be aware of that fact (in_backup=true would still be there, instead of in_backup=false which only comes back after all backups have completed) and the server will not restart without user intervention - specifically, moving the crash.signal file from (one of) the pg_backup_metadata subdirectories to the root directory.  As there is nothing special about the crash.signal files in the pg_backup_metadata subdirectories "touch crash.signal" could be used.

The backed up pg_control file will have in_backup=true (I haven't pondered the torn reads dynamic of this - I'm supposing that placing a copy of pg_control into the pg_backup_metadata directory might be part of solving that problem).

David J.

Re: Improving Physical Backup/Restore within the Low Level API

From
Laurenz Albe
Date:
On Mon, 2023-10-16 at 11:18 -0700, David G. Johnston wrote:
> > I see a couple of problems and/or things that need clarification with that idea:
> >
> > - Two backups can run concurrently.  How do you reconcile that with the "in backup"
> >   flag and crash.signal?
> > - I guess crash.signal is created during pg_start_backup().  So that file will be
> >   included in the backup.  How do you handle that during recovery?  Ignore it if
> >   another signal file is present?  And if the user forgets to create a signal file
> >   for recovery, how do you prevent PostgreSQL from performing crash recovery?
> >
>
> crash.signal is created in the pg_backup_metadata directory, not the root directory.
> Should the server crash while any backup is in progress pg_control would be aware
> of that fact (in_backup=true would still be there, instead of in_backup=false which
> only comes back after all backups have completed) and the server will not restart
> without user intervention - specifically, moving the crash.signal file from (one of)
> the pg_backup_metadata subdirectories to the root directory.  As there is nothing
> special about the crash.signal files in the pg_backup_metadata subdirectories
> "touch crash.signal" could be used.

I see - I missed the part with the pg_backup_metadata directory.

I think it won't meet with favor if there are cases that require manual intervention
for starting the server.  That was the main argument for getting rid of the exclusive
backup API, which had a similar problem.


Also, how do you envision two concurrent backups with your setup?

Yours,
Laurenz Albe



Re: Improving Physical Backup/Restore within the Low Level API

From
"David G. Johnston"
Date:
On Mon, Oct 16, 2023 at 12:09 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> I think it won't meet with favor if there are cases that require manual intervention
> for starting the server.  That was the main argument for getting rid of the exclusive
> backup API, which had a similar problem.

In the rare case of a crash of the source database while one or more backups are in progress.  Restoring the backup requires manual intervention with signal files today.

I get a desire for the live production server to not need intervention to recover from a crash but I can't help but feel that this requirement plus the goal of making this as non-interventionist as possible during recovery are incompatible.  But I haven't given it a great amount of thought as I felt the limited scope and situation were an acceptable cost for keeping the process straightforward (i.e., starting up a backup mode instance requires a signal file that dictates the kind of recovery to perform).  We can either make the live backup contents invalid until something happens after pg_backup_stop ends that makes it valid or we have to make the current system being backed up invalid so long as it's in backup mode.  The latter seemed easier and doesn't require actions outside of our control.


> Also, how do you envision two concurrent backups with your setup?

I don't know if I understand the question - if ensuring that "in backup" is turned on when the first backup starts and is turned off when the last backup ends isn't sufficient for concurrent usage I don't know what else I need to deal with.  Apparently concurrent backups already work today and I'm not seeing how, aside from the process ids for the metadata directories (i.e., the user needs to remove all but their own process subdirectory from pg_backup_metadata) and the state flag, they wouldn't continue to work as-is.

David J.

Re: Improving Physical Backup/Restore within the Low Level API

From
"David G. Johnston"
Date:
On Mon, Oct 16, 2023 at 12:36 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
> On Mon, Oct 16, 2023 at 12:09 PM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
> > I think it won't meet with favor if there are cases that require manual intervention
> > for starting the server.  That was the main argument for getting rid of the exclusive
> > backup API, which had a similar problem.
>
> In the rare case of a crash of the source database while one or more backups are in progress.

Or even more simply, just document that the process executing pg_backup_start, and eventually pg_backup_stop, should, upon noticing its session die out from under it, just add crash.signal to the data directory (there probably can be a bit more intelligence involved in case the session crash was isolated).  A normal server shutdown should remove any crash.signal files it sees (and ensure in_backup="false"...).  A non-normal shutdown is going to end up in crash recovery anyway, so having the signal file there won't harm anything even if pg_control is showing "in_backup=false".

In short, I probably don't know the details well enough to code the solution but this seems solvable for those users that need automatic reboot and crash recovery during an incomplete backup.  But no, by default, and probably so far as pg_basebackup is concerned, a server crash during backup results in requiring outside intervention in order to get the server to restart.  It specifically requires creation of crash.signal, the specific method being unimportant and its contents being fixed - whether empty or otherwise.

David J.

Re: Improving Physical Backup/Restore within the Low Level API

From
Robert Haas
Date:
On Mon, Oct 16, 2023 at 5:21 PM David G. Johnston
<david.g.johnston@gmail.com> wrote:
> But no, by default, and probably so far as pg_basebackup is concerned, a server crash during backup results in requiring outside intervention in order to get the server to restart.

Others may differ, but I think such a proposal is dead on arrival. As
Laurenz says, that's just reinventing one of the main problems with
exclusive backup mode.

The underlying issue here is that, fundamentally, there's no way for
postgres itself to tell the difference between the backup directory on
the primary and an exact copy of it on a standby. There has to be some
mechanism by which the user tells us whether this is the original
directory or a clone of it -- and that's what backup_label,
recovery.signal, and standby.signal are for. Your proposal rejiggers
the details of how we distinguish primary from standby, but it
doesn't, and can't, avoid the need for users to actually follow the
directions, and I don't see why they'd be any more likely to follow
the directions that this proposal would require than the directions
we're giving them now.

I wish I had a better idea here, because the status quo is definitely
not great. The only thought that really occurs to me is that we might
do better if PostgreSQL did more of the work itself and left fewer
steps to the user to perform. If you could click the "take a backup
here" button and the "restore a backup there" button and not think
about what was actually happening, you'd not have the opportunity to
mess up. But, as I understand it, the main motivation for the
continued existence of the low-level API is that the data directory
might be really big, and you might need to clone it using some kind of
special magic that your system has available instead of copying all
the bytes. And that makes it hard to move more of the responsibility
into PostgreSQL itself, because we don't know how that special magic
works.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Improving Physical Backup/Restore within the Low Level API

From
David Steele
Date:
On 10/17/23 14:28, Robert Haas wrote:
> On Mon, Oct 16, 2023 at 5:21 PM David G. Johnston
> <david.g.johnston@gmail.com> wrote:
>> But no, by default, and probably so far as pg_basebackup is concerned, a server crash during backup results in requiring outside intervention in order to get the server to restart.
>
> Others may differ, but I think such a proposal is dead on arrival. As
> Laurenz says, that's just reinventing one of the main problems with
> exclusive backup mode.

I concur -- this proposal resurrects the issues we had with exclusive 
backups without solving the issues being debated elsewhere, e.g. torn 
reads of pg_control or users removing backup_label when they should not.

Regards,
-David



Re: Improving Physical Backup/Restore within the Low Level API

From
"David G. Johnston"
Date:
On Tue, Oct 17, 2023 at 12:30 PM David Steele <david@pgmasters.net> wrote:
> On 10/17/23 14:28, Robert Haas wrote:
> > On Mon, Oct 16, 2023 at 5:21 PM David G. Johnston
> > <david.g.johnston@gmail.com> wrote:
> >> But no, by default, and probably so far as pg_basebackup is concerned, a server crash during backup results in requiring outside intervention in order to get the server to restart.
> >
> > Others may differ, but I think such a proposal is dead on arrival. As
> > Laurenz says, that's just reinventing one of the main problems with
> > exclusive backup mode.
>
> I concur -- this proposal resurrects the issues we had with exclusive
> backups without solving the issues being debated elsewhere, e.g. torn
> reads of pg_control or users removing backup_label when they should not.


Thank you all for the feedback.

Admittedly I don't understand the problem of torn reads well enough to solve it here but I figured by moving the "must not remove" stuff out of backup_label and into pg_control the odds of it being removed from the backup and the backup still booting basically go to zero.  I do agree that renaming backup_label to something like "recovery_stuff_do_not_delete.conf" probably does that just as well without the downside.

Placing a copy of all relevant files into pg_backup_metadata seems like a decent shield against accidents and a way to reliably self-document the backup even if the behavioral changes are not desired.  Though doing that and handling multiple concurrent backups probably makes the cost too high to move away from relying just on documentation.

I suppose I'd consider having to add one file to the data directory to be an improvement over having to remove two of them - in terms of what it takes to recover from system failure during a backup.

David J