Re: trying again to get incremental backup

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: trying again to get incremental backup
Дата
Msg-id CA+TgmoZpBW2DHeeC--y0H3y1RGsUHb14vinZU4yqBAHtWgv-vg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: trying again to get incremental backup  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: trying again to get incremental backup  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
On Thu, Nov 30, 2023 at 9:33 AM Robert Haas <robertmhaas@gmail.com> wrote:
> 0005: Incremental backup itself. Changes:
> - Remove UPLOAD_MANIFEST replication command and instead add
> INCREMENTAL_WAL_RANGE replication command.

Unfortunately, I think this change is going to need to be reverted.
Jakub reported out a problem to me off-list, which I think boils down
to this: take a full backup on the primary. create a database on the
primary. now take an incremental backup on the standby using the full
backup from the master as the prior backup. What happens at this point
depends on how far replay has progressed on the standby. I think there
are three scenarios: (1) If replay has not yet reached a checkpoint
later than the one at which the full backup began, then taking the
incremental backup will fail. This is correct, because it makes no
sense to take an incremental backup that goes backwards in time, and
it's pointless to take one that goes forwards but not far enough to
reach the next checkpoint, as you won't save anything. (2) If replay
has progressed far enough that the redo pointer is now beyond the
CREATE DATABASE record, then everything is fine. (3) But if the redo
pointer for the backup is a later checkpoint than the one from which
the full backup started, but also before the CREATE DATABASE record,
then the new database's files exist on disk, but are not mentioned in
the WAL summary, which covers all LSNs from the start of the prior
backup to the start of this one. Here, the start of the backup is
basically the LSN from which replay will start, and since the database
was created after that, those changes aren't in the WAL summary. This
means that we think the file is unchanged since the prior backup, and
so backup no blocks at all. But now we have an incremental file for a
relation for which no full file is present in the prior backup, and
we're in big trouble.

If my analysis is correct, this bug should be new in v12. In v11 and
prior, I think that we always included every file that didn't appear
in the prior manifest in full. I didn't really quite know why I was
doing that, which is why I was willing to rip it out and thus remove
the need for the manifest, but now I think it was actually preventing
exactly this problem. This issue, in general, is files that get
created after the start of the backup. By that time, the WAL summary
that drives the backup has already been built, so it doesn't know
anything about the new files. That would be fine if we either (A)
omitted those new files from the backup completely, since replay would
recreate them or (B) backed them up in full, so that there was nothing
relying on them being there in the earlier backup. But an incremental
backup of such a file is no good.

Then I started worrying about whether there were problems in cases
where a file was dropped and recreated with the same name. I *think*
it's OK. If file F is dropped and recreated after being copied into
the full backup but before being copied into the incremental backup,
then there are basically two cases. First, F might be dropped before
the start LSN of the incremental backup; if so, we'll know from the
WAL summary that the limit block is 0 and back up the whole thing.
Second, F might be dropped after the start LSN of the incremental
backup and before it's actually coped. In that case, we'll not know
when backing up the file that it was dropped and recreated, so we'll
back it up incrementally as if that hadn't happened. That's OK as long
as reconstruction doesn't fail, because WAL replay will again drop and
recreate F. And I think reconstruction won't fail: blocks that are in
the incremental file will be taken from there, blocks in the prior
backup file will be taken from there, and blocks in neither place will
be zero-filled. The result is logically incoherent, but replay will
nuke the file anyway, so whatever.

It bugs me a bit that we don't obey the WAL-before-data rule with file
creation, e.g. RelationCreateStorage does smgrcreate() and then
log_smgrcreate(). So in theory we could see a file on disk for which
nothing has been logged yet; it could even happen that the file gets
created before the start LSN of the backup and the log record gets
written afterward. It seems like it would be far more comfortable to
swap the order there, so that if it's on disk, it's definitely in the
WAL. But I haven't yet been able to think of a scenario in which the
current ordering causes a real problem. If we backup a stray file in
full (or, hypothetically, if we skipped it entirely) then nothing will
happen that can't already happen today with full backup; any problems
we end up having are, I think, not new problems. It's only when we
back up a file incrementally that we need to be careful, and the
analsysis is basically the same as before ... whatever we put into an
incremental file will cause *something* to get reconstructed except
when there's no prior file at all. Having the manifest for the prior
backup lets us avoid the incremental-with-no-prior-file scenario. And
as long as *something* gets reconstructed, I think WAL replay will fix
up the rest.

Considering all this, what I'm inclined to do is go and put
UPLOAD_MANIFEST back, instead of INCREMENTAL_WAL_RANGE, and adjust
accordingly. But first: does anybody see more problems here that I may
have missed?

--
Robert Haas
EDB: http://www.enterprisedb.com



В списке pgsql-hackers по дате отправления:

Предыдущее
От: "Tristan Partin"
Дата:
Сообщение: Re: meson: Stop using deprecated way getting path of files
Следующее
От: Kirill Reshke
Дата:
Сообщение: Re: Extensible storage manager API - SMGR hook Redux