Re: trying again to get incremental backup

Поиск
Список
Период
Сортировка
От Jakub Wartak
Тема Re: trying again to get incremental backup
Дата
Msg-id CAKZiRmygUFo5Fa4y9cUwuhZNEJpP6Ng8AwbQDkYGuRAmK23zvg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: trying again to get incremental backup  (Robert Haas <robertmhaas@gmail.com>)
Ответы Re: trying again to get incremental backup  (Robert Haas <robertmhaas@gmail.com>)
Список pgsql-hackers
Hi Robert,

On Mon, Dec 11, 2023 at 6:08 PM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Fri, Dec 8, 2023 at 5:02 AM Jakub Wartak
> <jakub.wartak@enterprisedb.com> wrote:
> > While we are at it, maybe around the below in PrepareForIncrementalBackup()
> >
> >                 if (tlep[i] == NULL)
> >                         ereport(ERROR,
> >
> > (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> >                                          errmsg("timeline %u found in
> > manifest, but not in this server's history",
> >                                                         range->tli)));
> >
> > we could add
> >
> >     errhint("You might need to start a new full backup instead of
> > incremental one")
> >
> > ?
>
> I can't exactly say that such a hint would be inaccurate, but I think
> the impulse to add it here is misguided. One of my design goals for
> this system is to make it so that you never have to take a new
> incremental backup "just because,"

Did you mean take a new full backup here?

> not even in case of an intervening
> timeline switch. So, all of the errors in this function are warning
> you that you've done something that you really should not have done.
> In this particular case, you've either (1) manually removed the
> timeline history file, and not just any timeline history file but the
> one for a timeline for a backup that you still intend to use as the
> basis for taking an incremental backup or (2) tried to use a full
> backup taken from one server as the basis for an incremental backup on
> a completely different server that happens to share the same system
> identifier, e.g. because you promoted two standbys derived from the
> same original primary and then tried to use a full backup taken on one
> as the basis for an incremental backup taken on the other.
>

Okay, but please consider two other possibilities:

(3) I had a corrupted DB where I've fixed it by running pg_resetwal
and some cronjob just a day later attempted to take incremental and
failed with that error.

(4) I had pg_upgraded (which calls pg_resetwal on fresh initdb
directory) the DB where I had cronjob that just failed with this error

I bet that (4) is going to happen more often than (1), (2) , which
might trigger users to complain on forums, support tickets.

> > > I have a fix for this locally, but I'm going to hold off on publishing
> > > a new version until either there's a few more things I can address all
> > > at once, or until Thomas commits the ubsan fix.
> > >
> >
> > Great, I cannot get it to fail again today, it had to be some dirty
> > state of the testing env. BTW: Thomas has pushed that ubsan fix.
>
> Huzzah, the cfbot likes the patch set now. Here's a new version with
> the promised fix for your non-reproducible issue. Let's see whether
> you and cfbot still like this version.

LGTM, all quick tests work from my end too. BTW: I have also scheduled
the long/large pgbench -s 14000 (~200GB?) - multiple day incremental
test. I'll let you know how it went.

-J.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Corey Huinker
Дата:
Сообщение: Re: Statistics Import and Export
Следующее
От: Ashutosh Bapat
Дата:
Сообщение: Re: Report planning memory in EXPLAIN ANALYZE