Re: pg_combinebackup does not detect missing files

From: Robert Haas
Subject: Re: pg_combinebackup does not detect missing files
Msg-id: CA+TgmoaVxr_o3mrDBrBcXm3gowr9Qc4ABW-c73NR_201KkDavw@mail.gmail.com
In response to: pg_combinebackup does not detect missing files  (David Steele <david@pgmasters.net>)
Responses: Re: pg_combinebackup does not detect missing files  (David Steele <david@pgmasters.net>)
List: pgsql-hackers

On Wed, Apr 10, 2024 at 9:36 PM David Steele <david@pgmasters.net> wrote:
> I've been playing around with the incremental backup feature trying to
> get a sense of how it can be practically used. One of the first things I
> always try is to delete random files and see what happens.
>
> You can delete pretty much anything you want from the most recent
> incremental backup (not the manifest) and it will not be detected.

Sure, but you can also delete anything you want from the most recent
non-incremental backup and it will also not be detected. There's no
reason at all to hold incremental backup to a higher standard than we
do in general.

> Maybe the answer here is to update the docs to specify that
> pg_verifybackup should be run on all backup directories before
> pg_combinebackup is run. Right now that is not at all clear.

I don't want to make those kinds of prescriptive statements. If you
want to verify the backups that you use as input to pg_combinebackup,
you can use pg_verifybackup to do that, but it's not a requirement.
I'm not averse to having some kind of statement in the documentation
along the lines of "Note that pg_combinebackup does not attempt to
verify that the individual backups are intact; for that, use
pg_verifybackup." But I think it should be blindingly obvious to
everyone that you can't go whacking around the inputs to a program and
expect to get perfectly good output. I know it isn't blindingly
obvious to everyone, which is why I'm not averse to adding something
like what I just mentioned, and maybe it wouldn't be a bad idea to
document in a few other places that you shouldn't randomly remove
files from the data directory of your live cluster, either, because
people seem to keep doing it, but really, the expectation that you
can't just blow files away and expect good things to happen afterward
should hardly need to be stated.
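
For readers who do want to verify the inputs first, a minimal Python sketch of that optional workflow might look like the following. It assumes pg_verifybackup and pg_combinebackup are on the PATH and that the backup directories are listed oldest first (full backup, then each incremental in order). The helper names plan_commands and verify_then_combine are hypothetical and are not part of PostgreSQL.

```python
import subprocess
from typing import List, Sequence


def plan_commands(backup_dirs: Sequence[str], output_dir: str) -> List[List[str]]:
    """Return the commands to run, in order.

    One pg_verifybackup per input directory, followed by a single
    pg_combinebackup over all inputs. backup_dirs is assumed to be
    ordered oldest first: the full backup, then each incremental.
    """
    if not backup_dirs:
        raise ValueError("at least one backup directory is required")
    commands = [["pg_verifybackup", d] for d in backup_dirs]
    commands.append(["pg_combinebackup", "-o", output_dir, *backup_dirs])
    return commands


def verify_then_combine(backup_dirs: Sequence[str], output_dir: str) -> None:
    """Run each planned command and stop at the first failure.

    check=True makes subprocess.run raise CalledProcessError when a
    command exits non-zero, so pg_combinebackup is never reached if
    any input fails verification.
    """
    for cmd in plan_commands(backup_dirs, output_dir):
        subprocess.run(cmd, check=True)
```

Calling verify_then_combine(["full", "incr1"], "restored") would verify both inputs and only then write the combined backup to "restored". Splitting the planning from the execution keeps the command list easy to inspect or log before anything runs.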

I think it's very easy to go overboard with warnings of this type.
Weird stuff comes to me all the time because people call me when the
weird stuff happens, and I'm guessing that your experience is similar.
But my actual personal experience, as opposed to the cases reported to
me by others, practically never features files evaporating into the
ether. If I read a documentation page for PostgreSQL or any other
piece of software that made it sound like that was a normal
occurrence, I'd question the technical acumen of the authors. And if I
saw such warnings only for one particular feature of a piece of
software and not anywhere else, I'd wonder why the authors of the
software were trying so hard to scare me off the use of that
particular feature. I don't trust at all that incremental backup is
free of bugs -- but I don't trust that all the code anyone else has
written is free of bugs, either.

> Overall I think it is an issue that the combine is being driven from the
> most recent incremental directory rather than from the manifest.

I don't. I considered that design and rejected it for various reasons
that I still believe to be good. Even if I was wrong, we're not going
to start rewriting the implementation a week after feature freeze.

--
Robert Haas
EDB: http://www.enterprisedb.com


