Обсуждение: [MASSMAIL]pg_combinebackup fails on file named INCREMENTAL.*

Поиск
Список
Период
Сортировка

[MASSMAIL]pg_combinebackup fails on file named INCREMENTAL.*

От
David Steele
Дата:
Hackers,

Since incremental backup is using INCREMENTAL as a keyword (rather than 
storing incremental info in the manifest) it is vulnerable to any file 
in PGDATA with the pattern INCREMENTAL.*.

For example:

$ pg_basebackup -c fast -D test/backup/full -F plain
$ touch test/data/INCREMENTAL.CONFIG
$ /home/dev/test/pg/bin/pg_basebackup -c fast -D test/backup/incr1 -F 
plain -i /home/dev/test/backup/full/backup_manifest

$ /home/dev/test/pg/bin/pg_combinebackup test/backup/full 
test/backup/incr1 -o test/backup/combine
pg_combinebackup: error: could not read file 
"test/backup/incr1/INCREMENTAL.CONFIG": read only 0 of 4 bytes
pg_combinebackup: removing output directory "test/backup/combine"

This works anywhere in PGDATA as far as I can see, e.g.

$ touch test/data/base/1/INCREMENTAL.1

Or just by dropping a new file into the incremental backup:

$ touch test/backup/incr1/INCREMENTAL.x
$ /home/dev/test/pg/bin/pg_combinebackup test/backup/full 
test/backup/incr1 -o test/backup/combine
pg_combinebackup: error: could not read file 
"test/backup/incr1/INCREMENTAL.x": read only 0 of 4 bytes
pg_combinebackup: removing output directory "test/backup/combine"

We could fix the issue by forbidding this file pattern in PGDATA, i.e. 
error when it is detected during pg_basebackup, but in my view it would 
be better (at least eventually) to add incremental info to the manifest. 
That would also allow us to skip storing zero-length files and 
incremental stubs (with no changes) as files.

Regards,
-David



Re: pg_combinebackup fails on file named INCREMENTAL.*

От
Robert Haas
Дата:
On Sun, Apr 14, 2024 at 11:53 PM David Steele <david@pgmasters.net> wrote:
> Since incremental backup is using INCREMENTAL as a keyword (rather than
> storing incremental info in the manifest) it is vulnerable to any file
> in PGDATA with the pattern INCREMENTAL.*.

Yeah, that's true. I'm not greatly concerned about this, because I
think anyone who creates a file called INCREMENTAL.CONFIG is just
being perverse. However, we could ignore those files just in subtrees
where they're not expected to occur i.e. only pay attention to them
under base, global, and pg_tblspc. I didn't do this because I thought
someone might eventually want to do something like incremental backup
of SLRU files, and then it would be annoying if there were a bunch of
random pathname restrictions. But if we're really concerned about
this, I think it would be a reasonable fix; you're not really supposed
to decide to store your configuration files in directories that exist
for the purpose of storing relation data files.

> We could fix the issue by forbidding this file pattern in PGDATA, i.e.
> error when it is detected during pg_basebackup, but in my view it would
> be better (at least eventually) to add incremental info to the manifest.
> That would also allow us to skip storing zero-length files and
> incremental stubs (with no changes) as files.

I did think about this, and I do like the idea of being able to elide
incremental stubs. If you have a tremendous number of relations and
very few of them have changed at all, the incremental stubs might take
up a significant percentage of the space consumed by the incremental
backup, which kind of sucks, even if the incremental backup is still
quite small compared to the original database. And, like you, I had
the idea of trying to use the backup_manifest to do it.

But ... I didn't really end up feeling very comfortable with it. Right
now, the backup manifest is something we only use to verify the
integrity of the backup. If we were to do this, it would become a
critical part of the backup. I don't think I like the idea that
removing the backup_manifest should be allowed to, in effect, corrupt
the backup. But I think I dislike even more the idea that the data
that is used to verify the backup gets mushed together with the backup
data itself. Maybe in practice it's fine, but it doesn't seem very
conceptually clean.

There are also some practical considerations that made me not want to
go in this direction: we'd need to make sure that the relevant
information from the backup manifest was available to the
reconstruction process at the right part of the code. When iterating
over a directory, it would need to consider all of the files actually
present in that directory plus any "hallucinated" incremental stubs
from the manifest. I didn't feel confident of my ability to implement
that in the available time without messing anything up.

I think we should consider other possible designs here. For example,
instead of including this in the manifest, we could ship one
INCREMENTAL.STUBS file per directory that contains a list of all of
the incremental stubs that should be created in that directory. My
guess is that something like this will turn out to be way simpler to
code.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_combinebackup fails on file named INCREMENTAL.*

От
David Steele
Дата:
On 4/16/24 06:33, Robert Haas wrote:
> On Sun, Apr 14, 2024 at 11:53 PM David Steele <david@pgmasters.net> wrote:
>> Since incremental backup is using INCREMENTAL as a keyword (rather than
>> storing incremental info in the manifest) it is vulnerable to any file
>> in PGDATA with the pattern INCREMENTAL.*.
> 
> Yeah, that's true. I'm not greatly concerned about this, because I
> think anyone who creates a file called INCREMENTAL.CONFIG is just
> being perverse. 

Well, it's INCREMENTAL.* and you might be surprised (though I doubt it) 
at what users will do. We've been caught out by wacky file names (and 
database names) a few times.

> However, we could ignore those files just in subtrees
> where they're not expected to occur i.e. only pay attention to them
> under base, global, and pg_tblspc. I didn't do this because I thought
> someone might eventually want to do something like incremental backup
> of SLRU files, and then it would be annoying if there were a bunch of
> random pathname restrictions. But if we're really concerned about
> this, I think it would be a reasonable fix; you're not really supposed
> to decide to store your configuration files in directories that exist
> for the purpose of storing relation data files.

I think it would be reasonable to restrict what can be put in base/ and 
global/ but users generally feel free to create whatever they want in 
the root of PGDATA, despite being strongly encouraged not to.

Anyway, I think it should be fixed or documented as a caveat since it 
causes a hard failure on restore.

>> We could fix the issue by forbidding this file pattern in PGDATA, i.e.
>> error when it is detected during pg_basebackup, but in my view it would
>> be better (at least eventually) to add incremental info to the manifest.
>> That would also allow us to skip storing zero-length files and
>> incremental stubs (with no changes) as files.
> 
> I did think about this, and I do like the idea of being able to elide
> incremental stubs. If you have a tremendous number of relations and
> very few of them have changed at all, the incremental stubs might take
> up a significant percentage of the space consumed by the incremental
> backup, which kind of sucks, even if the incremental backup is still
> quite small compared to the original database. And, like you, I had
> the idea of trying to use the backup_manifest to do it.
> 
> But ... I didn't really end up feeling very comfortable with it. Right
> now, the backup manifest is something we only use to verify the
> integrity of the backup. If we were to do this, it would become a
> critical part of the backup. 

For my 2c the manifest is absolutely a critical part of the backup. I'm 
having a hard time even seeing why we have the --no-manifest option for 
pg_combinebackup at all. If the manifest is missing who knows what else 
might be missing as well? If present, why wouldn't we use it?

I know Tomas added some optimizations that work best with --no-manifest 
but if we can eventually read compressed tars (which I expect to be the 
general case) then those optimizations are not very useful.

> I don't think I like the idea that
> removing the backup_manifest should be allowed to, in effect, corrupt
> the backup. But I think I dislike even more the idea that the data
> that is used to verify the backup gets mushed together with the backup
> data itself. Maybe in practice it's fine, but it doesn't seem very
> conceptually clean.

I don't think this is a problem. The manifest can do more than store 
verification info, IMO.

> There are also some practical considerations that made me not want to
> go in this direction: we'd need to make sure that the relevant
> information from the backup manifest was available to the
> reconstruction process at the right part of the code. When iterating
> over a directory, it would need to consider all of the files actually
> present in that directory plus any "hallucinated" incremental stubs
> from the manifest. I didn't feel confident of my ability to implement
> that in the available time without messing anything up.

I think it would be better to iterate over the manifest than files in a 
directory. In any case we still need to fix [1], which presents more or 
less the same problem.

> I think we should consider other possible designs here. For example,
> instead of including this in the manifest, we could ship one
> INCREMENTAL.STUBS file per directory that contains a list of all of
> the incremental stubs that should be created in that directory. My
> guess is that something like this will turn out to be way simpler to
> code.

So we'd store a mini manifest per directory rather than just put the 
info in the main manifest? Sounds like unnecessary complexity to me.

Regards,
-David

[1] 
https://www.postgresql.org/message-id/flat/9badd24d-5bd9-4c35-ba85-4c38a2feb73e%40pgmasters.net



Re: pg_combinebackup fails on file named INCREMENTAL.*

От
Stefan Fercot
Дата:
Hi,

> I think it would be reasonable to restrict what can be put in base/ and
> global/ but users generally feel free to create whatever they want in
> the root of PGDATA, despite being strongly encouraged not to.
>
> Anyway, I think it should be fixed or documented as a caveat since it
> causes a hard failure on restore.

+1. IMHO, no matter how you'd further decide to reference the incremental stubs, the earlier we can mention the failure
tothe user, the better. 
Tbh, I'm not very confortable seeing pg_combinebackup fail when the user would need to restore (whatever the reason). I
mean,failing to take the backup is less of a problem (compared to failing to restore) because then the user might have
thetime to investigate and fix the issue, not being in the hurry of a down production to restore... 

> > But ... I didn't really end up feeling very comfortable with it. Right
> > now, the backup manifest is something we only use to verify the
> > integrity of the backup. If we were to do this, it would become a
> > critical part of the backup.

Isn't it already the case? I mean, you need the manifest of the previous backup to take an incremental one, right?
And shouldn't we encourage to verify the backup sets before (at least) trying to combine them?
It's not because a file was only use for one specific purpose until now that we can't improve it later.
Splitting the meaningful information across multiple places would be more error-prone (for both devs and users) imo.

> > I don't think I like the idea that
> > removing the backup_manifest should be allowed to, in effect, corrupt
> > the backup. But I think I dislike even more the idea that the data
> > that is used to verify the backup gets mushed together with the backup
> > data itself. Maybe in practice it's fine, but it doesn't seem very
> > conceptually clean.

Removing pretty much any file in the backup set would probably corrupt it. I mean, why would people remove the backup
manifestwhen they already can't remove backup_label? 
A doc stating "dont touch anything" is IMHO easier than "this part is critical, not that one".

> I don't think this is a problem. The manifest can do more than store
> verification info, IMO.

+1. Adding more info to the backup manifest can be handy for other use-cases too (i.e. like avoiding to store empty
files,or storing the checksums state of the cluster). 

Kind Regards,
--
Stefan FERCOT
Data Egret (https://dataegret.com)



Re: pg_combinebackup fails on file named INCREMENTAL.*

От
Robert Haas
Дата:
On Tue, Apr 16, 2024 at 3:10 AM Stefan Fercot
<stefan.fercot@protonmail.com> wrote:
> > > But ... I didn't really end up feeling very comfortable with it. Right
> > > now, the backup manifest is something we only use to verify the
> > > integrity of the backup. If we were to do this, it would become a
> > > critical part of the backup.
>
> Isn't it already the case? I mean, you need the manifest of the previous backup to take an incremental one, right?
> And shouldn't we encourage to verify the backup sets before (at least) trying to combine them?
> It's not because a file was only use for one specific purpose until now that we can't improve it later.
> Splitting the meaningful information across multiple places would be more error-prone (for both devs and users) imo.

Well, right now, if you just take a full backup, and you throw away
the backup manifest because you don't care, you have a working full
backup. Furthermore, if you took any incremental backups based on that
full backup before discarding the manifest, you can still restore
them. Now, it is possible that nobody in the world cares about those
properties other than me; I have been known to (a) care about weird
stuff and (b) be a pedant. However, we've already shipped a couple of
releases where backup manifests were thoroughly non-critical: you
needed them to run pg_verifybackup, and for nothing else. I think it's
quite likely that there are users out there who are used to things
working in that way, and I'm not sure that those users will adjust
their expectations when a new feature comes out. I also feel that if I
were a user, I would think of something called a "manifest" as just a
table of contents for whatever the thing was. I still remember
downloading tar files from the Internet in the 1990s and there'd be a
file in the tarball sometimes called MANIFEST which was, you know, a
list of what was in the tarball. You didn't need that file for
anything functional; it was just so you could check if anything was
missing.

What I fear is that this will turn into another situation like we had
with pg_xlog, where people saw "log" in the name and just blew it
away. Matter of fact, I recently encountered one of my few recent
examples of someone doing that thing since the pg_wal renaming
happened. Some users don't take much convincing to remove anything
that looks inessential. And what I'm particularly worried about with
this feature is tar-format backups. If you have a directory format
backup and you do an "ls", you're going to see a whole bunch of files
in there of which backup_manifest will be one. How you treat that file
is just going to depend on what you know about its purpose. But if you
have a tar-format backup, possibly compressed, the backup_manifest
file stands out a lot more. You may have something like this:

backup_manifest root.tar.gz 16384.tar.gz

Well, at this point, it becomes much more likely that you're going to
think that there are special rules for the backup_manifest file.

The kicker for me is that I can't see any reason to do any of this
stuff. Including the information that we need to elide incremental
stubs in some other way, say with one stub-list per directory, will be
easier to implement and probably perform better. Like, I'm not saying
we can't find a way to jam this into the manifest. But I'm fairly sure
it's just making life difficult for ourselves.

I may ultimately lose this argument, as I did the one about whether
the backup_manifest should be JSON or some bespoke format. And that's
fine. I respect your opinion, and David's. But I also reserve the
right to feel differently, and I do. And I would also just gently
point out that my level of motivation to work on a particular feature
can depend quite a bit on whether I'm going to be forced to implement
it in a way that I disagree with.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_combinebackup fails on file named INCREMENTAL.*

От
Robert Haas
Дата:
On Mon, Apr 15, 2024 at 10:12 PM David Steele <david@pgmasters.net> wrote:
> Anyway, I think it should be fixed or documented as a caveat since it
> causes a hard failure on restore.

Alright, I'll look into this.

> I know Tomas added some optimizations that work best with --no-manifest
> but if we can eventually read compressed tars (which I expect to be the
> general case) then those optimizations are not very useful.

My belief is that those optimizations work fine with or without
manifests; you only start to lose the benefit in cases where you use
different checksum types for different backups that you then try to
combine. Which should hopefully be a rare case.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_combinebackup fails on file named INCREMENTAL.*

От
Stefan Fercot
Дата:
On Tuesday, April 16th, 2024 at 3:22 PM, Robert Haas wrote:
> What I fear is that this will turn into another situation like we had
> with pg_xlog, where people saw "log" in the name and just blew it
> away. Matter of fact, I recently encountered one of my few recent
> examples of someone doing that thing since the pg_wal renaming
> happened. Some users don't take much convincing to remove anything
> that looks inessential. And what I'm particularly worried about with
> this feature is tar-format backups. If you have a directory format
> backup and you do an "ls", you're going to see a whole bunch of files
> in there of which backup_manifest will be one. How you treat that file
> is just going to depend on what you know about its purpose. But if you
> have a tar-format backup, possibly compressed, the backup_manifest
> file stands out a lot more. You may have something like this:
>
> backup_manifest root.tar.gz 16384.tar.gz

Sure, I can see your point here and how people could be tempted to through away that backup_manifest if they don't know
howimportant it is to keep it. 
Probably in this case we'd need the list to be inside the tar, just like backup_label and tablespace_map then.

> The kicker for me is that I can't see any reason to do any of this
> stuff. Including the information that we need to elide incremental
> stubs in some other way, say with one stub-list per directory, will be
> easier to implement and probably perform better. Like, I'm not saying
> we can't find a way to jam this into the manifest. But I'm fairly sure
> it's just making life difficult for ourselves.
>
> I may ultimately lose this argument, as I did the one about whether
> the backup_manifest should be JSON or some bespoke format. And that's
> fine. I respect your opinion, and David's. But I also reserve the
> right to feel differently, and I do.

Do you mean 1 stub-list per pgdata + 1 per tablespaces?

Sure, it is important to respect and value each other feelings, I never said otherwise.

I don't really see how it would be faster to recursively go through each sub-directories of the pgdata and tablespaces
togather all the pieces together compared to reading 1 main file. 
But I guess, choosing one option or the other, we will only find out how well it works once people will use it on the
fieldand possibly give some feedback. 

As you mentioned in [1], we're not going to start rewriting the implementation a week after feature freeze nor probably
alreadystart building big new things now anyway. 
So maybe let's start with documenting the possible gotchas/corner cases to make our support life easier in the future.

Kind Regards,
--
Stefan FERCOT
Data Egret (https://dataegret.com)

[1] https://www.postgresql.org/message-id/CA%2BTgmoaVxr_o3mrDBrBcXm3gowr9Qc4ABW-c73NR_201KkDavw%40mail.gmail.com



Re: pg_combinebackup fails on file named INCREMENTAL.*

От
Robert Haas
Дата:
On Tue, Apr 16, 2024 at 12:06 PM Stefan Fercot
<stefan.fercot@protonmail.com> wrote:
> Sure, I can see your point here and how people could be tempted to through away that backup_manifest if they don't
knowhow important it is to keep it. 
> Probably in this case we'd need the list to be inside the tar, just like backup_label and tablespace_map then.

Yeah, I think anywhere inside the tar is better than anywhere outside
the tar, by a mile. I'm happy to leave the specific question of where
inside the tar as something TBD at time of implementation by fiat of
the person doing the work.

But that said ...

> Do you mean 1 stub-list per pgdata + 1 per tablespaces?
>
> I don't really see how it would be faster to recursively go through each sub-directories of the pgdata and
tablespacesto gather all the pieces together compared to reading 1 main file. 
> But I guess, choosing one option or the other, we will only find out how well it works once people will use it on the
fieldand possibly give some feedback. 

The reason why I was suggesting one stub-list per directory is that we
recurse over the directory tree. We reach each directory in turn,
process it, and then move on to the next one. What I imagine that we
want to do is - first iterate over all of the files actually present
in a directory. Then, iterate over the list of stubs for that
directory and do whatever we would have done if there had been a stub
file present for each of those. So, we don't really want a list of
every stub in the whole backup, or even every stub in the whole
tablespace. What we want is to be able to easily get a list of stubs
for a single directory. Which is very easily done if each directory
contains its own stub-list file.

If we instead have a centralized stub-list for the whole tablespace,
or the whole backup, it's still quite possible to make it work. We
just read that centralized stub list and we build an in-memory data
structure that is indexed by containing directory, like a hash table
where the key is the directory name and the value is a list of
filenames within that directory. But, a slight disadvantage of this
model is that you have to keep that whole data structure in memory for
the whole time you're reconstructing, and you have to pass around a
pointer to it everywhere so that the code that handles individual
directories can access it. I'm sure this isn't the end of the world.
It's probably unlikely that someone has so many stub files that the
memory used for such a data structure is painfully high, and even if
they did, it's unlikely that they are spread out across multiple
databases and/or tablespaces in such a way that only needing the data
for one directory at a time would save you. But, it's not impossible
that such a scenario could exist.

Somebody might say - well, don't go directory by directory. Just
handle all of the stubs at the end. But I don't think that really
fixes anything. I want to be able to verify that none of the stubs
listed in the stub-list are also present in the backup as real files,
for sanity checking purposes. It's quite easy to see how to do that in
the design I proposed above: keep a list of the files for each
directory as you read it, and then when you read the stub-list for
that directory, check those lists against each other for duplicates.
Doing this on the level of a whole tablespace or the whole backup is
clearly also possible, but once again it potentially uses more memory,
and there's no functional gain.

Plus, this kind of approach would also make the reconstruction process
"jump around" more. It might pull a bunch of mostly-unchanged files
from the full backup while handling the non-stub files, and then come
back to that directory a second time, much later, when it's processing
the stub-list. Perhaps that would lead to a less-optimal I/O pattern,
or perhaps it would make it harder for the user to understand how much
progress reconstruction had made. Or perhaps it would make no
difference at all; I don't know. Maybe there's even some advantage in
a two-pass approach like this. I don't see one. But it might prove
otherwise on closer examination.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_combinebackup fails on file named INCREMENTAL.*

От
Robert Haas
Дата:
On Tue, Apr 16, 2024 at 9:25 AM Robert Haas <robertmhaas@gmail.com> wrote:
> On Mon, Apr 15, 2024 at 10:12 PM David Steele <david@pgmasters.net> wrote:
> > Anyway, I think it should be fixed or documented as a caveat since it
> > causes a hard failure on restore.
>
> Alright, I'll look into this.

Here's a patch.

--
Robert Haas
EDB: http://www.enterprisedb.com

Вложения

Re: pg_combinebackup fails on file named INCREMENTAL.*

От
David Steele
Дата:
On 4/18/24 00:14, Robert Haas wrote:
> On Tue, Apr 16, 2024 at 9:25 AM Robert Haas <robertmhaas@gmail.com> wrote:
>> On Mon, Apr 15, 2024 at 10:12 PM David Steele <david@pgmasters.net> wrote:
>>> Anyway, I think it should be fixed or documented as a caveat since it
>>> causes a hard failure on restore.
>>
>> Alright, I'll look into this.
> 
> Here's a patch.

Thanks! I've tested this and it works as advertised.

Ideally I'd want an error on backup if there is a similar file in any 
data directories that would cause an error on combine, but I admit that 
it is vanishingly rare for users to put files anywhere but the root of 
PGDATA, so this approach seems OK to me.

Regards,
-David



Re: pg_combinebackup fails on file named INCREMENTAL.*

От
Robert Haas
Дата:
On Wed, Apr 17, 2024 at 7:56 PM David Steele <david@pgmasters.net> wrote:
> Thanks! I've tested this and it works as advertised.
>
> Ideally I'd want an error on backup if there is a similar file in any
> data directories that would cause an error on combine, but I admit that
> it is vanishingly rare for users to put files anywhere but the root of
> PGDATA, so this approach seems OK to me.

OK, committed. Thanks for the review.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: pg_combinebackup fails on file named INCREMENTAL.*

От
David Steele
Дата:
On 4/19/24 01:10, Robert Haas wrote:
> On Wed, Apr 17, 2024 at 7:56 PM David Steele <david@pgmasters.net> wrote:
>> Thanks! I've tested this and it works as advertised.
>>
>> Ideally I'd want an error on backup if there is a similar file in any
>> data directories that would cause an error on combine, but I admit that
>> it is vanishingly rare for users to put files anywhere but the root of
>> PGDATA, so this approach seems OK to me.
> 
> OK, committed. Thanks for the review.

Excellent, thank you!

Regards,
-David