Thread: [MASSMAIL]Add notes to pg_combinebackup docs


[MASSMAIL]Add notes to pg_combinebackup docs

From: Martín Marqués
Date:
Hello,

While doing some work/research on the new incremental backup feature, I
noticed some limitations that are not listed in the docs. Mainly the
fact that pg_combinebackup works with plain format and not tar.
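For context, the plain-format workflow that pg_combinebackup expects looks roughly like this (paths and the checkpoint option are illustrative, not from the thread, and a running instance is assumed):

```shell
# Full backup in plain (directory) format -- pg_combinebackup
# does not read tar-format backups
pg_basebackup -D /backups/full --checkpoint=fast

# Incremental backup, based on the previous backup's manifest
pg_basebackup -D /backups/incr1 \
    --incremental=/backups/full/backup_manifest

# Reconstruct a synthetic full backup from the chain
pg_combinebackup /backups/full /backups/incr1 -o /restore/pgdata
```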

Around the same time, Tomas Vondra tested incremental backups with a
cluster where he enabled checksums after taking the previous full
backup. After combining the backups the synthetic backup had pages
with checksums and other pages without checksums which ended in
checksum errors.

I've attached two patches. The first one is just nit-picking things I
found when I first read the docs. The second has a note on the two
limitations listed above. For the limitation on incremental backups of a
cluster that had checksums enabled after the previous backup, I was not
sure whether it should go in the pg_basebackup or the pg_combinebackup
reference documentation. Or maybe somewhere else.

Kind regards, Martín

--
Martín Marqués
It’s not that I have something to hide,
it’s that I have nothing I want you to see

Attachments

Re: Add notes to pg_combinebackup docs

From: Tomas Vondra
Date:

On 4/9/24 09:59, Martín Marqués wrote:
> Hello,
> 
> While doing some work/research on the new incremental backup feature
> some limitations were not listed in the docs. Mainly the fact that
> pg_combinebackup works with plain format and not tar.
> 

Right. The docs mostly imply this by talking about output directory and
backup directories, but making it more explicit would not hurt.

FWIW it'd be great if we could make incremental backups work with tar
format in the future too. People probably don't want to keep the
expanded data directory around, and having to extract everything before
combining the backups is not very convenient. Reading and writing the
tar would make this simpler.

> Around the same time, Tomas Vondra tested incremental backups with a
> cluster where he enabled checksums after taking the previous full
> backup. After combining the backups the synthetic backup had pages
> with checksums and other pages without checksums which ended in
> checksum errors.
> 

I'm not sure just documenting this limitation is sufficient. We can't
make the incremental backups work in this case (it's as if someone
messes with the cluster without writing stuff into WAL), but I think we
should do better than silently producing (seemingly) corrupted backups.

I say seemingly, because the backup is actually fine, the only problem
is it has checksums enabled in the controlfile, but the pages from the
full backup (and the early incremental backups) have no checksums.

What we could do is detect this in pg_combinebackup, and either disable
checksums with a warning and a hint that they can be enabled again, or
just print that the user needs to disable them.

I was thinking maybe we could detect this while taking the backups, and
force taking a full backup if checksums got enabled since the last
backup. But we can't do that because we only have the manifest from the
last backup, and the manifest does not include info about checksums.

It's a bit unfortunate we don't have a way to enable checksums online.
That'd not have this problem IIRC, because it writes proper WAL. Maybe
it's time to revive that idea ... I recall there were some concerns
about tracking progress to allow resuming, but not having the feature at
all just because in some (rare?) cases it'd need to do more work does
not seem like a great trade-off.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Add notes to pg_combinebackup docs

From: David Steele
Date:
On 4/9/24 19:44, Tomas Vondra wrote:
> 
> On 4/9/24 09:59, Martín Marqués wrote:
>> Hello,
>>
>> While doing some work/research on the new incremental backup feature
>> some limitations were not listed in the docs. Mainly the fact that
>> pg_combinebackup works with plain format and not tar.
>>
> 
> Right. The docs mostly imply this by talking about output directory and
> backup directories, but making it more explicit would not hurt.
> 
> FWIW it'd be great if we could make incremental backups work with tar
> format in the future too. People probably don't want to keep the
> expanded data directory around, and having to extract everything before
> combining the backups is not very convenient. Reading and writing the
> tar would make this simpler.

I have a hard time seeing this feature as being very useful, especially 
for large databases, until pg_combinebackup works on tar (and compressed 
tar). Right now restoring an incremental requires at least twice the 
space of the original cluster, which is going to take a lot of users by 
surprise.

I know you have made some improvements here for COW filesystems, but my 
experience is that Postgres is generally not run on such filesystems, 
though that is changing a bit.

>> Around the same time, Tomas Vondra tested incremental backups with a
>> cluster where he enabled checksums after taking the previous full
>> backup. After combining the backups the synthetic backup had pages
>> with checksums and other pages without checksums which ended in
>> checksum errors.
> 
> I'm not sure just documenting this limitation is sufficient. We can't
> make the incremental backups work in this case (it's as if someone
> messes with cluster without writing stuff into WAL), but I think we
> should do better than silently producing (seemingly) corrupted backups.
> 
> I say seemingly, because the backup is actually fine, the only problem
> is it has checksums enabled in the controlfile, but the pages from the
> full backup (and the early incremental backups) have no checksums.
> 
> What we could do is detect this in pg_combinebackup, and either just
> disable checksums with a warning and hint to maybe enable them again. Or
> maybe just print that the user needs to disable them.
> 
> I was thinking maybe we could detect this while taking the backups, and
> force taking a full backup if checksums got enabled since the last
> backup. But we can't do that because we only have the manifest from the
> last backup, and the manifest does not include info about checksums.

I'd say making a new full backup is the right thing to do in this case. 
It should be easy enough to store the checksum state of the cluster in 
the manifest.

Regards,
-David



Re: Add notes to pg_combinebackup docs

From: Tomas Vondra
Date:
On 4/11/24 02:01, David Steele wrote:
> On 4/9/24 19:44, Tomas Vondra wrote:
>>
>> On 4/9/24 09:59, Martín Marqués wrote:
>>> Hello,
>>>
>>> While doing some work/research on the new incremental backup feature
>>> some limitations were not listed in the docs. Mainly the fact that
>>> pg_combinebackup works with plain format and not tar.
>>>
>>
>> Right. The docs mostly imply this by talking about output directory and
>> backup directories, but making it more explicit would not hurt.
>>
>> FWIW it'd be great if we could make incremental backups work with tar
>> format in the future too. People probably don't want to keep the
>> expanded data directory around, and having to extract everything before
>> combining the backups is not very convenient. Reading and writing the
>> tar would make this simpler.
> 
> I have a hard time seeing this feature as being very useful, especially
> for large databases, until pg_combinebackup works on tar (and compressed
> tar). Right now restoring an incremental requires at least twice the
> space of the original cluster, which is going to take a lot of users by
> surprise.
> 

I do agree it'd be nice if pg_combinebackup worked with .tar directly,
without having to extract the directories first. No argument there, but
as I said in the other thread, I believe that's something we can add
later. That's simply how incremental development works.

I can certainly imagine other ways to do pg_combinebackup, e.g. by
"merging" the increments into the data directory, instead of creating a
copy. But again, I don't think that has to be in v1.

> I know you have made some improvements here for COW filesystems, but my
> experience is that Postgres is generally not run on such filesystems,
> though that is changing a bit.
> 

I'd say XFS is a pretty common choice, for example. And it's one of the
filesystems that work great with pg_combinebackup.

However, who says this has to be the filesystem the Postgres instance
runs on? Who in their right mind puts backups on the same volume as the
instance anyway? At which point it can be a different filesystem, even
if it's not ideal for running the database.

FWIW I think it's fine to tell users that to minimize the disk space
requirements, they should use a CoW filesystem and --copy-file-range.
The docs don't say that currently, that's true.
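As a sketch of what such a doc note might recommend (the filesystem choice and paths are assumptions for illustration):

```shell
# On a CoW filesystem (e.g. btrfs, or XFS with reflink enabled), ask
# pg_combinebackup to clone blocks instead of physically copying them,
# which keeps the extra disk space needed for the restore small
pg_combinebackup --copy-file-range \
    /backups/full /backups/incr1 /backups/incr2 \
    -o /backups/restored
```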

All of this also depends on how people do the restore. With the CoW
stuff they can do a quick (and small) copy on the backup server, and
then copy the result to the actual instance. Or they can do restore on
the target directly (e.g. by mounting a r/o volume with backups), in
which case the CoW won't really help.

But yeah, having to keep the backups as expanded directories is not
great, I'd love to have .tar. Not necessarily because of the disk space
(in my experience the compression in filesystems works quite well for
this purpose), but mostly because it's more compact and allows working
with backups as a single piece of data (e.g. it's much clearer what the
checksum of a single .tar is, compared to a directory).

>>> Around the same time, Tomas Vondra tested incremental backups with a
>>> cluster where he enabled checksums after taking the previous full
>>> backup. After combining the backups the synthetic backup had pages
>>> with checksums and other pages without checksums which ended in
>>> checksum errors.
>>
>> I'm not sure just documenting this limitation is sufficient. We can't
>> make the incremental backups work in this case (it's as if someone
>> messes with cluster without writing stuff into WAL), but I think we
>> should do better than silently producing (seemingly) corrupted backups.
>>
>> I say seemingly, because the backup is actually fine, the only problem
>> is it has checksums enabled in the controlfile, but the pages from the
>> full backup (and the early incremental backups) have no checksums.
>>
>> What we could do is detect this in pg_combinebackup, and either just
>> disable checksums with a warning and hint to maybe enable them again. Or
>> maybe just print that the user needs to disable them.
>>
>> I was thinking maybe we could detect this while taking the backups, and
>> force taking a full backup if checksums got enabled since the last
>> backup. But we can't do that because we only have the manifest from the
>> last backup, and the manifest does not include info about checksums.
> 
> I'd say making a new full backup is the right thing to do in this case.
> It should be easy enough to store the checksum state of the cluster in
> the manifest.
> 

Agreed.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Add notes to pg_combinebackup docs

From: David Steele
Date:

On 4/11/24 20:51, Tomas Vondra wrote:
> On 4/11/24 02:01, David Steele wrote:
>>
>> I have a hard time seeing this feature as being very useful, especially
>> for large databases, until pg_combinebackup works on tar (and compressed
>> tar). Right now restoring an incremental requires at least twice the
>> space of the original cluster, which is going to take a lot of users by
>> surprise.
> 
> I do agree it'd be nice if pg_combinebackup worked with .tar directly,
> without having to extract the directories first. No argument there, but
> as I said in the other thread, I believe that's something we can add
> later. That's simply how incremental development works.

OK, sure, but if the plan is to make it practical later doesn't that 
make the feature something to be avoided now?

>> I know you have made some improvements here for COW filesystems, but my
>> experience is that Postgres is generally not run on such filesystems,
>> though that is changing a bit.
> 
> I'd say XFS is a pretty common choice, for example. And it's one of the
> filesystems that work great with pg_combinebackup.

XFS has certainly advanced more than I was aware.

> However, who says this has to be the filesystem the Postgres instance
> runs on? Who in their right mind put backups on the same volume as the
> instance anyway? At which point it can be a different filesystem, even
> if it's not ideal for running the database.

My experience is these days backups are generally placed in object 
stores. Sure, people are still using NFS but admins rarely have much 
control over those volumes. They may or may not be COW filesystems.

> FWIW I think it's fine to tell users that to minimize the disk space
> requirements, they should use a CoW filesystem and --copy-file-range.
> The docs don't say that currently, that's true.

That would probably be a good addition to the docs.

> All of this also depends on how people do the restore. With the CoW
> stuff they can do a quick (and small) copy on the backup server, and
> then copy the result to the actual instance. Or they can do restore on
> the target directly (e.g. by mounting a r/o volume with backups), in
> which case the CoW won't really help.

And again, this all requires a significant amount of setup and tooling. 
Obviously I believe good backup requires effort but doing this right 
gets very complicated due to the limitations of the tool.

> But yeah, having to keep the backups as expanded directories is not
> great, I'd love to have .tar. Not necessarily because of the disk space
> (in my experience the compression in filesystems works quite well for
> this purpose), but mostly because it's more compact and allows working
> with backups as a single piece of data (e.g. it's much clearer what the
> checksum of a single .tar is, compared to a directory).

But again, object stores are commonly used for backup these days and 
billing is based on data stored rather than any compression that can be 
done on the data. Of course, you'd want to store the compressed tars in 
the object store, but that does mean storing an expanded copy somewhere 
to do pg_combinebackup.

But if the argument is that all this can/will be fixed in the future, I 
guess the smart thing for users to do is wait a few releases for 
incremental backups to become a practical feature.

Regards,
-David



Re: Add notes to pg_combinebackup docs

From: Magnus Hagander
Date:
On Fri, Apr 12, 2024 at 12:14 AM David Steele <david@pgmasters.net> wrote:


On 4/11/24 20:51, Tomas Vondra wrote:
> On 4/11/24 02:01, David Steele wrote:
>>
>> I have a hard time seeing this feature as being very useful, especially
>> for large databases, until pg_combinebackup works on tar (and compressed
>> tar). Right now restoring an incremental requires at least twice the
>> space of the original cluster, which is going to take a lot of users by
>> surprise.
>
> I do agree it'd be nice if pg_combinebackup worked with .tar directly,
> without having to extract the directories first. No argument there, but
> as I said in the other thread, I believe that's something we can add
> later. That's simply how incremental development works.

OK, sure, but if the plan is to make it practical later doesn't that
make the feature something to be avoided now?

That could be said for any feature. When we shipped streaming replication, the plan was to support synchronous in the future. Should we not have shipped it, or told people to avoid it? 

Sure, the current state limits its use in some cases. But it still leaves a bunch of other cases where it works just fine.


 
>> I know you have made some improvements here for COW filesystems, but my
>> experience is that Postgres is generally not run on such filesystems,
>> though that is changing a bit.
>
> I'd say XFS is a pretty common choice, for example. And it's one of the
> filesystems that work great with pg_combinebackup.

XFS has certainly advanced more than I was aware.

And it happens to be the default on at least one of our most common platforms.


> However, who says this has to be the filesystem the Postgres instance
> runs on? Who in their right mind put backups on the same volume as the
> instance anyway? At which point it can be a different filesystem, even
> if it's not ideal for running the database.

My experience is these days backups are generally placed in object
stores. Sure, people are still using NFS but admins rarely have much
control over those volumes. They may or may not be COW filesystems.

If it's mounted through NFS I assume pg_combinebackup won't actually be able to use the COW features? Or does that actually work through NFS?

Mounted LUNs on a SAN I find more common today though, and there it would do a fine job.
 

> FWIW I think it's fine to tell users that to minimize the disk space
> requirements, they should use a CoW filesystem and --copy-file-range.
> The docs don't say that currently, that's true.

That would probably be a good addition to the docs.

+1, that would be a good improvement.


> All of this also depends on how people do the restore. With the CoW
> stuff they can do a quick (and small) copy on the backup server, and
> then copy the result to the actual instance. Or they can do restore on
> the target directly (e.g. by mounting a r/o volume with backups), in
> which case the CoW won't really help.

And again, this all requires a significant amount of setup and tooling.
Obviously I believe good backup requires effort but doing this right
gets very complicated due to the limitations of the tool.

It clearly needs to be documented that there are space needs. But temporarily getting space for something like that is not very complicated in most environments. But you do have to be aware of it.

Generally speaking it's already the case that the "restore experience" with pg_basebackup is far from great. We don't have a "pg_baserestore". You still have to deal with archive_command and restore_command, which we all know can be easy to get wrong. I don't see how this is fundamentally worse than that.

Personally, I tend to recommend that "if you want PITR and thus need to mess with archive_command etc, you should use a backup tool like pgBackRest. If you're fine with just daily backups or whatnot, use pg_basebackup". The incremental backup story fits somewhere in between, but I'd still say this is (today) primarily a tool directed at those that don't need full PITR.


> But yeah, having to keep the backups as expanded directories is not
> great, I'd love to have .tar. Not necessarily because of the disk space
> (in my experience the compression in filesystems works quite well for
> this purpose), but mostly because it's more compact and allows working
> with backups as a single piece of data (e.g. it's much clearer what the
> checksum of a single .tar is, compared to a directory).

But again, object stores are commonly used for backup these days and
billing is based on data stored rather than any compression that can be
done on the data. Of course, you'd want to store the compressed tars in
the object store, but that does mean storing an expanded copy somewhere
to do pg_combinebackup.

Object stores are definitely getting more common. I wish they were getting a lot more common than they actually are, because they simplify a lot.  But they're in my experience still very far from being a majority.


But if the argument is that all this can/will be fixed in the future, I
guess the smart thing for users to do is wait a few releases for
incremental backups to become a practical feature.

There's always going to be another set of goalposts further ahead. I think it can still be practical for quite a few people.

I'm more worried about the issue you raised in the other thread about missing files, for example...
 
--

Re: Add notes to pg_combinebackup docs

From: Magnus Hagander
Date:
On Tue, Apr 9, 2024 at 11:46 AM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:


On 4/9/24 09:59, Martín Marqués wrote:
> Hello,
>
> While doing some work/research on the new incremental backup feature
> some limitations were not listed in the docs. Mainly the fact that
> pg_combinebackup works with plain format and not tar.
>

Right. The docs mostly imply this by talking about output directory and
backup directories, but making it more explicit would not hurt.

FWIW it'd be great if we could make incremental backups work with tar
format in the future too. People probably don't want to keep the
expanded data directory around, and having to extract everything before
combining the backups is not very convenient. Reading and writing the
tar would make this simpler.

> Around the same time, Tomas Vondra tested incremental backups with a
> cluster where he enabled checksums after taking the previous full
> backup. After combining the backups the synthetic backup had pages
> with checksums and other pages without checksums which ended in
> checksum errors.
>

I'm not sure just documenting this limitation is sufficient. We can't
make the incremental backups work in this case (it's as if someone
messes with cluster without writing stuff into WAL), but I think we
should do better than silently producing (seemingly) corrupted backups

+1. I think that should be an open item that needs to get sorted.


I say seemingly, because the backup is actually fine, the only problem
is it has checksums enabled in the controlfile, but the pages from the
full backup (and the early incremental backups) have no checksums.

What we could do is detect this in pg_combinebackup, and either just
disable checksums with a warning and hint to maybe enable them again. Or
maybe just print that the user needs to disable them.

I don't think either of these should be done automatically. Something like pg_combinebackup simply failing and requiring you to say "--disable-checksums" to have it do that would be the way to go, IMO.  (once we can reliably detect it of course)


I was thinking maybe we could detect this while taking the backups, and
force taking a full backup if checksums got enabled since the last
backup. But we can't do that because we only have the manifest from the
last backup, and the manifest does not include info about checksums.

Can we forcibly read and parse it out of pg_control?
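For what it's worth, the checksum state is already reported by pg_controldata, so a manual check along these lines is possible today (the backup path is hypothetical):

```shell
# Inspect the control file of a backup directory; a
# "Data page checksum version" of 0 means checksums are disabled
pg_controldata /backups/full | grep 'Data page checksum version'
```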


It's a bit unfortunate we don't have a way to enable checksums online.
That'd not have this problem IIRC, because it writes proper WAL. Maybe
it's time to revive that idea ... I recall there were some concerns
about tracking progress to allow resuming stuff, but maybe not having
anything because in some (rare?) cases it'd need to do more work does
not seem like a great trade off.


For that one I still think it would be perfectly acceptable to have no resume at all, but that's a whole different topic :)

--

Re: Add notes to pg_combinebackup docs

From: David Steele
Date:
On 4/12/24 19:09, Magnus Hagander wrote:
> On Fri, Apr 12, 2024 at 12:14 AM David Steele <david@pgmasters.net 
> 
>     OK, sure, but if the plan is to make it practical later doesn't that
>     make the feature something to be avoided now?
> 
> 
> That could be said for any feature. When we shipped streaming 
> replication, the plan was to support synchronous in the future. Should 
> we not have shipped it, or told people to avoid it?

This doesn't seem like a great example. Synchronous rep is by far the 
more used mode in my experience. I actively dissuade people from using 
sync rep because of the downsides. More people think they need it than 
actually need it.

>      > However, who says this has to be the filesystem the Postgres instance
>      > runs on? Who in their right mind put backups on the same volume
>     as the
>      > instance anyway? At which point it can be a different filesystem,
>     even
>      > if it's not ideal for running the database.
> 
>     My experience is these days backups are generally placed in object
>     stores. Sure, people are still using NFS but admins rarely have much
>     control over those volumes. They may or may not be COW filesystems.
> 
> 
> If it's mounted through NFS I assume pg_combinebackup won't actually be 
> able to use the COW features? Or does that actually work through NFS?

Pretty sure it won't work via NFS, but I was wrong about XFS, so...

> Mounted LUNs on a SAN I find more common today though, and there it 
> would do a fine job.

Huh, interesting. This is a case I almost never see anymore.

>      > All of this also depends on how people do the restore. With the CoW
>      > stuff they can do a quick (and small) copy on the backup server, and
>      > then copy the result to the actual instance. Or they can do
>     restore on
>      > the target directly (e.g. by mounting a r/o volume with backups), in
>      > which case the CoW won't really help.
> 
>     And again, this all requires a significant amount of setup and tooling.
>     Obviously I believe good backup requires effort but doing this right
>     gets very complicated due to the limitations of the tool.
> 
> It clearly needs to be documented that there are space needs. But 
> temporarily getting space for something like that is not very 
> complicated in most environments. But you do have to be aware of it.

We find many environments ridiculously tight on space. There is a 
constant fight with customers/users to even get the data/WAL volumes 
sized correctly.

For small databases it is probably not an issue, but this feature really 
shines with very large databases.

> Generally speaking it's already the case that the "restore experience" 
> with pg_basebackup is far from great. We don't have a "pg_baserestore". 
> You still have to deal with archive_command and restore_command, which 
> we all know can be easy to get wrong. I don't see how this is 
> fundamentally worse than that.

I pretty much agree with this statement. pg_basebackup is already hard 
to use effectively. Now it is just optionally harder.

> Personally, I tend to recommend that "if you want PITR and thus need to 
> mess with archive_command etc, you should use a backup tool like 
> pgBackRest. If you're fine with just daily backups or whatnot, use 
> pg_basebackup". The incremental backup story fits somewhere in between, 
> but I'd still say this is (today) primarily a tool directed at those 
> that don't need full PITR.

Yeah, there are certainly cases where PITR is not required, but they 
still seem to be in the minority. PITR cannot be disabled for the most 
recent backup in pgBackRest and we've had few complaints about that overall.

>      > But yeah, having to keep the backups as expanded directories is not
>      > great, I'd love to have .tar. Not necessarily because of the disk
>     space
>      > (in my experience the compression in filesystems works quite well for
>      > this purpose), but mostly because it's more compact and allows
>     working
>>      > with backups as a single piece of data (e.g. it's much clearer
>     what the
>      > checksum of a single .tar is, compared to a directory).
> 
>     But again, object stores are commonly used for backup these days and
>     billing is based on data stored rather than any compression that can be
>     done on the data. Of course, you'd want to store the compressed tars in
>     the object store, but that does mean storing an expanded copy somewhere
>     to do pg_combinebackup.
> 
> Object stores are definitely getting more common. I wish they were 
> getting a lot more common than they actually are, because they simplify 
> a lot.  But they're in my experience still very far from being a majority.

I see it the other way, especially the last few years. The majority seem 
to be object stores followed up closely by NFS. Directly mounted storage 
on the backup host appears to be rarer.

>     But if the argument is that all this can/will be fixed in the future, I
>     guess the smart thing for users to do is wait a few releases for
>     incremental backups to become a practical feature.
> 
> There's always going to be another set of goalposts further ahead. I 
> think it can still be practical for quite a few people.

Since barman uses pg_basebackup in certain cases I imagine that will end 
up being the way most users access this feature.

> I'm more worried about the issue you raised in the other thread about 
> missing files, for example...

Me, too.

Regards,
-David



Re: Add notes to pg_combinebackup docs

From: Tomas Vondra
Date:
On 4/12/24 11:50, David Steele wrote:
> On 4/12/24 19:09, Magnus Hagander wrote:
>> On Fri, Apr 12, 2024 at 12:14 AM David Steele <david@pgmasters.net
>>
>> ...>>
>>      > But yeah, having to keep the backups as expanded directories is
>> not
>>      > great, I'd love to have .tar. Not necessarily because of the disk
>>     space
>>      > (in my experience the compression in filesystems works quite
>> well for
>>      > this purpose), but mostly because it's more compact and allows
>>     working
>>      > with backups as a single piece of data (e.g. it's much clearer
>>     what the
>>      > checksum of a single .tar is, compared to a directory).
>>
>>     But again, object stores are commonly used for backup these days and
>>     billing is based on data stored rather than any compression that
>> can be
>>     done on the data. Of course, you'd want to store the compressed
>> tars in
>>     the object store, but that does mean storing an expanded copy
>> somewhere
>>     to do pg_combinebackup.
>>
>> Object stores are definitely getting more common. I wish they were
>> getting a lot more common than they actually are, because they
>> simplify a lot.  But they're in my experience still very far from
>> being a majority.
> 
> I see it the other way, especially the last few years. The majority seem
> to be object stores followed up closely by NFS. Directly mounted storage
> on the backup host appears to be rarer.
> 

One thing I'd mention is that not having built-in support for .tar and
.tgz backups does not mean it's impossible to use pg_combinebackup with
archives. You can mount them using e.g. "ratarmount" and then use that
as source directories for pg_combinebackup.

It's not entirely friction-less because AFAICS it's necessary to do the
backup in plain format and then create the .tar yourself, to get the
expected "flat" directory structure (and not a manifest + 2x tar). But
other than that it seems to work fine (based on my limited testing).
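A rough sketch of that workflow (mount points and archive names are made up; ratarmount is a third-party FUSE tool and is assumed to be installed):

```shell
# Mount the tar'd plain-format backups read-only via FUSE
ratarmount /archive/full.tar /mnt/full
ratarmount /archive/incr1.tar /mnt/incr1

# Use the mounted archives as source directories
pg_combinebackup /mnt/full /mnt/incr1 -o /restore/pgdata

# Unmount when done
fusermount -u /mnt/incr1
fusermount -u /mnt/full
```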


FWIW the "archivemount" performs terribly, so adding this capability
into pg_combinebackup is clearly far from trivial.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Add notes to pg_combinebackup docs

From: Tomas Vondra
Date:

On 4/12/24 11:12, Magnus Hagander wrote:
> On Tue, Apr 9, 2024 at 11:46 AM Tomas Vondra <tomas.vondra@enterprisedb.com>
> wrote:
> 
>>
>>
>> On 4/9/24 09:59, Martín Marqués wrote:
>>> Hello,
>>>
>>> While doing some work/research on the new incremental backup feature
>>> some limitations were not listed in the docs. Mainly the fact that
>>> pg_combinebackup works with plain format and not tar.
>>>
>>
>> Right. The docs mostly imply this by talking about output directory and
>> backup directories, but making it more explicit would not hurt.
>>
>> FWIW it'd be great if we could make incremental backups work with tar
>> format in the future too. People probably don't want to keep the
>> expanded data directory around, and having to extract everything before
>> combining the backups is not very convenient. Reading and writing the
>> tar would make this simpler.
>>
>>> Around the same time, Tomas Vondra tested incremental backups with a
>>> cluster where he enabled checksums after taking the previous full
>>> backup. After combining the backups the synthetic backup had pages
>>> with checksums and other pages without checksums which ended in
>>> checksum errors.
>>>
>>
>> I'm not sure just documenting this limitation is sufficient. We can't
>> make the incremental backups work in this case (it's as if someone
>> messes with cluster without writing stuff into WAL), but I think we
>> should do better than silently producing (seemingly) corrupted backups
> 
> 
> +1. I think that should be an open item that needs to get sorted.
> 
> 
>> I say seemingly, because the backup is actually fine, the only problem
>> is it has checksums enabled in the controlfile, but the pages from the
>> full backup (and the early incremental backups) have no checksums.
>>
>> What we could do is detect this in pg_combinebackup, and either just
>> disable checksums with a warning and hint to maybe enable them again. Or
>> maybe just print that the user needs to disable them.
>>
> 
> I don't think either of these should be done automatically. Something like
> pg_combinebackup simply failing and requiring you to say
> "--disable-checksums" to have it do that would be the way to go, IMO.
> (once we can reliably detect it of course)
> 

You mean pg_combinebackup would have "--disable-checksums" switch? Yeah,
that'd work, I think. It's probably better than producing a backup that
would seem broken when the user tries to start the instance.

Also, I realized the user probably can't disable the checksums without
starting the instance to finish recovery. And if there are invalid
checksums, I'm not sure that would actually work.

> 
> I was thinking maybe we could detect this while taking the backups, and
>> force taking a full backup if checksums got enabled since the last
>> backup. But we can't do that because we only have the manifest from the
>> last backup, and the manifest does not include info about checksums.
>>
> 
> Can we forcibly read and parse it out of pg_control?
> 

You mean when taking the backup, or during pg_combinebackup?

During backup, it depends. For the instance we should be able to just
get that from the instance, no need to get it from pg_control. But for
the backup (used as a baseline for the increment) we can't read the
pg_control - the only thing we have is the manifest.

During pg_combinebackup we obviously can read pg_control for all the
backups to combine, but at that point it feels a bit too late - it does
not seem great to do backups, and then at recovery to tell the user the
backups are actually not valid.

I think it'd be better to detect this while taking the basebackup.
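Either way, the check itself would be simple once the
data_checksum_version field from each backup's pg_control is in hand; a
hypothetical sketch (no such check exists in pg_combinebackup as of this
thread, and the function below is purely illustrative):

```python
def check_checksum_chain(versions):
    """versions: the data_checksum_version value from each backup's
    pg_control, ordered full backup first, final incremental last;
    0 means data checksums are disabled."""
    if versions[-1] != 0 and any(v == 0 for v in versions[:-1]):
        # Pages copied from the earlier, checksum-less backups would fail
        # verification in the reconstructed cluster.
        raise ValueError(
            "checksums were enabled partway through this backup chain; "
            "the reconstructed cluster would fail checksum verification")
    return True
```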

> 
> It's a bit unfortunate we don't have a way to enable checksums online.
>> That'd not have this problem IIRC, because it writes proper WAL. Maybe
>> it's time to revive that idea ... I recall there were some concerns
>> about tracking progress to allow resuming stuff, but maybe not having
>> anything because in some (rare?) cases it'd need to do more work does
>> not seem like a great trade off.
>>
>>
> For that one I still think it would be perfectly acceptable to have no
> resume at all, but that's a whole different topic :)
> 

I very much agree.

It seems a bit strange that given three options to enable checksums

1) offline
2) online without resume
3) online with resume

the initial argument was that we need to allow resuming the process
because on large systems it might take a lot of time, and we'd lose all
the work if the system restarts. But then we concluded that it's too
complex and it's better if the large systems have to do an extended
outage to enable checksums ...


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Add notes to pg_combinebackup docs

From:
Magnus Hagander
Date:
On Fri, Apr 12, 2024 at 3:01 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:


On 4/12/24 11:12, Magnus Hagander wrote:
> On Tue, Apr 9, 2024 at 11:46 AM Tomas Vondra <tomas.vondra@enterprisedb.com>
> wrote:
>
>>
>>
>> On 4/9/24 09:59, Martín Marqués wrote:
>>> Hello,
>>>
>>> While doing some work/research on the new incremental backup feature
>>> some limitations were not listed in the docs. Mainly the fact that
>>> pg_combinebackup works with plain format and not tar.
>>>
>>
>> Right. The docs mostly imply this by talking about output directory and
>> backup directories, but making it more explicit would not hurt.
>>
>> FWIW it'd be great if we could make incremental backups work with tar
>> format in the future too. People probably don't want to keep around
>> the expanded data directory, and having to extract everything before
>> combining the backups is not very convenient. Reading and writing the
>> tar would make this simpler.
>>
>>> Around the same time, Tomas Vondra tested incremental backups with a
>>> cluster where he enabled checksums after taking the previous full
>>> backup. After combining the backups the synthetic backup had pages
>>> with checksums and other pages without checksums which ended in
>>> checksum errors.
>>>
>>
>> I'm not sure just documenting this limitation is sufficient. We can't
>> make the incremental backups work in this case (it's as if someone
>> messes with cluster without writing stuff into WAL), but I think we
>> should do better than silently producing (seemingly) corrupted backups
>
>
> +1. I think that should be an open item that needs to get sorted.
>
>
>> I say seemingly, because the backup is actually fine, the only problem
>> is it has checksums enabled in the controlfile, but the pages from the
>> full backup (and the early incremental backups) have no checksums.
>>
>> What we could do is detect this in pg_combinebackup, and either just
>> disable checksums with a warning and hint to maybe enable them again. Or
>> maybe just print that the user needs to disable them.
>>
>
> I don't think either of these should be done automatically. Something like
> pg_combinebackup simply failing and requiring you to say
> "--disable-checksums" to have it do that would be the way to go, IMO.
> (once we can reliably detect it of course)
>

You mean pg_combinebackup would have "--disable-checksums" switch? Yeah,
that'd work, I think. It's probably better than producing a backup that
would seem broken when the user tries to start the instance.

Also, I realized the user probably can't disable the checksums without
starting the instance to finish recovery. And if there are invalid
checksums, I'm not sure that would actually work.

>
> I was thinking maybe we could detect this while taking the backups, and
>> force taking a full backup if checksums got enabled since the last
>> backup. But we can't do that because we only have the manifest from the
>> last backup, and the manifest does not include info about checksums.
>>
>
> Can we forcibly read and parse it out of pg_control?
>

You mean when taking the backup, or during pg_combinebackup?

Yes. That way combining the backups into something that doesn't have proper checksums (either by actually turning them off or as today just breaking them and forcing you to turn it off yourself) can only happen intentionally. And if you weren't aware of the problem, it turns into a hard error, so you will notice before it's too late.


During backup, it depends. For the instance we should be able to just
get that from the instance, no need to get it from pg_control. But for
the backup (used as a baseline for the increment) we can't read the
pg_control - the only thing we have is the manifest. 

During pg_combinebackup we obviously can read pg_control for all the
backups to combine, but at that point it feels a bit too late - it does
not seem great to do backups, and then at recovery to tell the user the
backups are actually not valid. 

I think it'd be better to detect this while taking the basebackup.

Agreed. In the end, we might want to do *both*, but the earlier the better.

But to do that, what we'd need is to add a flag to the initial manifest that says "this cluster is supposed to have checksums = <on/off>" and then refuse to take an incremental backup if it changes? It doesn't seem like the end of the world to add that to the manifest?
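The server-side check could then be as simple as the following sketch.
Note "Cluster-Data-Checksums" is the hypothetical manifest key being
floated here; it does not exist in the real backup_manifest format:

```python
import json

def refuse_incremental_if_changed(manifest_text, cluster_checksums_on):
    """manifest_text: the backup_manifest of the backup being used as the
    incremental baseline; cluster_checksums_on: the current setting of the
    cluster about to be backed up."""
    # Hypothetical key, not part of the real backup_manifest format.
    recorded = json.loads(manifest_text).get("Cluster-Data-Checksums")
    if recorded is not None and recorded != cluster_checksums_on:
        raise ValueError(
            "data checksum setting changed since the base backup; "
            "take a new full backup instead of an incremental one")
```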

 
> It's a bit unfortunate we don't have a way to enable checksums online.
>> That'd not have this problem IIRC, because it writes proper WAL. Maybe
>> it's time to revive that idea ... I recall there were some concerns
>> about tracking progress to allow resuming stuff, but maybe not having
>> anything because in some (rare?) cases it'd need to do more work does
>> not seem like a great trade off.
>>
>>
> For that one I still think it would be perfectly acceptable to have no
> resume at all, but that's a whole different topic :)
>

I very much agree.

It seems a bit strange that given three options to enable checksums

1) offline
2) online without resume
3) online with resume

the initial argument was that we need to allow resuming the process
because on large systems it might take a lot of time, and we'd lose all
the work if the system restarts. But then we concluded that it's too
complex and it's better if the large systems have to do an extended
outage to enable checksums ...

Indeed. Or the workaround that still scares the crap out of me where you use a switchover-and-switchback to a replica to "do the offline thing almost online". To me that seems a lot scarier than the original option as well. 

--

Re: Add notes to pg_combinebackup docs

From:
David Steele
Date:

On 4/12/24 22:40, Tomas Vondra wrote:
> On 4/12/24 11:50, David Steele wrote:
>> On 4/12/24 19:09, Magnus Hagander wrote:
>>> On Fri, Apr 12, 2024 at 12:14 AM David Steele <david@pgmasters.net> wrote:
>>>
>>> ...>>
>>>      > But yeah, having to keep the backups as expanded directories is not
>>>      > great, I'd love to have .tar. Not necessarily because of the disk space
>>>      > (in my experience the compression in filesystems works quite well for
>>>      > this purpose), but mostly because it's more compact and allows working
>>>      > with backups as a single piece of data (e.g. it's much clearer what the
>>>      > checksum of a single .tar is, compared to a directory).
>>>
>>>      But again, object stores are commonly used for backup these days and
>>>      billing is based on data stored rather than any compression that can be
>>>      done on the data. Of course, you'd want to store the compressed tars in
>>>      the object store, but that does mean storing an expanded copy somewhere
>>>      to do pg_combinebackup.
>>>
>>> Object stores are definitely getting more common. I wish they were
>>> getting a lot more common than they actually are, because they
>>> simplify a lot.  But they're in my experience still very far from
>>> being a majority.
>>
>> I see it the other way, especially the last few years. The majority seem
>> to be object stores followed up closely by NFS. Directly mounted storage
>> on the backup host appears to be rarer.
>>
> 
> One thing I'd mention is that not having built-in support for .tar and
> .tgz backups does not mean it's impossible to use pg_combinebackup with
> archives. You can mount them using e.g. "ratarmount" and then use that
> as source directories for pg_combinebackup.
> 
> It's not entirely friction-less because AFAICS it's necessary to take
> the backup in plain format and then create the .tar yourself, to get the
> expected "flat" directory structure (and not manifest + 2x tar). But
> other than that it seems to work fine (based on my limited testing).

Well, that's certainly convoluted and doesn't really help a lot in terms
of space consumption; it just shifts the additional space required to
the backup side. I doubt this is something we'd be willing to add to our
documentation, so it would be up to the user to figure out and script.

> FWIW the "archivemount" performs terribly, so adding this capability
> into pg_combinebackup is clearly far from trivial.

I imagine this would perform pretty badly. And yes, doing it efficiently 
is not trivial but certainly doable. Scanning the tar file and matching 
to entries in the manifest is one way, but I would prefer to store the 
offsets into the tar file in the manifest then assemble an ordered list 
of work to do on each tar file. But of course the latter requires a 
manifest-centric approach, which is not what we have right now.

Regards,
-David



Re: Add notes to pg_combinebackup docs

From:
Robert Haas
Date:
On Tue, Apr 9, 2024 at 4:00 AM Martín Marqués <martin.marques@gmail.com> wrote:
> I've attached two patches, the first one is just nit-picking things I
> found when I first read the docs.

I pushed a commit to remove the spurious "the" that you found, but I
don't really agree with the other changes. They all seem grammatically
correct, but I think they're grammatically correct now, too, and I
find it more readable the way it is.

I'll write a separate email about the checksum issue.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Add notes to pg_combinebackup docs

From:
Robert Haas
Date:
On Tue, Apr 9, 2024 at 5:46 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> What we could do is detect this in pg_combinebackup, and either just
> disable checksums with a warning and hint to maybe enable them again. Or
> maybe just print that the user needs to disable them.
>
> I was thinking maybe we could detect this while taking the backups, and
> force taking a full backup if checksums got enabled since the last
> backup. But we can't do that because we only have the manifest from the
> last backup, and the manifest does not include info about checksums.

So, as I see it, we have at least five options here:

1. Document that this doesn't work. By "this doesn't work" I think
what we mean is: (A) If any of the backups you'd need to combine were
taken with checksums off but the last one was taken with checksums on,
then you might get checksum failures on the cluster written by
pg_combinebackup. (B) Therefore, after enabling checksums, you should
take a new full backup and make future incrementals dependent
thereupon. (C) But if you don't, then you can (I presume) run recovery
with ignore_checksum_failure=true to bring the cluster to a consistent
state, stop the database, shut checksums off, optionally also turn
them on again, and restart the database. Then presumably everything
should be OK, except that you'll have wiped out any real checksum
failures that weren't artifacts of the reconstruction process along
with the fake ones.
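Spelled out, the procedure in (C) would be something like the following
untested sketch, where /restore/data stands in for the directory written
by pg_combinebackup (ignore_checksum_failure and pg_checksums are real,
the exact sequencing here is my assumption):

```shell
# Let recovery reach consistency despite stale checksums on pages taken
# from the pre-checksum backups (this also masks any real corruption).
echo "ignore_checksum_failure = on" >> /restore/data/postgresql.conf
pg_ctl -D /restore/data start
pg_ctl -D /restore/data stop    # clean shutdown once recovery finishes
# (remove the ignore_checksum_failure line again afterwards)

# Flip checksums off in the control file; optionally rewrite every page
# to turn them back on (--enable is the slow, page-rewriting part).
pg_checksums --disable -D /restore/data
pg_checksums --enable -D /restore/data
```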

2. As (1), but make check_control_files() emit a warning message when
the problem case is detected.

3. As (2), but also add a command-line option to pg_combinebackup to
flip the checksum flag to false in the control file. Then, if you have
the problem case, instead of following the procedure described above,
you can just use this option, and enable checksums afterward if you
want. It still has the same disadvantage as the procedure described
above: any "real" checksum failures will be suppressed, too. From a
design perspective, this feels like kind of an ugly wart to me: hey,
in this one scenario, you have to add --do-something-random or it
doesn't work! But I see it's got some votes already, so maybe it's the
right answer.

4. Add the checksum state to the backup manifest. Then, if someone
tries to take an incremental backup with checksums on and the
precursor backup had checksums off, we could fail. A strength of this
proposal is that it actually stops the problem from happening at
backup time, which in general is a whole lot nicer than not noticing a
problem until restore time. A possible weakness is that it stops you
from doing something that is ... actually sort of OK. I mean, in the
strict sense, the incremental backup isn't valid, because it's going
to cause checksum failures after reconstruction, but it's valid apart
from the checksums, and those are fixable. I wonder whether users who
encounter this error message will say "oh, I'm glad PostgreSQL
prevented me from doing that" or "oh, I'm annoyed that PostgreSQL
prevented me from doing that."

5. At reconstruction time, notice which backups have checksums
enabled. If the final backup in the chain has them enabled, then
whenever we take a block from an earlier backup with checksums
disabled, re-checksum the block. As opposed to any of the previous
options, this creates a fully correct result, so there's no real need
to document any restrictions on what you're allowed to do. We might
need to document the performance consequences, though: fast file copy
methods will have to be disabled, and we'll have to go block by block
while paying the cost of checksum calculation for each one. You might
be sad to find out that your reconstruction is a lot slower than you
were expecting.
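The control flow of (5) can be sketched as below. The restamp() helper is
only a stand-in: PostgreSQL's real pg_checksum_page() uses an FNV-1a
derived hash mixed with the block number, stored in the pd_checksum field
(bytes 8-9 of the page header); crc32 here is purely illustrative.

```python
import zlib

BLCKSZ = 8192  # PostgreSQL's default block size

def restamp(block, blkno):
    # Zero the checksum field, hash the rest, write the new checksum back.
    # NOT PostgreSQL's actual page checksum algorithm; illustrative only.
    body = block[:8] + b"\x00\x00" + block[10:]
    csum = (zlib.crc32(body) ^ blkno) & 0xFFFF
    return block[:8] + csum.to_bytes(2, "little") + block[10:]

def reconstruct(blocks, block_from_checksummed_backup, target_has_checksums):
    """blocks: the pages chosen for the synthetic backup;
    block_from_checksummed_backup: per-page flag saying whether the
    source backup had checksums enabled when that page was written."""
    out = []
    for i, (blk, had_csum) in enumerate(
            zip(blocks, block_from_checksummed_backup)):
        if target_has_checksums and not had_csum:
            blk = restamp(blk, i)  # recompute rather than copy verbatim
        out.append(blk)
    return out
```

The cost noted above is visible in the structure: whenever any input lacks
checksums, every such page must go through restamp() block by block, so
fast whole-file copy paths cannot be used.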

While I'm probably willing to implement any of these, I have some
reservations about attempting (4) or especially (5) after feature
freeze. I think there's a pretty decent chance that those fixes will
turn out to have issues of their own which we'll then need to fix in
turn. We could perhaps consider doing (2) for now and (5) for a future
release, or something like that.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Add notes to pg_combinebackup docs

From:
Daniel Gustafsson
Date:
> On 18 Apr 2024, at 20:11, Robert Haas <robertmhaas@gmail.com> wrote:

> 2. As (1), but make check_control_files() emit a warning message when
> the problem case is detected.

Being in the post-freeze part of the cycle, this seems like the best option.

> 3. As (2), but also add a command-line option to pg_combinebackup to
> flip the checksum flag to false in the control file.

I don't think this is the way to go, such an option will be plastered on to
helpful tutorials which users will copy/paste from even when they don't need
the option at all, only to disable checksums when there was no reason for doing
so.

> We could perhaps consider doing (2) for now and (5) for a future
> release, or something like that.

+1

--
Daniel Gustafsson




Re: Add notes to pg_combinebackup docs

From:
Robert Haas
Date:
On Thu, Apr 18, 2024 at 3:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> I don't think this is the way to go, such an option will be plastered on to
> helpful tutorials which users will copy/paste from even when they don't need
> the option at all, only to disable checksums when there was no reason for doing
> so.

That's actually a really good point. I mean, it's such an obscure
scenario, maybe it wouldn't make it into the tutorials, but if it
does, it'll never get taken out.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Add notes to pg_combinebackup docs

From:
Robert Haas
Date:
On Thu, Apr 18, 2024 at 3:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> > On 18 Apr 2024, at 20:11, Robert Haas <robertmhaas@gmail.com> wrote:
> > 2. As (1), but make check_control_files() emit a warning message when
> > the problem case is detected.
>
> Being in the post-freeze part of the cycle, this seems like the best option.

Here is a patch for that.

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachments

Re: Add notes to pg_combinebackup docs

From:
Robert Haas
Date:
On Mon, Apr 22, 2024 at 2:52 PM Robert Haas <robertmhaas@gmail.com> wrote:
> On Thu, Apr 18, 2024 at 3:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> > > On 18 Apr 2024, at 20:11, Robert Haas <robertmhaas@gmail.com> wrote:
> > > 2. As (1), but make check_control_files() emit a warning message when
> > > the problem case is detected.
> >
> > Being in the post-freeze part of the cycle, this seems like the best option.
>
> Here is a patch for that.

Is someone willing to review this patch? cfbot seems happy enough with
it, and I'd like to get it committed since it's an open item, but I'd
be happy to have some review before I do that.

If nobody reviews or indicates an intent to do so in the next couple
of days, I'll go ahead and commit this on my own.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Add notes to pg_combinebackup docs

From:
Daniel Gustafsson
Date:
> On 22 Apr 2024, at 20:52, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Thu, Apr 18, 2024 at 3:25 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>>> On 18 Apr 2024, at 20:11, Robert Haas <robertmhaas@gmail.com> wrote:
>>> 2. As (1), but make check_control_files() emit a warning message when
>>> the problem case is detected.
>>
>> Being in the post-freeze part of the cycle, this seems like the best option.
>
> Here is a patch for that.

LGTM; only one small comment which you can ignore if you feel it's not worth
the extra words.

+    <literal>pg_combinebackup</literal> when the checksum status of the
+    cluster has been changed; see

I would have preferred that this sentence included the problematic period for
the change, perhaps "..has been changed after the initial backup." or ideally
something even better.  In other words, clarifying that if checksums were
enabled before any backups were taken this limitation is not in play.  It's not
critical as the link aptly documents this, it just seems like the sentence is
cut short.

--
Daniel Gustafsson




Re: Add notes to pg_combinebackup docs

From:
Robert Haas
Date:
On Wed, Apr 24, 2024 at 3:08 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> LGTM; only one small comment which you can ignore if you feel it's not worth
> the extra words.
>
> +    <literal>pg_combinebackup</literal> when the checksum status of the
> +    cluster has been changed; see
>
> I would have preferred that this sentence included the problematic period for
> the change, perhaps "..has been changed after the initial backup." or ideally
> something even better.  In other words, clarifying that if checksums were
> enabled before any backups were taken this limitation is not in play.  It's not
> critical as the link aptly documents this, it just seems like the sentence is
> cut short.

This was somewhat deliberate. The phraseology that you propose doesn't
exactly seem incorrect to me. However, consider the scenario where
someone takes a full backup A, an incremental backup B based on A, and
another incremental backup C based on B. Then, they combine A with B
to produce X, remove A and B, and later combine X with C. When we talk
about the "initial backup", are we talking about A or X? It doesn't
quite matter, in the end, because if X has a problem it must be
because A had a similar problem, and it's also sort of meaningless to
talk about when X was taken, because it wasn't ever taken from the
origin server; it was reconstructed. And it doesn't matter what was
happening on the origin server at the time it was reconstructed, but
rather what was happening on the origin server at the time its inputs
were taken. Or, in the case of its first input, that could also be a
reconstruction, in which case the time of that reconstruction doesn't
matter either; there can be any number of levels here.

I feel that all of this makes it a little murky to talk about what
happened after the "initial backup". The actual original full backup
need not even exist any more at the time of reconstruction, as in the
example above. Now, I think the user will probably still get the
point, but I also think they'll probably get the point without the
extra verbiage. I think that it will be natural for people to imagine
that what matters is not whether the checksum status has ever changed,
but whether it has changed within the relevant time period, whatever
that is exactly. If they have a more complicated situation where it's
hard to reason about what the relevant time period is, my hope is that
they'll click on the link and that the longer text they see on the
other end will help them think the situation through.

Again, this is not to say that what you're proposing is necessarily
wrong; I'm just explaining my own thinking.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Add notes to pg_combinebackup docs

From:
Daniel Gustafsson
Date:
> On 25 Apr 2024, at 18:16, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Wed, Apr 24, 2024 at 3:08 PM Daniel Gustafsson <daniel@yesql.se> wrote:
>> LGTM; only one small comment which you can ignore if you feel it's not worth
>> the extra words.
>>
>> +    <literal>pg_combinebackup</literal> when the checksum status of the
>> +    cluster has been changed; see
>>
>> I would have preferred that this sentence included the problematic period for
>> the change, perhaps "..has been changed after the initial backup." or ideally
>> something even better.  In other words, clarifying that if checksums were
>> enabled before any backups were taken this limitation is not in play.  It's not
>> critical as the link aptly documents this, it just seems like the sentence is
>> cut short.
>
> This was somewhat deliberate.

Given your reasoning below, +1 on the patch you proposed.

> I think the user will probably still get the
> point, but I also think they'll probably get the point without the
> extra verbiage. I think that it will be natural for people to imagine
> that what matters is not whether the checksum status has ever changed,
> but whether it has changed within the relevant time period, whatever
> that is exactly.

Fair enough.

> Again, this is not to say that what you're proposing is necessarily
> wrong; I'm just explaining my own thinking.

Gotcha, much appreciated.

--
Daniel Gustafsson