Thread: pg_basebackup --wal-method=fetch

pg_basebackup --wal-method=fetch

From
Ron Johnson
Date:
(PG 14, if it matters.)

What's the purpose of fetch mode, as opposed to streaming mode?  Is it a legacy of bygone days that just hasn't been deprecated, or is there something I don't understand from reading https://www.postgresql.org/docs/14/app-pgbasebackup.html?

Re: pg_basebackup --wal-method=fetch

From
Magnus Hagander
Date:


On Thu, Feb 8, 2024, 17:05 Ron Johnson <ronljohnsonjr@gmail.com> wrote:
> (PG 14, if it matters.)
>
> What's the purpose of fetch mode, as opposed to streaming mode?  Is it a legacy of bygone days that just hasn't been deprecated, or is there something I don't understand from reading https://www.postgresql.org/docs/14/app-pgbasebackup.html?

Your backup can go to a single file with it, which it can't do in streaming. Which means it can also be sent through a pipe. 

It also needs one connection instead of two to the server, if that's limited. 

/Magnus 

Re: pg_basebackup --wal-method=fetch

From
Ron Johnson
Date:
On Thu, Feb 8, 2024 at 12:48 PM Magnus Hagander <magnus@hagander.net> wrote:
> On Thu, Feb 8, 2024, 17:05 Ron Johnson <ronljohnsonjr@gmail.com> wrote:
>> (PG 14, if it matters.)
>>
>> What's the purpose of fetch mode, as opposed to streaming mode?  Is it a legacy of bygone days that just hasn't been deprecated, or is there something I don't understand from reading https://www.postgresql.org/docs/14/app-pgbasebackup.html?
>
> Your backup can go to a single file with it, which it can't do in streaming. Which means it can also be sent through a pipe.

But isn't the whole purpose of pg_basebackup (running it on Node B, when the database instance is Node A)?

> It also needs one connection instead of two to the server, if that's limited.

It's 2024, not 2011.  Who can't spare an extra connection?

Re: pg_basebackup --wal-method=fetch

From
Stephen Frost
Date:
Greetings,

* Ron Johnson (ronljohnsonjr@gmail.com) wrote:
> On Thu, Feb 8, 2024 at 12:48 PM Magnus Hagander <magnus@hagander.net> wrote:
> > On Thu, Feb 8, 2024, 17:05 Ron Johnson <ronljohnsonjr@gmail.com> wrote:
> >> (PG 14, if it matters.)
> >>
> >> What's the purpose of fetch mode, as opposed to streaming mode?  Is it a
> >> legacy of bygone days that just hasn't been deprecated, or is there
> >> something I don't understand from reading
> >> https://www.postgresql.org/docs/14/app-pgbasebackup.html?
> >
> > Your backup can go to a single file with it, which it can't do in
> > streaming. Which means it can also be sent through a pipe.
>
> But isn't the whole purpose of pg_basebackup (running it on Node B, when
> the database instance is Node A)?

Something seems missing from this question?

Being able to send through a pipe might allow someone to send directly
to a tape device or to a Bacula system or similar.

> > It also needs one connection instead of two to the server, if that's
> > limited.
>
> It's 2024, not 2011.  Who can't spare an extra connection?

Changing max_wal_senders requires a database-wide restart, so..
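
To see how close a server is to that limit, something like the following could be run before choosing a WAL method. This is only a sketch: the function name free_wal_senders is hypothetical, and it assumes psql can reach the server using connection settings from the environment (PGHOST, PGUSER and so on). pg_basebackup in stream mode needs two free WAL sender slots, fetch mode needs one.

```shell
# Sketch: estimate how many WAL sender slots are currently free.
# Assumes psql connection settings come from the environment.
free_wal_senders() {
    # Configured limit; changing it requires a server restart.
    max=$(psql -At -c "SHOW max_wal_senders")
    # One row per active WAL sender process.
    used=$(psql -At -c "SELECT count(*) FROM pg_stat_replication")
    echo $((max - used))
}
```

Calling free_wal_senders prints a single number; a result below 2 would suggest that a stream-mode backup cannot get both of its connections.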

Not sure where you're going with this though.  Are you arguing that
fetch mode should be removed?  If so, why?  If that's not the angle,
then what is?  Would you suggest some better documentation of the
option?  I'm sure a proposal to improve the docs would be welcome, if
there's something confusing about them and this option.

Thanks!

Stephen

Re: pg_basebackup --wal-method=fetch

From
Ron Johnson
Date:
On Thu, Feb 8, 2024 at 4:41 PM Stephen Frost <sfrost@snowman.net> wrote:
> Greetings,
>
> * Ron Johnson (ronljohnsonjr@gmail.com) wrote:
> > On Thu, Feb 8, 2024 at 12:48 PM Magnus Hagander <magnus@hagander.net> wrote:
> > > On Thu, Feb 8, 2024, 17:05 Ron Johnson <ronljohnsonjr@gmail.com> wrote:
> > >> (PG 14, if it matters.)
> > >>
> > >> What's the purpose of fetch mode, as opposed to streaming mode?  Is it a
> > >> legacy of bygone days that just hasn't been deprecated, or is there
> > >> something I don't understand from reading
> > >> https://www.postgresql.org/docs/14/app-pgbasebackup.html?
> > >
> > > Your backup can go to a single file with it, which it can't do in
> > > streaming. Which means it can also be sent through a pipe.
> >
> > But isn't the whole purpose of pg_basebackup (running it on Node B, when
> > the database instance is Node A)?
>
> Something seems missing from this question?

The word "streaming".
Should be "But isn't streaming the whole purpose of pg_basebackup"?

> Being able to send through a pipe might allow someone to send directly
> to a tape device or to a Bacula system or similar.

Yeah, ok.

I use PgBackRest, though, and can't imagine single-threading any reasonably-sized database.  In fact, one of the tasks on my mental TODO list is to research how to use PgBackRest to initialize a replica instance prior to starting Streaming Replication.

> > > It also needs one connection instead of two to the server, if that's
> > > limited.
> >
> > It's 2024, not 2011.  Who can't spare an extra connection?
>
> Changing max_wal_senders requires a database-wide restart, so..

To not have some wiggle room is poor planning

> Not sure where you're going with this though.  Are you arguing that
> fetch mode should be removed?

No.  Just curious about its use cases.

> If so, why?  If that's not the angle,
> then what is?  Would you suggest some better documentation of the
> option?  I'm sure a proposal to improve the docs would be welcome, if
> there's something confusing about them and this option.

A hint as to the use-case for the non-default "streaming" option would be enlightening.

Re: pg_basebackup --wal-method=fetch

From
Stephen Frost
Date:
Greetings,

* Ron Johnson (ronljohnsonjr@gmail.com) wrote:
> On Thu, Feb 8, 2024 at 4:41 PM Stephen Frost <sfrost@snowman.net> wrote:
> > * Ron Johnson (ronljohnsonjr@gmail.com) wrote:
> > > On Thu, Feb 8, 2024 at 12:48 PM Magnus Hagander <magnus@hagander.net> wrote:
> > > > On Thu, Feb 8, 2024, 17:05 Ron Johnson <ronljohnsonjr@gmail.com> wrote:
> > > >> (PG 14, if it matters.)
> > > >>
> > > >> What's the purpose of fetch mode, as opposed to streaming mode?  Is it a
> > > >> legacy of bygone days that just hasn't been deprecated, or is there
> > > >> something I don't understand from reading
> > > >> https://www.postgresql.org/docs/14/app-pgbasebackup.html?
> > > >
> > > > Your backup can go to a single file with it, which it can't do in
> > > > streaming. Which means it can also be sent through a pipe.
> > >
> > > But isn't the whole purpose of pg_basebackup (running it on Node B, when
> > > the database instance is Node A)?
> >
> > Something seems missing from this question?
>
> The word "streaming".
> Should be "But isn't streaming the whole purpose of pg_basebackup"?

I'm a bit confused on this point still as if the whole purpose of
pg_basebackup is to be streaming ... then we should be defaulting to
fetch mode still?

> I use PgBackRest, though, and can't imagine single-threading any
> reasonably-sized database.  In fact, one of the tasks on my mental TODO
> list is to research how to use PgBackRest to initialize a replica instance
> prior to starting Streaming Replication.

I use pgbackrest too. ;)  That said, one of the longest poles in the
tent when it comes to dealing with backups is compression- and there are
tools like pigz that allow you to multi-thread a single stream across
many cores, so it isn't necessarily the case that using fetch mode or
pg_basebackup generally makes everything have to be completely
single-process.  Of course, pgbackrest has a lot of other features that
make it a great tool to use.
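
As a rough illustration of the pigz point, the single tar stream that fetch mode writes to stdout can be handed to a multi-threaded compressor. This is a sketch under assumptions: the function name backup_with_pigz, the output path argument, and the default of 4 threads are all hypothetical, and writing tar to stdout only works for a cluster with no additional tablespaces.

```shell
# Sketch: one pg_basebackup stream, compressed across several cores by pigz.
#   $1  output file (hypothetical path chosen by the caller)
#   $2  number of pigz compression threads (defaults to 4 here)
backup_with_pigz() {
    # -D - writes the tar stream to stdout; -X fetch puts the WAL into
    # that same stream at the end of the backup.
    pg_basebackup -D - -Ft -X fetch | pigz -p "${2:-4}" > "$1"
}
```

For example, backup_with_pigz backup.tar.gz 8 would compress with up to eight threads while pg_basebackup itself remains a single process.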

In terms of using pgbackrest to initialize a replica ... that's
basically running 'pgbackrest restore --type=standby'?  There's really
not much more to it than that.  pgbackrest will set up the restored
system to replay from the WAL in the archive, you'd just need to
configure primary_conninfo so that the replica will attempt to connect
to the primary once it's caught up with all of the WAL in the archive.
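
The restore described above might look roughly like the following. It is a sketch only: the function name restore_standby, the stanza name and the connection string are hypothetical, and it assumes pgbackrest is already configured on the replica host with a stopped cluster and a data directory ready to be restored into.

```shell
# Sketch: restore the latest backup as a standby and point it at the primary.
#   $1  stanza name (hypothetical, e.g. "main")
#   $2  primary_conninfo value (hypothetical, e.g. "host=primary.example user=replicator")
restore_standby() {
    # --type=standby sets the restored cluster up as a standby that replays
    # WAL from the archive; --recovery-option passes primary_conninfo so it
    # can connect to the primary once the archive is exhausted.
    pgbackrest --stanza="$1" --type=standby \
        --recovery-option="primary_conninfo=$2" restore
}
```

After the restore, starting PostgreSQL on the replica should begin archive replay and, with primary_conninfo set, switch to streaming from the primary.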

> > > > It also needs one connection instead of two to the server, if that's
> > > > limited.
> > >
> > > It's 2024, not 2011.  Who can't spare an extra connection?
> >
> > Changing max_wal_senders requires a database-wide restart, so..
>
> To not have some wiggle room is poor planning

Doesn't change reality though.

> > If so, why?  If that's not the angle,
> > then what is?  Would you suggest some better documentation of the
> > option?  I'm sure a proposal to improve the docs would be welcome, if
> > there's something confusing about them and this option.
>
> A hint as to the use-case for the non-default "streaming" option would be
> enlightening.

I mentioned a couple of them above and there is an example of streaming
in the documentation today:

######
To create a backup of a single-tablespace local database and compress
this with bzip2:

$ pg_basebackup -D - -Ft -X fetch | bzip2 > backup.tar.bz2
######

Would it make things more clear if this was an example that sent data to
a tape device instead of through bzip2..?

Thanks!

Stephen

Re: pg_basebackup --wal-method=fetch

From
Ron Johnson
Date:
On Thu, Feb 8, 2024 at 5:21 PM Stephen Frost <sfrost@snowman.net> wrote:
> Greetings,
>
> * Ron Johnson (ronljohnsonjr@gmail.com) wrote:
>
> > The word "streaming".
> > Should be "But isn't streaming the whole purpose of pg_basebackup"?
>
> I'm a bit confused on this point still as if the whole purpose of
> pg_basebackup is to be streaming ... then we should be defaulting to
> fetch mode still?

No.  Since I thought streaming is the whole purpose of pg_basebackup, I questioned the utility of every other method except --wal-method=streaming.

> > I use PgBackRest, though, and can't imagine single-threading any
> > reasonably-sized database.  In fact, one of the tasks on my mental TODO
> > list is to research how to use PgBackRest to initialize a replica instance
> > prior to starting Streaming Replication.
>
> [snip]
> In terms of using pgbackrest to initialize a replica ... that's
> basically running 'pgbackrest restore --type=standby'?  There's really
> not much more to it than that.  pgbackrest will set up the restored
> system to replay from the WAL in the archive, you'd just need to
> configure primary_conninfo so that the replica will attempt to connect
> to the primary once it's caught up with all of the WAL in the archive.

I haven't examined it closely enough.  What little looking that I did made me wonder whether PgBackRest handled all the replication itself, or whether it just initialized everything and then let physical replication using replication slots take over.

Re: pg_basebackup --wal-method=fetch

From
Magnus Hagander
Date:
On Fri, Feb 9, 2024 at 12:11 AM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
>
> On Thu, Feb 8, 2024 at 5:21 PM Stephen Frost <sfrost@snowman.net> wrote:
>>
>> Greetings,
>>
>> * Ron Johnson (ronljohnsonjr@gmail.com) wrote:
>>
>> > The word "streaming".
>> > Should be "But isn't streaming the whole purpose of pg_basebackup"?
>>
>> I'm a bit confused on this point still as if the whole purpose of
>> pg_basebackup is to be streaming ... then we should be defaulting to
>> fetch mode still?
>
>
> No.  Since I thought streaming is the whole purpose of pg_basebackup, I questioned the utility of every other method except --wal-method=streaming.

Well, "fetch" mode also does streaming... It just does streaming at
the end instead of parallel. So I guess one could argue the names
chosen are wrong, and should be named something like "parallel" and
"sequential". But I think the ship on renaming that has sailed a long
time ago...
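
To make the "parallel" versus "sequential" distinction concrete, here is a sketch of the two invocations side by side. The function names backup_parallel and backup_sequential are hypothetical and simply borrow the names suggested above; the target directory is whatever the caller passes in.

```shell
# Sketch: the two WAL methods, using the names suggested above.

# "parallel": WAL is streamed over a second connection while the data
# files are being copied (the default, -X stream).
backup_parallel() {
    pg_basebackup -D "$1" -Fp -X stream
}

# "sequential": WAL is collected at the end of the backup over the same
# connection (-X fetch). The server must keep the needed WAL until then,
# for example through a large enough wal_keep_size.
backup_sequential() {
    pg_basebackup -D "$1" -Fp -X fetch
}
```

Both produce a plain-format backup in the given directory; the difference lies in when the WAL is transferred and how many connections are used.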

--
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/