Discussion: pg_basebackup --wal-method=fetch
(PG 14, if it matters.)
What's the purpose of fetch mode, as opposed to streaming mode? Is it a legacy of bygone days that just hasn't been deprecated, or is there something I don't understand from reading https://www.postgresql.org/docs/14/app-pgbasebackup.html?
On Thu, Feb 8, 2024, 17:05 Ron Johnson <ronljohnsonjr@gmail.com> wrote:
> (PG 14, if it matters.)
>
> What's the purpose of fetch mode, as opposed to streaming mode? Is it a legacy of bygone days that just hasn't been deprecated, or is there something I don't understand from reading https://www.postgresql.org/docs/14/app-pgbasebackup.html?
Your backup can go to a single file with it, which it can't do in streaming. Which means it can also be sent through a pipe.
It also needs one connection instead of two to the server, if that's limited.
/Magnus
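As an illustration of the "single file through a pipe" point, here is a minimal sketch of a fetch-mode backup written to standard output and piped to another machine. The host names, role name, and remote path are hypothetical placeholders, not values from this thread, and the script only prints the pipeline unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: fetch-mode base backup sent through a pipe to another host.
# PRIMARY_HOST, REPL_USER, BACKUP_HOST and REMOTE_FILE are hypothetical
# placeholders, not values from this thread.
PRIMARY_HOST=${PRIMARY_HOST:-nodeA.example.com}
REPL_USER=${REPL_USER:-replicator}
BACKUP_HOST=${BACKUP_HOST:-backup.example.com}
REMOTE_FILE=${REMOTE_FILE:-/backups/base.tar}

# -D -      write the backup to standard output
# -Ft       tar format (required when writing to stdout)
# -X fetch  include the WAL in the same tar stream, collected at the end
piped_backup_cmd() {
    printf "pg_basebackup -h %s -U %s -D - -Ft -X fetch | ssh %s 'cat > %s'\n" \
        "$PRIMARY_HOST" "$REPL_USER" "$BACKUP_HOST" "$REMOTE_FILE"
}

if [ "${RUN:-0}" = "1" ]; then
    sh -c "$(piped_backup_cmd)"   # actually run the pipeline
else
    piped_backup_cmd              # default: print it for review
fi
```

Writing to standard output this way only works for a cluster without additional tablespaces, and it is not available with -X stream, which is the limitation described above.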
On Thu, Feb 8, 2024 at 12:48 PM Magnus Hagander <magnus@hagander.net> wrote:
> On Thu, Feb 8, 2024, 17:05 Ron Johnson <ronljohnsonjr@gmail.com> wrote:
> > (PG 14, if it matters.)
> >
> > What's the purpose of fetch mode, as opposed to streaming mode? Is it a legacy of bygone days that just hasn't been deprecated, or is there something I don't understand from reading https://www.postgresql.org/docs/14/app-pgbasebackup.html?
>
> Your backup can go to a single file with it, which it can't do in streaming. Which means it can also be sent through a pipe.

But isn't the whole purpose of pg_basebackup (running it on Node B, when the database instance is Node A)?

> It also needs one connection instead of two to the server, if that's limited.

It's 2024, not 2011. Who can't spare an extra connection?
Greetings,

* Ron Johnson (ronljohnsonjr@gmail.com) wrote:
> On Thu, Feb 8, 2024 at 12:48 PM Magnus Hagander <magnus@hagander.net> wrote:
> > On Thu, Feb 8, 2024, 17:05 Ron Johnson <ronljohnsonjr@gmail.com> wrote:
> >> (PG 14, if it matters.)
> >>
> >> What's the purpose of fetch mode, as opposed to streaming mode? Is it a
> >> legacy of bygone days that just hasn't been deprecated, or is there
> >> something I don't understand from reading
> >> https://www.postgresql.org/docs/14/app-pgbasebackup.html?
> >
> > Your backup can go to a single file with it, which it can't do in
> > streaming. Which means it can also be sent through a pipe.
>
> But isn't the whole purpose of pg_basebackup (running it on Node B, when
> the database instance is Node A)?

Something seems missing from this question?

Being able to send through a pipe might allow someone to send directly
to a tape device or to a Bacula system or similar.

> > It also needs one connection instead of two to the server, if that's
> > limited.
>
> It's 2024, not 2011. Who can't spare an extra connection?

Changing max_wal_senders requires a database-wide restart, so..

Not sure where you're going with this though. Are you arguing that
fetch mode should be removed? If so, why? If that's not the angle,
then what is? Would you suggest some better documentation of the
option? I'm sure a proposal to improve the docs would be welcome, if
there's something confusing about them and this option.

Thanks!

Stephen
On Thu, Feb 8, 2024 at 4:41 PM Stephen Frost <sfrost@snowman.net> wrote:
> Greetings,
>
> * Ron Johnson (ronljohnsonjr@gmail.com) wrote:
> > On Thu, Feb 8, 2024 at 12:48 PM Magnus Hagander <magnus@hagander.net> wrote:
> > > On Thu, Feb 8, 2024, 17:05 Ron Johnson <ronljohnsonjr@gmail.com> wrote:
> > >> (PG 14, if it matters.)
> > >>
> > >> What's the purpose of fetch mode, as opposed to streaming mode? Is it a
> > >> legacy of bygone days that just hasn't been deprecated, or is there
> > >> something I don't understand from reading
> > >> https://www.postgresql.org/docs/14/app-pgbasebackup.html?
> > >
> > > Your backup can go to a single file with it, which it can't do in
> > > streaming. Which means it can also be sent through a pipe.
> >
> > But isn't the whole purpose of pg_basebackup (running it on Node B, when
> > the database instance is Node A)?
>
> Something seems missing from this question?

The word "streaming".
Should be "But isn't streaming the whole purpose of pg_basebackup"?

> Being able to send through a pipe might allow someone to send directly
> to a tape device or to a Bacula system or similar.

Yeah, ok.

I use PgBackRest, though, and can't imagine single-threading any reasonably-sized database. In fact, one of the tasks on my mental TODO list is to research how to use PgBackRest to initialize a replica instance prior to starting Streaming Replication.

> > > It also needs one connection instead of two to the server, if that's
> > > limited.
> >
> > It's 2024, not 2011. Who can't spare an extra connection?
>
> Changing max_wal_senders requires a database-wide restart, so..

To not have some wiggle room is poor planning.

> Not sure where you're going with this though. Are you arguing that
> fetch mode should be removed?

No. Just curious about its use cases.

> If so, why? If that's not the angle,
> then what is? Would you suggest some better documentation of the
> option? I'm sure a proposal to improve the docs would be welcome, if
> there's something confusing about them and this option.

A hint as to the use-case for the non-default "streaming" option would be enlightening.
Greetings,

* Ron Johnson (ronljohnsonjr@gmail.com) wrote:
> On Thu, Feb 8, 2024 at 4:41 PM Stephen Frost <sfrost@snowman.net> wrote:
> > * Ron Johnson (ronljohnsonjr@gmail.com) wrote:
> > > On Thu, Feb 8, 2024 at 12:48 PM Magnus Hagander <magnus@hagander.net> wrote:
> > > > On Thu, Feb 8, 2024, 17:05 Ron Johnson <ronljohnsonjr@gmail.com> wrote:
> > > >> (PG 14, if it matters.)
> > > >>
> > > >> What's the purpose of fetch mode, as opposed to streaming mode? Is it a
> > > >> legacy of bygone days that just hasn't been deprecated, or is there
> > > >> something I don't understand from reading
> > > >> https://www.postgresql.org/docs/14/app-pgbasebackup.html?
> > > >
> > > > Your backup can go to a single file with it, which it can't do in
> > > > streaming. Which means it can also be sent through a pipe.
> > >
> > > But isn't the whole purpose of pg_basebackup (running it on Node B, when
> > > the database instance is Node A)?
> >
> > Something seems missing from this question?
>
> The word "streaming".
> Should be "But isn't streaming the whole purpose of pg_basebackup"?

I'm a bit confused on this point still as if the whole purpose of
pg_basebackup is to be streaming ... then we should be defaulting to
fetch mode still?

> I use PgBackRest, though, and can't imagine single-threading any
> reasonably-sized database. In fact, one of the tasks on my mental TODO
> list is to research how to use PgBackRest to initialize a replica instance
> prior to starting Streaming Replication.

I use pgbackrest too. ;) That said, one of the longest poles in the
tent when it comes to dealing with backups is compression- and there are
tools like pigz that allow you to multi-thread a single stream across
many cores, so it isn't necessarily the case that using fetch mode or
pg_basebackup generally makes everything have to be completely
single-process. Of course, pgbackrest has a lot of other features that
make it a great tool to use.

In terms of using pgbackrest to initialize a replica ... that's
basically running 'pgbackrest restore --type=standby'? There's really
not much more to it than that. pgbackrest will set up the restored
system to replay from the WAL in the archive, you'd just need to
configure primary_conninfo so that the replica will attempt to connect
to the primary once it's caught up with all of the WAL in the archive.

> > > > It also needs one connection instead of two to the server, if that's
> > > > limited.
> > >
> > > It's 2024, not 2011. Who can't spare an extra connection?
> >
> > Changing max_wal_senders requires a database-wide restart, so..
>
> To not have some wiggle room is poor planning

Doesn't change reality though.

> > If so, why? If that's not the angle,
> > then what is? Would you suggest some better documentation of the
> > option? I'm sure a proposal to improve the docs would be welcome, if
> > there's something confusing about them and this option.
>
> A hint as to the use-case for the non-default "streaming" option would be
> enlightening.

I mentioned a couple of them above and there is an example of streaming
in the documentation today:

######
To create a backup of a single-tablespace local database and compress
this with bzip2:

$ pg_basebackup -D - -Ft -X fetch | bzip2 > backup.tar.bz2
######

Would it make things more clear if this was an example that sent data to
a tape device instead of through bzip2..?

Thanks!

Stephen
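The remark about pigz can be sketched as a variation on the documentation example, with the single tar stream compressed on several cores. This is an illustrative sketch, not a tuned recipe: the thread count and output file name are assumed defaults, and the script prints the pipeline unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: compress a fetch-mode tar stream on several cores with pigz.
# THREADS and OUT_FILE are hypothetical defaults, not values from this thread.
THREADS=${THREADS:-4}
OUT_FILE=${OUT_FILE:-backup.tar.gz}

pigz_backup_cmd() {
    # pigz -p N sets the number of compression threads
    printf 'pg_basebackup -D - -Ft -X fetch | pigz -p %s > %s\n' \
        "$THREADS" "$OUT_FILE"
}

if [ "${RUN:-0}" = "1" ]; then
    sh -c "$(pigz_backup_cmd)"   # actually run the pipeline
else
    pigz_backup_cmd              # default: print it for review
fi
```

Only the compression step is parallel here; pg_basebackup itself still produces one stream.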
On Thu, Feb 8, 2024 at 5:21 PM Stephen Frost <sfrost@snowman.net> wrote:
> Greetings,
>
> * Ron Johnson (ronljohnsonjr@gmail.com) wrote:
> > The word "streaming".
> > Should be "But isn't streaming the whole purpose of pg_basebackup"?
>
> I'm a bit confused on this point still as if the whole purpose of
> pg_basebackup is to be streaming ... then we should be defaulting to
> fetch mode still?

No. Since I thought streaming is the whole purpose of pg_basebackup, I questioned the utility of every other method except --wal-method=streaming.

> > I use PgBackRest, though, and can't imagine single-threading any
> > reasonably-sized database. In fact, one of the tasks on my mental TODO
> > list is to research how to use PgBackRest to initialize a replica instance
> > prior to starting Streaming Replication.

[snip]

> In terms of using pgbackrest to initialize a replica ... that's
> basically running 'pgbackrest restore --type=standby'? There's really
> not much more to it than that. pgbackrest will set up the restored
> system to replay from the WAL in the archive, you'd just need to
> configure primary_conninfo so that the replica will attempt to connect
> to the primary once it's caught up with all of the WAL in the archive.

I haven't examined it closely enough. What little looking that I did made me wonder whether PgBackRest handled all the replication itself, or whether it just initialized everything and then let physical replication using replication slots take over.
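For reference, the restore described in the quoted text might look roughly like the sketch below. The stanza name, host, and role are hypothetical placeholders; the sketch assumes pgBackRest is already configured for the stanza, and it prints the command unless RUN=1 is set. Per the description above, pgBackRest only restores the cluster and sets it up to replay archived WAL; the connection to the primary afterwards is PostgreSQL's own, driven by primary_conninfo.

```shell
#!/bin/sh
# Sketch only: restore a backup as a standby with pgBackRest.
# STANZA, PRIMARY_HOST and REPL_USER are hypothetical placeholders.
STANZA=${STANZA:-main}
PRIMARY_HOST=${PRIMARY_HOST:-nodeA.example.com}
REPL_USER=${REPL_USER:-replicator}

# --type=standby     restored cluster starts as a standby replaying archived WAL
# --recovery-option  passes primary_conninfo through to the PostgreSQL
#                    configuration, so PostgreSQL connects to the primary
#                    once it has caught up with the archive
standby_restore_cmd() {
    printf 'pgbackrest --stanza=%s --type=standby "--recovery-option=primary_conninfo=host=%s user=%s" restore\n' \
        "$STANZA" "$PRIMARY_HOST" "$REPL_USER"
}

if [ "${RUN:-0}" = "1" ]; then
    sh -c "$(standby_restore_cmd)"   # actually run the restore
else
    standby_restore_cmd              # default: print it for review
fi
```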
On Fri, Feb 9, 2024 at 12:11 AM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
>
> On Thu, Feb 8, 2024 at 5:21 PM Stephen Frost <sfrost@snowman.net> wrote:
>>
>> Greetings,
>>
>> * Ron Johnson (ronljohnsonjr@gmail.com) wrote:
>>
>> > The word "streaming".
>> > Should be "But isn't streaming the whole purpose of pg_basebackup"?
>>
>> I'm a bit confused on this point still as if the whole purpose of
>> pg_basebackup is to be streaming ... then we should be defaulting to
>> fetch mode still?
>
> No. Since I thought streaming is the whole purpose of pg_basebackup, I questioned the utility of every other method except --wal-method=streaming.

Well, "fetch" mode also does streaming... It just does streaming at
the end instead of parallel. So I guess one could argue the names
chosen are wrong, and should be named something like "parallel" and
"sequential". But I think the ship on renaming that has sailed a long
time ago...

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
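To make the "parallel" versus "sequential" distinction concrete, here is a small sketch that builds the two invocations side by side. The helper name basebackup_cmd and the target directories are hypothetical; it only prints the commands and rejects any method name other than the two discussed here.

```shell
#!/bin/sh
# Sketch only: print pg_basebackup invocations for the two WAL methods
# discussed in this thread. basebackup_cmd is a hypothetical helper.
basebackup_cmd() {
    mode=$1
    target=$2
    case "$mode" in
        stream) ;;  # default: WAL streamed over a second connection while the backup runs
        fetch)  ;;  # WAL collected at the end, after the data files
        *) echo "unknown WAL method: $mode (expected stream or fetch)" >&2
           return 1 ;;
    esac
    printf 'pg_basebackup -D %s -X %s\n' "$target" "$mode"
}

basebackup_cmd stream /tmp/backup_stream
basebackup_cmd fetch  /tmp/backup_fetch
```

Because fetch collects WAL only at the end, the server has to retain the needed segments for the whole duration of the backup; the pg_basebackup documentation suggests setting wal_keep_size high enough for that.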