Обсуждение: Improve documentation for pg_upgrade, standbys and rsync

Поиск
Список
Период
Сортировка

Improve documentation for pg_upgrade, standbys and rsync

От
Laurenz Albe
Дата:
I revently tried to upgrade a standby following the documentation,
but I found it hard to understand, and it took me several tries to
get it right.  This is of course owing to my lack of expertise with
rsync, but I think the documentation and examples could be clearer.

I think it would be a good idea to recommend the --relative option
of rsync.

Here is a patch that does that, as well as update the versions in
the code samples to something more recent.  Also, I think it makes
sense to place the data directory in the sample in /var/lib/postgresql,
which is similar to what many people will have in real life.

Yours,
Laurenz Albe

Вложения

Re: Improve documentation for pg_upgrade, standbys and rsync

От
Stephen Frost
Дата:
Greetings,

* Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> I revently tried to upgrade a standby following the documentation,
> but I found it hard to understand, and it took me several tries to
> get it right.  This is of course owing to my lack of expertise with
> rsync, but I think the documentation and examples could be clearer.
>
> I think it would be a good idea to recommend the --relative option
> of rsync.
>
> Here is a patch that does that, as well as update the versions in
> the code samples to something more recent.  Also, I think it makes
> sense to place the data directory in the sample in /var/lib/postgresql,
> which is similar to what many people will have in real life.

Haven't had a chance to look at this in depth but improving things here
would be good.

An additional thing that we should really be mentioning is to tell
people to go in and TRUNCATE all of their UNLOGGED tables before going
through this process, otherwise the rsync will end up spending a bunch
of time copying the files for UNLOGGED relations which you really don't
want.

Thanks,

Stephen

Вложения

Re: Improve documentation for pg_upgrade, standbys and rsync

От
Laurenz Albe
Дата:
On Wed, 2021-05-19 at 10:31 -0400, Stephen Frost wrote:
> * Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> > I revently tried to upgrade a standby following the documentation,
> > but I found it hard to understand, [...]
>
> Haven't had a chance to look at this in depth but improving things here
> would be good.
> 
> An additional thing that we should really be mentioning is to tell
> people to go in and TRUNCATE all of their UNLOGGED tables before going
> through this process, otherwise the rsync will end up spending a bunch
> of time copying the files for UNLOGGED relations which you really don't
> want.

Thanks for the feedback and the suggestion.
CCing -hackers so that I can add it to the commitfest.

Yours,
Laurenz Albe




Re: Improve documentation for pg_upgrade, standbys and rsync

От
Laurenz Albe
Дата:
On Wed, 2021-05-19 at 10:31 -0400, Stephen Frost wrote:
> * Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> > I revently tried to upgrade a standby following the documentation,
> > but I found it hard to understand, [...]
>
> Haven't had a chance to look at this in depth but improving things here
> would be good.
> 
> An additional thing that we should really be mentioning is to tell
> people to go in and TRUNCATE all of their UNLOGGED tables before going
> through this process, otherwise the rsync will end up spending a bunch
> of time copying the files for UNLOGGED relations which you really don't
> want.

Thanks for the feedback and the suggestion.
CCing -hackers so that I can add it to the commitfest.

Yours,
Laurenz Albe




Re: Improve documentation for pg_upgrade, standbys and rsync

От
Laurenz Albe
Дата:
On Wed, 2021-05-19 at 10:31 -0400, Stephen Frost wrote:
> * Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> > I revently tried to upgrade a standby following the documentation,
> > but I found it hard to understand, and it took me several tries to
> > get it right.  This is of course owing to my lack of expertise with
> > rsync, but I think the documentation and examples could be clearer.
> > 
> > I think it would be a good idea to recommend the --relative option
> > of rsync.
> 
> An additional thing that we should really be mentioning is to tell
> people to go in and TRUNCATE all of their UNLOGGED tables before going
> through this process, otherwise the rsync will end up spending a bunch
> of time copying the files for UNLOGGED relations which you really don't
> want.

I have thought about that some more, and I am not certain that we should
unconditionally recommend that.  Perhaps the pain of rebuilding the
unlogged table on the primary would be worse than rsyncing it to the
standby.

The documentation already mentions

  "Unfortunately, rsync needlessly copies files associated with temporary
   and unlogged tables because these files don't normally exist on standby
   servers."

I'd say that is good enough, and people can draw their conclusions from
that.

Attached is a new patch with an added reminder to create "standby.signal",
as mentioned in [1].

Yours,
Laurenz Albe

 [1]: https://www.postgr.es/m/1A5A1B6E-7BB6-47EB-8443-40222B769404@iris.washington.edu

Вложения

Re: Improve documentation for pg_upgrade, standbys and rsync

От
Stephen Frost
Дата:
Greetings,

* Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> On Wed, 2021-05-19 at 10:31 -0400, Stephen Frost wrote:
> > * Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> > > I revently tried to upgrade a standby following the documentation,
> > > but I found it hard to understand, and it took me several tries to
> > > get it right.  This is of course owing to my lack of expertise with
> > > rsync, but I think the documentation and examples could be clearer.
> > >
> > > I think it would be a good idea to recommend the --relative option
> > > of rsync.
> >
> > An additional thing that we should really be mentioning is to tell
> > people to go in and TRUNCATE all of their UNLOGGED tables before going
> > through this process, otherwise the rsync will end up spending a bunch
> > of time copying the files for UNLOGGED relations which you really don't
> > want.
>
> I have thought about that some more, and I am not certain that we should
> unconditionally recommend that.  Perhaps the pain of rebuilding the
> unlogged table on the primary would be worse than rsyncing it to the
> standby.

I disagree entirely.  The reason to even consider using this approach is
to minimize the time required to get things back online and there's no
question that having the unlogged tables get rsync'd across would
increase the time required.

> The documentation already mentions
>
>   "Unfortunately, rsync needlessly copies files associated with temporary
>    and unlogged tables because these files don't normally exist on standby
>    servers."
>
> I'd say that is good enough, and people can draw their conclusions from
> that.

I disagree.  Instead, we should have explicit steps included which
detail how to find and truncate unlogged tables and what to do to remove
or exclude temporary files once the server is shut down.

> Attached is a new patch with an added reminder to create "standby.signal",
> as mentioned in [1].
>
> Yours,
> Laurenz Albe
>
>  [1]: https://www.postgr.es/m/1A5A1B6E-7BB6-47EB-8443-40222B769404@iris.washington.edu

> From 47b685b700548af06ab08673187bdd1df7236464 Mon Sep 17 00:00:00 2001
> From: Laurenz Albe <laurenz.albe@cybertec.at>
> Date: Fri, 16 Jul 2021 07:45:22 +0200
> Subject: [PATCH] Improve doc for pg_upgrade and standby servers
>
> Recommend using the --relative option of rsync for clarity
> and adapt the code samples accordingly.
> Using relative paths makes clearer what is meant by "current
> directory" and "remote_dir".

I'm not really convinced that this is actually a positive change, though
I don't know that it's really a negative one either.  In general, I
prefer fully qualified paths to try and make things very clear about
what's happening, but this is also a bit of an odd case due to hard
links, etc.

> Add a reminder that "standby.signal" needs to be created.

This makes sense to include, certainly, but it should be an explicit
step, not just a "don't forget" note at the end.  I'm not really sure
why we talk about "log shipping" either..?  Wouldn't it make more sense
to have something like:

g. Configure standby servers

Review the prior configuration of the standby servers and set up the
same configuration in the newly rsync'd directory.

1. touch /path/to/replica/standby.signal
2. Configure restore_command to pull from WAL archive
3. For streaming replicas, configure primary_conninfo

Probably back-patched all the way, with adjustments made for the pre-12
releases accordingly.

Thanks,

Stephen

Вложения

Re: Improve documentation for pg_upgrade, standbys and rsync

От
Laurenz Albe
Дата:
Thanks for looking at this!

On Fri, 2021-07-16 at 09:17 -0400, Stephen Frost wrote:
> > > An additional thing that we should really be mentioning is to tell
> > > people to go in and TRUNCATE all of their UNLOGGED tables before going
> > > through this process, otherwise the rsync will end up spending a bunch
> > > of time copying the files for UNLOGGED relations which you really don't
> > > want.
> > 
> > I have thought about that some more, and I am not certain that we should
> > unconditionally recommend that.  Perhaps the pain of rebuilding the
> > unlogged table on the primary would be worse than rsyncing it to the
> > standby.
> 
> I disagree entirely.  The reason to even consider using this approach is
> to minimize the time required to get things back online and there's no
> question that having the unlogged tables get rsync'd across would
> increase the time required.

I am not totally convinced that minimal down time is always more important
than keeping your unlogged tables, but I have adapted the patch accordingly.

> > The documentation already mentions
> > 
> >   "Unfortunately, rsync needlessly copies files associated with temporary
> >    and unlogged tables because these files don't normally exist on standby
> >    servers."
> > 
> > I'd say that is good enough, and people can draw their conclusions from
> > that.
> 
> I disagree.  Instead, we should have explicit steps included which
> detail how to find and truncate unlogged tables and what to do to remove
> or exclude temporary files once the server is shut down.

Ok, done.

> > Recommend using the --relative option of rsync for clarity
> > and adapt the code samples accordingly.
> > Using relative paths makes clearer what is meant by "current
> > directory" and "remote_dir".
> 
> I'm not really convinced that this is actually a positive change, though
> I don't know that it's really a negative one either.  In general, I
> prefer fully qualified paths to try and make things very clear about
> what's happening, but this is also a bit of an odd case due to hard
> links, etc.

I normally prefer absolute paths as well.
But that is the only way I got it to run, and I think that in this
case it adds clarity to have the data directories relative to your
current working directory.

> > Add a reminder that "standby.signal" needs to be created.
> 
> This makes sense to include, certainly, but it should be an explicit
> step, not just a "don't forget" note at the end.  I'm not really sure
> why we talk about "log shipping" either..?  Wouldn't it make more sense
> to have something like:
> 
> g. Configure standby servers
> 
> Review the prior configuration of the standby servers and set up the
> same configuration in the newly rsync'd directory.
> 
> 1. touch /path/to/replica/standby.signal
> 2. Configure restore_command to pull from WAL archive
> 3. For streaming replicas, configure primary_conninfo

Ok, I have modified the final step like this.  That is better than
talking about log shipping.

Patch V3 attached.

Yours,
Laurenz Albe

Вложения

Re: Improve documentation for pg_upgrade, standbys and rsync

От
Stephen Frost
Дата:
Greetings,

* Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> Thanks for looking at this!

Sure.  Thanks for working on it!

> On Fri, 2021-07-16 at 09:17 -0400, Stephen Frost wrote:
> > > > An additional thing that we should really be mentioning is to tell
> > > > people to go in and TRUNCATE all of their UNLOGGED tables before going
> > > > through this process, otherwise the rsync will end up spending a bunch
> > > > of time copying the files for UNLOGGED relations which you really don't
> > > > want.
> > >
> > > I have thought about that some more, and I am not certain that we should
> > > unconditionally recommend that.  Perhaps the pain of rebuilding the
> > > unlogged table on the primary would be worse than rsyncing it to the
> > > standby.
> >
> > I disagree entirely.  The reason to even consider using this approach is
> > to minimize the time required to get things back online and there's no
> > question that having the unlogged tables get rsync'd across would
> > increase the time required.
>
> I am not totally convinced that minimal down time is always more important
> than keeping your unlogged tables, but I have adapted the patch accordingly.

Having the unlogged tables end up on replicas seems awkward also because
they really shouldn't be there and they'd never end up getting cleaned
up unless the replica crashed or was rebuilt..

> > > The documentation already mentions
> > >
> > >   "Unfortunately, rsync needlessly copies files associated with temporary
> > >    and unlogged tables because these files don't normally exist on standby
> > >    servers."
> > >
> > > I'd say that is good enough, and people can draw their conclusions from
> > > that.
> >
> > I disagree.  Instead, we should have explicit steps included which
> > detail how to find and truncate unlogged tables and what to do to remove
> > or exclude temporary files once the server is shut down.
>
> Ok, done.

Great, thanks, it's not quite this simple, unfortunately, more below..

> > > Recommend using the --relative option of rsync for clarity
> > > and adapt the code samples accordingly.
> > > Using relative paths makes clearer what is meant by "current
> > > directory" and "remote_dir".
> >
> > I'm not really convinced that this is actually a positive change, though
> > I don't know that it's really a negative one either.  In general, I
> > prefer fully qualified paths to try and make things very clear about
> > what's happening, but this is also a bit of an odd case due to hard
> > links, etc.
>
> I normally prefer absolute paths as well.
> But that is the only way I got it to run, and I think that in this
> case it adds clarity to have the data directories relative to your
> current working directory.

I'm pretty curious that you weren't able to get it to run with absolute
paths..

> > > Add a reminder that "standby.signal" needs to be created.
> >
> > This makes sense to include, certainly, but it should be an explicit
> > step, not just a "don't forget" note at the end.  I'm not really sure
> > why we talk about "log shipping" either..?  Wouldn't it make more sense
> > to have something like:
> >
> > g. Configure standby servers
> >
> > Review the prior configuration of the standby servers and set up the
> > same configuration in the newly rsync'd directory.
> >
> > 1. touch /path/to/replica/standby.signal
> > 2. Configure restore_command to pull from WAL archive
> > 3. For streaming replicas, configure primary_conninfo
>
> Ok, I have modified the final step like this.  That is better than
> talking about log shipping.

Yup, glad you agree on that.

> From 43453dc7379f87ca6638c80c9ec6bf528f8e2e28 Mon Sep 17 00:00:00 2001
> From: Laurenz Albe <laurenz.albe@cybertec.at>
> Date: Thu, 22 Jul 2021 15:33:59 +0200
> Subject: [PATCH] Improve doc for pg_upgrade and standby servers
>
> Recommend truncating or removing unlogged and temporary
> tables to speed up "rsync".  Since this is best done in
> the step "Prepare for standby server upgrades", move that
> step to precede "Stop both servers".
>
> Recommend using the --relative option of rsync for clarity
> and adapt the code samples accordingly.
> Using relative paths makes clearer what is meant by "current
> directory" and "remote_dir".
>
> Rewrite the final substep to not mention "log shipping".
> Rather, provide a list of the necessary configuration steps.
> ---
>  doc/src/sgml/ref/pgupgrade.sgml | 96 +++++++++++++++++++++------------
>  1 file changed, 63 insertions(+), 33 deletions(-)
>
> diff --git a/doc/src/sgml/ref/pgupgrade.sgml b/doc/src/sgml/ref/pgupgrade.sgml
> index a83c63cd98..3ccb311ff7 100644
> --- a/doc/src/sgml/ref/pgupgrade.sgml
> +++ b/doc/src/sgml/ref/pgupgrade.sgml
> @@ -324,6 +324,35 @@ make prefix=/usr/local/pgsql.new install
>      </para>
>     </step>
>
> +   <step id="prepare-standby-upgrade">
> +    <title>Prepare for standby server upgrades</title>
> +
> +    <para>
> +     If you are upgrading standby servers using methods outlined in section <xref
> +     linkend="pgupgrade-step-replicas"/>, you should consider dropping temporary
> +     tables and truncating unlogged tables on the primary, since that will speed up
> +     <application>rsync</application> and keep the down time short.
> +     You could run the following <application>psql</application> commands
> +     in all databases:
> +
> +<programlisting>
> +SELECT format('DROP TABLE %s', oid::regclass) FROM pg_class WHERE relpersistence = 't' \gexec
> +SELECT format('TRUNCATE %s', oid::regclass) FROM pg_class WHERE relpersistence = 'u' \gexec
> +</programlisting>

Temporary tables aren't actually visible across different backends, nor
should they exist once the system has been shut down, but sometimes they
do get left around due to a crash, so the above won't actually work and
isn't the way to deal with those.  The same can also happen with
temporary files that we create which end up in pgsql_tmp.

We could possibly exclude pgsql_tmp in the rsync command, but cleaning
up the temporary table files would involve something more complicated
like using 'find' to search for any '^t[0-9]+_[0-9]+.*$' files or
something along those lines.

Though, for that matter we should really be looking through all of the
directories and files that pg_basebackup excludes and considering if
they should somehow be excluded.  There's no easy way to exclude
everything that pg_basebackup would with just an rsync because the logic
is a bit complicated (which is why I was saying we really need a proper
tool...) but we could probably provide a somewhat better rsync command
by going through that list and excluding what makes sense to exclude.
We could also provide another explicit before-rsync step to review all
the temp table files and move them or remove them, depending on how
comfortable one is with hacking around in the data directory.

This, of course, all comes back to the original complaint I had about
documenting this approach, which is that these things should only be
done by someone extremely familiar with the PG codebase, until and
unless we write an actual tool to do this.

> +     (There will be a mismatch if old standby servers were shut down
> +     before the old primary or if the old standby servers are still running.)

Would probably be good to note that if the standby's were shut down
before the primary then this method can *not* be used safely...  The
above leaves it unclear about if the mismatch is an issue or not.  I get
that this was in the original docs, but still would be good to improve
it.

Thanks!

Stephen

Вложения

Re: Improve documentation for pg_upgrade, standbys and rsync

От
Laurenz Albe
Дата:
On Mon, 2021-07-26 at 15:11 -0400, Stephen Frost wrote:
> > > > > An additional thing that we should really be mentioning is to tell
> > > > > people to go in and TRUNCATE all of their UNLOGGED tables before going
> > > > > through this process, otherwise the rsync will end up spending a bunch
> > > > > of time copying the files for UNLOGGED relations which you really don't
> > > > > want.
> > 
> > Ok, done.
> 
> Great, thanks, it's not quite this simple, unfortunately, more below..
>
> > +    <para>
> > +     If you are upgrading standby servers using methods outlined in section <xref
> > +     linkend="pgupgrade-step-replicas"/>, you should consider dropping temporary
> > +     tables and truncating unlogged tables on the primary, since that will speed up
> > +     <application>rsync</application> and keep the down time short.
> > +     You could run the following <application>psql</application> commands
> > +     in all databases:
> > +
> > +<programlisting>
> > +SELECT format('DROP TABLE %s', oid::regclass) FROM pg_class WHERE relpersistence = 't' \gexec
> > +SELECT format('TRUNCATE %s', oid::regclass) FROM pg_class WHERE relpersistence = 'u' \gexec
> > +</programlisting>
> 
> Temporary tables aren't actually visible across different backends, nor
> should they exist once the system has been shut down, but sometimes they
> do get left around due to a crash, so the above won't actually work and
> isn't the way to deal with those.  The same can also happen with
> temporary files that we create which end up in pgsql_tmp.
> 
> We could possibly exclude pgsql_tmp in the rsync command, but cleaning
> up the temporary table files would involve something more complicated
> like using 'find' to search for any '^t[0-9]+_[0-9]+.*$' files or
> something along those lines.
> 
> Though, for that matter we should really be looking through all of the
> directories and files that pg_basebackup excludes and considering if
> they should somehow be excluded.  There's no easy way to exclude
> everything that pg_basebackup would with just an rsync because the logic
> is a bit complicated (which is why I was saying we really need a proper
> tool...) but we could probably provide a somewhat better rsync command
> by going through that list and excluding what makes sense to exclude.
> We could also provide another explicit before-rsync step to review all
> the temp table files and move them or remove them, depending on how
> comfortable one is with hacking around in the data directory.
> 
> This, of course, all comes back to the original complaint I had about
> documenting this approach, which is that these things should only be
> done by someone extremely familiar with the PG codebase, until and
> unless we write an actual tool to do this.

I agree with what you write, but that sounds like you are arguing for
a code patch rather than for documentation to enable the user to do
that manually, which is what I believe you said initially.

My two statements will get rid of temporary tables left behind after
a crash and truncate unlogged tables, which should be an improvement.

Of course it would be good to get rid of orphaned files left behind
after a crash, but, as you say, that is not so easy.

I'd say that writing tools to do better than my two SQL statements
is nice to have, but beyond the scope of this documentation patch.

> > > > Recommend using the --relative option of rsync for clarity
> > > > and adapt the code samples accordingly.
> > > > Using relative paths makes clearer what is meant by "current
> > > > directory" and "remote_dir".
> > 
> > I normally prefer absolute paths as well.
> > But that is the only way I got it to run, and I think that in this
> > case it adds clarity to have the data directories relative to your
> > current working directory.
> 
> I'm pretty curious that you weren't able to get it to run with absolute
> paths..

I tried a couple of times with a test cluster and failed.

Part of the confustion for me is that you are supposed to run the
rsync from a certain directory, which seems weird if paths are absolute.
Run from *any* directory above the old and the new cluster?

"Relative to my current directory" makes more sense to me here.

> > +     (There will be a mismatch if old standby servers were shut down
> > +     before the old primary or if the old standby servers are still running.)
> 
> Would probably be good to note that if the standby's were shut down
> before the primary then this method can *not* be used safely...  The
> above leaves it unclear about if the mismatch is an issue or not.  I get
> that this was in the original docs, but still would be good to improve
> it.

Agreed.

Yours,
Laurenz Albe




Re: Improve documentation for pg_upgrade, standbys and rsync

От
Robert Haas
Дата:
On Mon, Jul 26, 2021 at 3:11 PM Stephen Frost <sfrost@snowman.net> wrote:
> * Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> > Thanks for looking at this!
>
> Sure.  Thanks for working on it!

Stephen, do you intend to do something about this patch in terms of
getting it committed? You're the only reviewer but haven't responded
to the thread for more than 5 months.

I don't feel that I know this area of the documentation well enough to
feel comfortable passing judgement on whether this change is an
improvement or not. However I do feel somewhat uncomfortable with
this:

-   <step>
-    <title>Prepare for standby server upgrades</title>
-
-    <para>
-     If you are upgrading standby servers using methods outlined in
section <xref
-     linkend="pgupgrade-step-replicas"/>, verify that the old standby
-     servers are caught up by running <application>pg_controldata</application>
-     against the old primary and standby clusters.  Verify that the
-     <quote>Latest checkpoint location</quote> values match in all clusters.
-     (There will be a mismatch if old standby servers were shut down
-     before the old primary or if the old standby servers are still running.)
-     Also, make sure <varname>wal_level</varname> is not set to
-     <literal>minimal</literal> in the
<filename>postgresql.conf</filename> file on the
-     new primary cluster.
-    </para>
-   </step>

Right now, we say that you should stop the standby servers and then
prepared for standby server upgrades. With this patch, we say that you
should first prepare for standby server upgrades, and then stop the
standby servers. But the last part of the text about preparing for
standby server upgrades now mentions things to be done after carrying
out the next step where the servers are actually stopped. That seems
confusing. Perhaps we need two separate steps here, one to be
performed before stopping both servers and the other after.

Also, let me express my general terror at the idea of anyone actually
using this procedure.

Regards,

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: Improve documentation for pg_upgrade, standbys and rsync

От
Stephen Frost
Дата:
Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> On Mon, Jul 26, 2021 at 3:11 PM Stephen Frost <sfrost@snowman.net> wrote:
> > * Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> > > Thanks for looking at this!
> >
> > Sure.  Thanks for working on it!
>
> Stephen, do you intend to do something about this patch in terms of
> getting it committed? You're the only reviewer but haven't responded
> to the thread for more than 5 months.

I tried to be clear in the last email on the thread, the one which you
just responded to, here:

* Stephen Frost (sfrost@snowman.net) wrote:
> This, of course, all comes back to the original complaint I had about
> documenting this approach, which is that these things should only be
> done by someone extremely familiar with the PG codebase, until and
> unless we write an actual tool to do this.

To be more explicit though- we should write a tool to do this.  We
shouldn't try to document a way to do it because it's hard to get right.
While rsync is very capable, what's needed to really do this goes beyond
what we could reasonably put into any rsync command or really even into
a documented procedure.  I get that we already have it documented (and
I'll note that doing so was against my recommendation..) and that some
folks (likely those who follow this mailing list) have had success using
it, but I'd really rather we just take it out and put it on a wiki
somewhere as a "we need a tool that does this stuff" and hope that
someone finds time to write one.

> I don't feel that I know this area of the documentation well enough to
> feel comfortable passing judgement on whether this change is an
> improvement or not. However I do feel somewhat uncomfortable with
> this:
>
> -   <step>
> -    <title>Prepare for standby server upgrades</title>
> -
> -    <para>
> -     If you are upgrading standby servers using methods outlined in
> section <xref
> -     linkend="pgupgrade-step-replicas"/>, verify that the old standby
> -     servers are caught up by running <application>pg_controldata</application>
> -     against the old primary and standby clusters.  Verify that the
> -     <quote>Latest checkpoint location</quote> values match in all clusters.
> -     (There will be a mismatch if old standby servers were shut down
> -     before the old primary or if the old standby servers are still running.)
> -     Also, make sure <varname>wal_level</varname> is not set to
> -     <literal>minimal</literal> in the
> <filename>postgresql.conf</filename> file on the
> -     new primary cluster.
> -    </para>
> -   </step>
>
> Right now, we say that you should stop the standby servers and then
> prepared for standby server upgrades. With this patch, we say that you
> should first prepare for standby server upgrades, and then stop the
> standby servers. But the last part of the text about preparing for
> standby server upgrades now mentions things to be done after carrying
> out the next step where the servers are actually stopped. That seems
> confusing. Perhaps we need two separate steps here, one to be
> performed before stopping both servers and the other after.

It should really be both- things to do on the primary ahead of time
(truncate all unlogged tables, make sure there aren't any orphaned
temporary tables, etc), and then things to do on the replicas after
shutting the primary down (basically, make sure they are fully caught up
with where the primary was at shutdown).  I tried to explain that in my
prior email but perhaps didn't do a very good job.

> Also, let me express my general terror at the idea of anyone actually
> using this procedure.

I mean, yeah, I agree.

Thanks,

Stephen

Вложения

Re: Improve documentation for pg_upgrade, standbys and rsync

От
Bruce Momjian
Дата:
On Tue, Apr  5, 2022 at 01:10:38PM -0400, Stephen Frost wrote:
> To be more explicit though- we should write a tool to do this.  We
> shouldn't try to document a way to do it because it's hard to get right.
> While rsync is very capable, what's needed to really do this goes beyond
> what we could reasonably put into any rsync command or really even into
> a documented procedure.  I get that we already have it documented (and
> I'll note that doing so was against my recommendation..) and that some
> folks (likely those who follow this mailing list) have had success using
> it, but I'd really rather we just take it out and put it on a wiki
> somewhere as a "we need a tool that does this stuff" and hope that
> someone finds time to write one.

Well, I think pg_upgrade needs a tool, let alone for standby upgrades,
but 13 years in, no one has written one, so I am not holding my breath. 
Also, we need to document the procedure _somewhere_ --- if we don't the
only procedure is embedded in a tool. and that seems even worse than
what we have now.

> It should really be both- things to do on the primary ahead of time
> (truncate all unlogged tables, make sure there aren't any orphaned
> temporary tables, etc), and then things to do on the replicas after
> shutting the primary down (basically, make sure they are fully caught up
> with where the primary was at shutdown).  I tried to explain that in my
> prior email but perhaps didn't do a very good job.
> 
> > Also, let me express my general terror at the idea of anyone actually
> > using this procedure.
> 
> I mean, yeah, I agree.

I thought that was true for pg_upgrade in general?  ;-)

Seems like a pull up your sleeves and hold your nose --- I am good at
those tasks.  ;-)  Should I work on this?  Tangentially, I see that my
old macros fastgetattr and heap_getattr have finally been retired by
commit e27f4ee0a7.  :-)

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson




Re: Improve documentation for pg_upgrade, standbys and rsync

От
Laurenz Albe
Дата:
On Tue, 2022-04-05 at 12:38 -0400, Robert Haas wrote:
> Also, let me express my general terror at the idea of anyone actually
> using this procedure.

I did, and I couldn't get it to work with absolute paths, and using
relative paths seemed to me to be more intuitive anyway, hence the patch.

Originally that was the only change I wanted to make to the documentation,
but you know how it is: as soon as you touch something like this, someone
will (rightly so) prod you and say "while you change this, that other
thing there should also be improved", and the patch gets more
and more invasive.

I agree with the scariness of this, but I prefer to have it in the
documentation anyway; at least as long as we have nothing better (which
is always the enemy of the good).

Yours,
Laurenz Albe




Re: Improve documentation for pg_upgrade, standbys and rsync

От
Laurenz Albe
Дата:
On Tue, 2022-04-05 at 13:10 -0400, Stephen Frost wrote:
> * Robert Haas (robertmhaas@gmail.com) wrote:
> On Mon, Jul 26, 2021 at 3:11 PM Stephen Frost <sfrost@snowman.net> wrote:
> > * Laurenz Albe (laurenz.albe@cybertec.at) wrote:
> > > > Thanks for looking at this!
> > > 
> > > Sure.  Thanks for working on it!
> > 
> > Stephen, do you intend to do something about this patch in terms of
> > getting it committed? You're the only reviewer but haven't responded
> > to the thread for more than 5 months.
> 
> I tried to be clear in the last email on the thread, the one which you
> just responded to, here:
> 
> * Stephen Frost (sfrost@snowman.net) wrote:
> > This, of course, all comes back to the original complaint I had about
> > documenting this approach, which is that these things should only be
> > done by someone extremely familiar with the PG codebase, until and
> > unless we write an actual tool to do this.
> 
> To be more explicit though- we should write a tool to do this.  We
> shouldn't try to document a way to do it because it's hard to get right.

I see no agreement on this patch.  I'll withdraw it from the commitfest
to avoid hogging resources.  Thanks everyone for the review.

Yours,
Laurenz Albe