Обсуждение: Continue work on changes to recovery.conf API

Поиск
Список
Период
Сортировка

Continue work on changes to recovery.conf API

От
Sergei Kornilov
Дата:
Hello all
I would like to continue work on new recovery api proposed in thread [1]. We have some form of consensus but thread has
beeninactive for a long time and i hope i can help.
 

I start from last published patch [2] and make some changes:
- updated to current HEAD
- made the patch pass make check-world run
- removed recovery.auto.conf lookup and changed pg_basebackup to append recovery parameters to postgresql.auto.conf in
bothplain and tar formats
 
- revert back trigger_file logic, but rename to promote_signal_file. I think here is no reason to make GUC only for
directorypath with hardcoded file name as was discussed in original thread.
 

Any feedback is strongly appreciated. Thank you

A brief summary of proposed changes:
- if recovery.conf present during start we report FATAL error about old style config
- recovery mode start if exists file "recovery.signal" in datadir, all old recovery_target_* options are replaced by
recovery_target_typeand recovery_target_value GUC
 
- start standby mode - by file "standby.signal"
- if present both - standby wins
- pg_basebackup -R will append primary_conninfo and primary_slot_name to postgresql.auto.conf config
- all new GUC are still PGC_POSTMASTER

PS: i will add an entry to 2018-09 CommitFest when it is become available.

regards, Sergei

[1]
https://www.postgresql.org/message-id/flat/CANP8%2BjLO5fmfudbB1b1iw3pTdOK1HBM%3DxMTaRfOa5zpDVcqzew%40mail.gmail.com
[2] https://www.postgresql.org/message-id/CANP8+jJo-LO4xtS7G=iN7PG5o60WeWdKEAn+X+Gnf+RNay5jGQ@mail.gmail.com


Re: Continue work on changes to recovery.conf API

От
Sergei Kornilov
Дата:
I completely forgot attach patch, sorry. Attached now
Вложения

Re: Continue work on changes to recovery.conf API

От
Craig Ringer
Дата:
On 22 June 2018 at 02:48, Sergei Kornilov <sk@zsrv.org> wrote:
I completely forgot attach patch, sorry. Attached now


Sergei,

It's great to see you picking this up.  I'll try to find time to review this for the next CF cycle. Please poke me if you don't hear from me, I do get distracted. It's long past time to get these changes in for good.


--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: Continue work on changes to recovery.conf API

От
Sergei Kornilov
Дата:
Hello

Commitfest 2018-09 is now open and, as planned, i create one entry: https://commitfest.postgresql.org/19/1711/
Also i attach new version due merge conflict with HEAD.

regards, Sergei
Вложения

Re: Continue work on changes to recovery.conf API

От
Peter Eisentraut
Дата:
On 01/07/2018 13:45, Sergei Kornilov wrote:
> Commitfest 2018-09 is now open and, as planned, i create one entry: https://commitfest.postgresql.org/19/1711/
> Also i attach new version due merge conflict with HEAD.

Could you please describe in detail what this current patch does?

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Continue work on changes to recovery.conf API

От
Sergei Kornilov
Дата:
Hello

Current patch moves recovery.conf settings into GUC system:
- if startup process found recovery.conf - it throw error
- recovery mode is turned on if new special file recovery.signal found
- standby_mode setting was removed. Standby mode can be enabled if startup found new special file standby.signal
- if present both standby.signal and recovery.signal - we use standby mode
(this design from previous thread)
- recovery parameters recovery_target_inclusive, recovery_target_timeline, recovery_target_action are new GUC without
logicchanges
 
- recovery_target (immediate), recovery_target_name, recovery_target_time, recovery_target_xid, recovery_target_lsn are
replacedto recovery_target_type and recovery_target_value (was discussed and changed in previous thread)
 
- restore_command, archive_cleanup_command, recovery_end_command, primary_conninfo, primary_slot_name and
recovery_min_apply_delayare just new GUC
 
- trigger_file was renamed to promote_signal_file for clarify (my change, in prev thread was removed)
- all new GUC are PGC_POSTMASTER (discussed in prev thread)
- pg_basebackup with -R (--write-recovery-conf) option will create standby.signal file and append primary_conninfo and
primary_slot_name(if was used) to postgresql.auto.conf instead of writing new recovery.conf
 
- appropriate changes in tests and docs

regards, Sergei


Re: Continue work on changes to recovery.conf API

От
Sergei Kornilov
Дата:
Hello

Attached small update due the merge conflict with HEAD docs.

regards, Sergei
Вложения

Re: Continue work on changes to recovery.conf API

От
Peter Eisentraut
Дата:
On 29/08/2018 17:43, Sergei Kornilov wrote:
> Current patch moves recovery.conf settings into GUC system:
> - if startup process found recovery.conf - it throw error

OK

> - recovery mode is turned on if new special file recovery.signal found

OK

> - standby_mode setting was removed. Standby mode can be enabled if startup found new special file standby.signal

OK

> - if present both standby.signal and recovery.signal - we use standby mode
> (this design from previous thread)

Didn't find the discussion on that and don't understand the reasons, but
seems OK and easy to change if we don't like it.

> - recovery parameters recovery_target_inclusive, recovery_target_timeline, recovery_target_action are new GUC without
logicchanges
 

OK

> - recovery_target (immediate), recovery_target_name, recovery_target_time, recovery_target_xid, recovery_target_lsn
arereplaced to recovery_target_type and recovery_target_value (was discussed and changed in previous thread)
 

I think this was the major point of contention.  I reread the old
thread, and it's still not clear why we need to change this.  _type and
_value look like an EAV system to me.  GUC variables should be
verifiable independent of another variable.  The other idea of using
only one variable seems better, but using two variables seems like a
poor compromise between one and five.

I propose to move this patch forward, keep the settings as they are.  It
can be another patch to rename or reshuffle them.

> - restore_command, archive_cleanup_command, recovery_end_command, primary_conninfo, primary_slot_name and
recovery_min_apply_delayare just new GUC
 

OK

> - trigger_file was renamed to promote_signal_file for clarify (my change, in prev thread was removed)

OK to add the "promote" prefix, but I'd prefer to keep the "trigger"
name.  There is a lot of discussion and knowledge around that.  Seems
unnecessary to change.

> - all new GUC are PGC_POSTMASTER (discussed in prev thread)

OK

> - pg_basebackup with -R (--write-recovery-conf) option will create standby.signal file and append primary_conninfo
andprimary_slot_name (if was used) to postgresql.auto.conf instead of writing new recovery.conf
 

I wonder if that would cause any problems.  The settings in
postgresql.auto.conf are normally not PGC_POSTMASTER, otherwise you
couldn't use ALTER SYSTEM to put them there.  Maybe it's OK.

> - appropriate changes in tests and docs

looks good

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Continue work on changes to recovery.conf API

От
Tom Lane
Дата:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> I think this was the major point of contention.  I reread the old
> thread, and it's still not clear why we need to change this.  _type and
> _value look like an EAV system to me.  GUC variables should be
> verifiable independent of another variable.

No, they MUST be independently verifiable.  The interactions between
the check_xxx functions in this patch are utterly unsafe.  We've
learned that lesson before.

> I propose to move this patch forward, keep the settings as they are.  It
> can be another patch to rename or reshuffle them.

Please do not commit this in this state.

> I wonder if that would cause any problems.  The settings in
> postgresql.auto.conf are normally not PGC_POSTMASTER, otherwise you
> couldn't use ALTER SYSTEM to put them there.  Maybe it's OK.

Actually, that works fine.  You do have to restart the postmaster
to make them take effect, but it's the same as if you edited the
main config file by hand.

            regards, tom lane


Re: Continue work on changes to recovery.conf API

От
Andres Freund
Дата:
Hi,

On 2018-09-28 16:36:35 -0400, Tom Lane wrote:
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> > I think this was the major point of contention.  I reread the old
> > thread, and it's still not clear why we need to change this.  _type and
> > _value look like an EAV system to me.  GUC variables should be
> > verifiable independent of another variable.
> 
> No, they MUST be independently verifiable.  The interactions between
> the check_xxx functions in this patch are utterly unsafe.  We've
> learned that lesson before.

I'm not sure those concerns apply quite the same way here - we can move
the interdependent verification to the the point where they're used
first rather than relying on guc.c infrastructure. We already have
plenty of checks interdependent that way, without it causing many
problems.  UI wise that's not too bad, if they're things that cannot be
changed arbitrarily at runtime.

Greetings,

Andres Freund


Re: Continue work on changes to recovery.conf API

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2018-09-28 16:36:35 -0400, Tom Lane wrote:
>> No, they MUST be independently verifiable.  The interactions between
>> the check_xxx functions in this patch are utterly unsafe.  We've
>> learned that lesson before.

> I'm not sure those concerns apply quite the same way here - we can move
> the interdependent verification to the the point where they're used
> first rather than relying on guc.c infrastructure.

And, if they're bad, what happens?  Recovery fails?

I don't think it's a great idea to lose out on whatever error checking
the existing GUC infrastructure can provide, just so as to use a GUC
design that's not very nice in the first place.

            regards, tom lane


Re: Continue work on changes to recovery.conf API

От
Sergei Kornilov
Дата:
Hello

thank you for review!

>>  - if present both standby.signal and recovery.signal - we use standby mode
>>  (this design from previous thread)
>
> Didn't find the discussion on that and don't understand the reasons, but
> seems OK and easy to change if we don't like it.
I meant this was implemented in previous proposed patch and i do not changed this. I didn't find the discussion on that
too...

>>  - recovery_target (immediate), recovery_target_name, recovery_target_time, recovery_target_xid, recovery_target_lsn
arereplaced to recovery_target_type and recovery_target_value (was discussed and changed in previous thread)
 
>
> I think this was the major point of contention. I reread the old
> thread, and it's still not clear why we need to change this. _type and
> _value look like an EAV system to me. GUC variables should be
> verifiable independent of another variable. The other idea of using
> only one variable seems better, but using two variables seems like a
> poor compromise between one and five.

> No, they MUST be independently verifiable. The interactions between the check_xxx functions in this patch are utterly
unsafe.

Understood, thank you.
I will implement set of current five recovery_target* settings and would like to leave reorganization for futher
pathes.

Not sure about one thing. All recovery_target settings change one internal recoveryTarget variable. GUC infrastructure
willguarantee behavior if user erroneously set multiple different recovery_target*? I mean check_* and assign_* will be
calledin order and so latest config line wins?
 

>>  - trigger_file was renamed to promote_signal_file for clarify (my change, in prev thread was removed)
>
> OK to add the "promote" prefix, but I'd prefer to keep the "trigger"
> name. There is a lot of discussion and knowledge around that. Seems
> unnecessary to change.
Ok, will change to promote_trigger_file

>>  - pg_basebackup with -R (--write-recovery-conf) option will create standby.signal file and append primary_conninfo
andprimary_slot_name (if was used) to postgresql.auto.conf instead of writing new recovery.conf
 
>
> I wonder if that would cause any problems. The settings in
> postgresql.auto.conf are normally not PGC_POSTMASTER, otherwise you
> couldn't use ALTER SYSTEM to put them there. Maybe it's OK.
Actually we can use ALTER SYSTEM to put PGC_POSTMASTER settings. I tried now "alter system set max_connections to
300;",postgresql.auto.conf was changed and affect max_connections during database start.
 
Of course, we can not reload PGC_POSTMASTER settings, but during start we call regular ParseConfigFile and can override
PGC_POSTMASTER.

regards, Sergei


Re: Continue work on changes to recovery.conf API

От
Sergei Kornilov
Дата:
Hello

I complete new version of this patch. I've attached it.

In this version i remove proposed recovery_target_type/recovery_target_value and implement set of current settings:
recovery_target(immediate), recovery_target_name, recovery_target_time, recovery_target_xid, recovery_target_lsn
 

>>>   - trigger_file was renamed to promote_signal_file for clarify (my change, in prev thread was removed)
>>
>>  OK to add the "promote" prefix, but I'd prefer to keep the "trigger"
>>  name. There is a lot of discussion and knowledge around that. Seems
>>  unnecessary to change.
>
> Ok, will change to promote_trigger_file
Renamed to promote_trigger_file

Possible we need something like separate xlog_guc.c and header for related functions and definition?

regards, Sergei
Вложения

Re: Continue work on changes to recovery.conf API

От
Sergei Kornilov
Дата:
Hi

I attached new version of this patch due merge conflict with pg_promote function.

regards, Sergei
Вложения

Re: Continue work on changes to recovery.conf API

От
Robert Haas
Дата:
On Fri, Sep 28, 2018 at 4:20 PM Peter Eisentraut
<peter.eisentraut@2ndquadrant.com> wrote:
> > - recovery_target (immediate), recovery_target_name, recovery_target_time, recovery_target_xid, recovery_target_lsn
arereplaced to recovery_target_type and recovery_target_value (was discussed and changed in previous thread)
 
>
> I think this was the major point of contention.  I reread the old
> thread, and it's still not clear why we need to change this.  _type and
> _value look like an EAV system to me.  GUC variables should be
> verifiable independent of another variable.  The other idea of using
> only one variable seems better, but using two variables seems like a
> poor compromise between one and five.

+1.  I like one best, I can live with five, and I think two stinks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: Continue work on changes to recovery.conf API

От
Peter Eisentraut
Дата:
On 30/10/2018 14:30, Sergei Kornilov wrote:
> I attached new version of this patch due merge conflict with pg_promote function.

This patch looks pretty good to me functionality-wise.  There are some
code details to work through, listed below.

In this review, I'm skipping over your documentation changes.  It seems
you have found all the places that mention recovery.conf and updated
them adequately.  But I think in some cases we will need to take a few
steps back and reword a paragraph or section altogether.  For example,
there will no longer be a reason for recovery-config.sgml to be a
separate chapter if it's all part of the main GUC system.  We can work
through that later.

Code discussion:

- useless whitespace change in xlog.c

- I think we can drop logRecoveryParameters().  Users can now inspect
those parameters using the normal GUC mechanisms.  (They were all DEBUG2
anyway, so it's not like many users will be missing this.)  Then you can
also drop RecoveryTargetActionText().

- Why introduce MAXRESTOREPOINTNAMELEN?  If you think this is useful,
then we could do it as a separate patch.

- Make the help text change in pg_archivecleanup.c similar to pg_standby.c.

- In pg_basebackup.c, duplicate defines PG_AUTOCONF_FILENAME and
STANDBY_SIGNAL_FILE.  See that you can put those into a header file
somewhere.

- This stuff breaks translation strings:

    printf(_("  -R, --write-recovery-conf\n"
-            "                         write recovery.conf for
replication\n"));
+            "                         append replication config to "
PG_AUTOCONF_FILENAME "\n"
+            "                         and place " STANDBY_SIGNAL_FILE "
file\n"));

Use %s placeholders, or better yet replace it in a more compact way.

- The test function $node_standby->request_standby_mode() could use a
better name.  How about set_ instead of request_ ?

- New string GUCs should have "" instead of NULL as their default in
guc.c.  (Not sure whether that is strictly necessary, but it seems good
to be consistent.)

- Help strings in guc.c should end with a period.

- Renaming the promote and fallback_promote files seems unnecessary for
this patch, and I would take it out.  Otherwise, the change in pg_ctl.c
is out of date with the comment above it.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Continue work on changes to recovery.conf API

От
Sergei Kornilov
Дата:
Hello

Thank you! Here is patch update addressing your comments.

> - useless whitespace change in xlog.c
Oops, did not notice. Fixed.

> - I think we can drop logRecoveryParameters(). Users can now inspect
> those parameters using the normal GUC mechanisms. (They were all DEBUG2
> anyway, so it's not like many users will be missing this.) Then you can
> also drop RecoveryTargetActionText().
Agreed, done.

> - Why introduce MAXRESTOREPOINTNAMELEN? If you think this is useful,
> then we could do it as a separate patch.
Reverted back. This was changed in prev proposal.

> - Make the help text change in pg_archivecleanup.c similar to pg_standby.c.
Changed.

> - In pg_basebackup.c, duplicate defines PG_AUTOCONF_FILENAME and
> STANDBY_SIGNAL_FILE. See that you can put those into a header file
> somewhere.
I move constants from xlog.h to xlog_internal.h. Also i revert back RECOVERY_COMMAND_FILE and RECOVERY_COMMAND_DONE
intoxlog.c (was moved to xlog.h few weeks ago)
 
But i have no good idea for PG_AUTOCONF_FILENAME. Seems most src/bin/ application uses hardcoded file path. How about
miscadmin.h?

> - This stuff breaks translation strings:
>
>     printf(_(" -R, --write-recovery-conf\n"
> - " write recovery.conf for
> replication\n"));
> + " append replication config to "
> PG_AUTOCONF_FILENAME "\n"
> + " and place " STANDBY_SIGNAL_FILE "
> file\n"));
>
> Use %s placeholders, or better yet replace it in a more compact way.
Maybe leave just "write configuration for replication" with explanation in docs, as was before?

> - The test function $node_standby->request_standby_mode() could use a
> better name. How about set_ instead of request_ ?
Sounds good, changed.

> - New string GUCs should have "" instead of NULL as their default in
> guc.c. (Not sure whether that is strictly necessary, but it seems good
> to be consistent.)
> - Help strings in guc.c should end with a period.
Fixed

> - Renaming the promote and fallback_promote files seems unnecessary for
> this patch, and I would take it out. Otherwise, the change in pg_ctl.c
> is out of date with the comment above it.
Agreed, revert back.

regards, Sergei
Вложения

Re: Continue work on changes to recovery.conf API

От
Peter Eisentraut
Дата:
On 13/11/2018 15:57, Sergei Kornilov wrote:
> Thank you! Here is patch update addressing your comments.

I went over the patch and did a bunch of small refinements.  I have also
updated the documentation more extensively.  The attached patch is
committable to me.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Вложения

Re: Continue work on changes to recovery.conf API

От
Michael Paquier
Дата:
On Tue, Nov 20, 2018 at 12:07:12PM +0100, Peter Eisentraut wrote:
> I went over the patch and did a bunch of small refinements.  I have also
> updated the documentation more extensively.  The attached patch is
> committable to me.

Thanks for putting energy into that.

> Recovery is now initiated by a file recovery.signal.  Standby mode is
> initiated by a file standby.signal.  The standby_mode setting is
> gone.  If a recovery.conf file is found, an error is issued.

I am having second thoughts about this part of the patch actually.
What's bad in keeping standby_mode and just rely on recovery.signal to
enforce recovery to happen?  When the startup process starts all the
parameters should be loaded.  That would also need less work from users
to switch to the new APIs.  I think that there could be a point to
*merge* hot_standby and standby_mode actually into an enum, so keeping
standby_mode would help with that (not this patch work of course).  The
barrier between recovery.trigger standby.trigger is also rather thin.

> The trigger_file setting has been renamed to promote_trigger_file as
> part of the move.

This rename looks fine.

> pg_basebackup -R now appends settings to postgresql.auto.conf and
> creates a standby.signal file.
>
> Author: Simon Riggs <simon@2ndquadrant.com>
> Author: Abhijit Menon-Sen <ams@2ndquadrant.com>
> Author: Sergei Kornilov <sk@zsrv.org>

I think that Fujii Masao should also be listed as an author, or at least
get a mention.  He was one of the first to work on this stuff.

Using separate GUCs for each target is fine by me.  I would have thought
that 003_recovery_targets.pl needed some tweaks, so I am happy to see
that it is not the case.

So overall this stuff looks fine per its shape, just the part for
standby.signal may not justify breaking compatibility more than
necessary...  The first mention of this part comes from
https://postgr.es/m/CANP8+jLoYSDjB5ip7wynVcckoE4OynHabUnB8pQMgBJgFKQpiw@mail.gmail.com
as far as I know.
--
Michael

Вложения

Re: Continue work on changes to recovery.conf API

От
Peter Eisentraut
Дата:
On 21/11/2018 07:00, Michael Paquier wrote:
> What's bad in keeping standby_mode and just rely on recovery.signal to
> enforce recovery to happen?  When the startup process starts all the
> parameters should be loaded.  That would also need less work from users
> to switch to the new APIs.  I think that there could be a point to
> *merge* hot_standby and standby_mode actually into an enum, so keeping
> standby_mode would help with that (not this patch work of course).  The
> barrier between recovery.trigger standby.trigger is also rather thin. 

This wasn't my idea, so this is just my interpretation.  The scenario
I'm wondering about is:  You have a standby.  So (under your system) you
set standby_mode=on and create recovery.trigger.  Then you promote that
standby, so recovery.trigger is removed, but standby_mode=on stays.
Much time passes.  At some point you want to do a PITR on that instance.
 So you create a recovery.trigger, set some recovery parameters.  But
you didn't notice that standby_mode=on is still set from way back when
-- and you create a mess.

One way to think about it is: Being a standby is a state, not a
configuration setting.

Btw., I'm not in love with the *.signal naming.  I originally argued
against naming them *.trigger, but I don't like the alternative either.
But that's easy to change if someone has a better idea.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Continue work on changes to recovery.conf API

От
Michael Paquier
Дата:
On Wed, Nov 21, 2018 at 12:58:19PM +0100, Peter Eisentraut wrote:
> This wasn't my idea, so this is just my interpretation.  The scenario
> I'm wondering about is:  You have a standby.  So (under your system) you
> set standby_mode=on and create recovery.trigger.  Then you promote that
> standby, so recovery.trigger is removed, but standby_mode=on stays.
> Much time passes.  At some point you want to do a PITR on that instance.
>  So you create a recovery.trigger, set some recovery parameters.  But
> you didn't notice that standby_mode=on is still set from way back when
> -- and you create a mess.
>
> One way to think about it is: Being a standby is a state, not a
> configuration setting.

Very good point, I have not thought of this problem from this
perspective.  Indeed that can be confusing.  Now things get reset
once recovery.conf is renamed.

> Btw., I'm not in love with the *.signal naming.  I originally argued
> against naming them *.trigger, but I don't like the alternative either.
> But that's easy to change if someone has a better idea.

Using "recovery" or "signal" would be more consistent with the existing
"promote" but that does not feel good either.  "trigger" does not sound
that bad...

One extra thing I was wondering is if if would make sense to rename the
signal (or trigger files) to .done once recovery is over.  That's
useful for debugging.  That could always be refined later..
--
Michael

Вложения

Re: Continue work on changes to recovery.conf API

От
Peter Eisentraut
Дата:
On 21/11/2018 07:00, Michael Paquier wrote:
>> Author: Simon Riggs <simon@2ndquadrant.com>
>> Author: Abhijit Menon-Sen <ams@2ndquadrant.com>
>> Author: Sergei Kornilov <sk@zsrv.org>
> I think that Fujii Masao should also be listed as an author, or at least
> get a mention.  He was one of the first to work on this stuff.

Committed with that addition.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Continue work on changes to recovery.conf API

От
Stephen Frost
Дата:
Greetings,

* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> Committed with that addition.

I'm concerned that this hasn't been entirely thought through regarding
pretty common use-cases, consider a pretty typical pg_basebackup -R
usage case:

- User performs a backup with pg_basebackup -R
- Replica is then promoted to be a primary
- User performs a backup with pg_basebackup -R on the new primary
- Duplicate entries end up in postgresql.auto.conf.  Ew.

Now thinking about it from the perspective of... more complete backup
solutions, which give the user the option to fill out more of what used
to go into recovery.conf, it seems like there's going to have to be a
lot more hacking around with postgresql.auto.conf, such as adding in
recovery target time and other parameters, which I'm not terribly
thrilled with.

For example, when a backup is done, typically postgresql.auto.conf comes
along with it and gets restored with it, but if that's done now without
mucking with the contents, we'll run into the same issue that
pg_basebackup has as outlined above- recovery target time and other
parameters being included in the postgresql.auto.conf, possibly multiple
times if care isn't taken to avoid that.  While it used to be that you
could end up with recovery.done files ending up in backups, they were
just a nuisance and didn't impact anything.

One possible approach for dealing with at least some of these concerns
would be to remove the recovery settings from postgresql.auto.conf once
recovery is done, which is similar to how we moved recovery.conf to
recovery.done, at least.  We could even create a recovery.done file if
we wanted to.

Unfortunately, this also means that users are going to have their own
issues when just using pg_basebackup and trying to perform PITR since
they'll be modifying postgresql.conf and will have to remember to remove
those settings since we aren't going to modify that.  Maybe we should
have a warning or notice or something saying "hey, now that this has
been promoted, maybe you should check your config file and clear out
the recovery settings?".

Thinking about how this would work with something like pgbackrest
dropping settings into postgresql.auto.conf, users would need to be
educated to use ALTER SYSTEM if they want to move the recovery target
time to be later while doing PITR (to avoid having them hacking around
in postgresql.auto.conf), which is perhaps not terrible.  What would
make that experience nicer would be a way to restart PG remotely after
making that change (well, or just making it so that users could tell PG
to continue to play forward, but as I understand it this patch doesn't
give us that).

These are my thoughts on this, at least at the moment.  I've also asked
David Steele to comment, of course, since this will impact pgbackrest.

In the end, I'm not entirely convinced that eliminating recovery.conf is
actually an improvement; I think I would have rather seen the original
approach of having an 'auto' recovery.conf, but perhaps we can improve
on this since others seemed anxious to get rid of recovery.conf (though
I'm not sure why- seems like we'll likely end up with more code to
handle things cleanly with a merged recovery.auto/postgresql.auto than
if we had kept them independent).

Thanks!

Stephen

Вложения

Re: Continue work on changes to recovery.conf API

От
Tom Lane
Дата:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
> Committed with that addition.

Some of the buildfarm doesn't like this.  I realized that the common
denominator was EXEC_BACKEND: if you build with -DEXEC_BACKEND the
failure is completely reproducible.  Presumably something is thinking
that it will inherit some variable from the postmaster via fork.
I don't know if that's a new bug or was just exposed by a new test.

            regards, tom lane


Re: Continue work on changes to recovery.conf API

От
Andres Freund
Дата:
Hi,

On 2018-11-25 13:24:15 -0500, Stephen Frost wrote:
> - User performs a backup with pg_basebackup -R
> - Replica is then promoted to be a primary
> - User performs a backup with pg_basebackup -R on the new primary
> - Duplicate entries end up in postgresql.auto.conf.  Ew.

Why don't we put recovery entries into postgresql.recovery.conf or such?
And overwrite rather than append?


> In the end, I'm not entirely convinced that eliminating recovery.conf is
> actually an improvement; I think I would have rather seen the original
> approach of having an 'auto' recovery.conf, but perhaps we can improve
> on this since others seemed anxious to get rid of recovery.conf (though
> I'm not sure why- seems like we'll likely end up with more code to
> handle things cleanly with a merged recovery.auto/postgresql.auto than
> if we had kept them independent).

If we ever want to have more dynamic reconfiguration around recovery
(say changing the primary and other settings at runtime, and a lot of
more advanced features), we're going to need infrastructure to deal with
that. Without the merge we'd have to duplicate the guc logic.

Greetings,

Andres Freund


Re: Continue work on changes to recovery.conf API

От
Andres Freund
Дата:
On 2018-11-25 16:56:13 +0100, Peter Eisentraut wrote:
> On 21/11/2018 07:00, Michael Paquier wrote:
> >> Author: Simon Riggs <simon@2ndquadrant.com>
> >> Author: Abhijit Menon-Sen <ams@2ndquadrant.com>
> >> Author: Sergei Kornilov <sk@zsrv.org>
> > I think that Fujii Masao should also be listed as an author, or at least
> > get a mention.  He was one of the first to work on this stuff.
> 
> Committed with that addition.

Wee!

Greetings,

Andres Freund


Re: Continue work on changes to recovery.conf API

От
Stephen Frost
Дата:
Greetings,

On Sun, Nov 25, 2018 at 15:39 Andres Freund <andres@anarazel.de> wrote:
Hi,

On 2018-11-25 13:24:15 -0500, Stephen Frost wrote:
> - User performs a backup with pg_basebackup -R
> - Replica is then promoted to be a primary
> - User performs a backup with pg_basebackup -R on the new primary
> - Duplicate entries end up in postgresql.auto.conf.  Ew.

Why don't we put recovery entries into postgresql.recovery.conf or such?
And overwrite rather than append?

Sure, I think that’s more or less the same thing..?   Another included file, but one specifically for recovery bits.

> In the end, I'm not entirely convinced that eliminating recovery.conf is
> actually an improvement; I think I would have rather seen the original
> approach of having an 'auto' recovery.conf, but perhaps we can improve
> on this since others seemed anxious to get rid of recovery.conf (though
> I'm not sure why- seems like we'll likely end up with more code to
> handle things cleanly with a merged recovery.auto/postgresql.auto than
> if we had kept them independent).

If we ever want to have more dynamic reconfiguration around recovery
(say changing the primary and other settings at runtime, and a lot of
more advanced features), we're going to need infrastructure to deal with
that. Without the merge we'd have to duplicate the guc logic.

The recovery auto conf before was also just included into the config, avoiding the need to have specialized handling.  I do see the advantage of having it utilize the regular GUC system, but I don’t like losing the separation we had which allowed us to just move the whole recovery.conf to recovery.done when we finish recovery.  Seems like we could have both though if we have the recovery options in a separately included file instead of in PostgreSQL.auto.conf. 

Thanks!

Stephen

Re: Continue work on changes to recovery.conf API

От
Sergei Kornilov
Дата:
Hi

> Why don't we put recovery entries into postgresql.recovery.conf or such?
> And overwrite rather than append?
Appending to postgressql.auto.conf instead of writing new hardcoded file was proposed here:
https://www.postgresql.org/message-id/CAB7nPqQ-Grduc-0WSsK4-VXo%2B1CB%3DyDS9NMU0NEp%2Bc%2Bffnz0UQ%40mail.gmail.com

regards, Sergei


Re: Continue work on changes to recovery.conf API

От
Petr Jelinek
Дата:
Hi,

On 25/11/2018 21:48, Stephen Frost wrote:
> Greetings,
> 
> On Sun, Nov 25, 2018 at 15:39 Andres Freund <andres@anarazel.de
> <mailto:andres@anarazel.de>> wrote:
> 
>     Hi,
> 
>     On 2018-11-25 13:24:15 -0500, Stephen Frost wrote:
>     > - User performs a backup with pg_basebackup -R
>     > - Replica is then promoted to be a primary
>     > - User performs a backup with pg_basebackup -R on the new primary
>     > - Duplicate entries end up in postgresql.auto.conf.  Ew.
> 
>     Why don't we put recovery entries into postgresql.recovery.conf or such?
>     And overwrite rather than append?
> 
> 
> Sure, I think that’s more or less the same thing..?   Another included
> file, but one specifically for recovery bits.

I think the important part there is overwrite rather than append. Given
that postgresql.auto.conf is used by ALTER SYSTEM, doing overwrites
there is not as straightforward proposition (behaviorally) as doing it
in another included file.

-- 
  Petr Jelinek                  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


Re: Continue work on changes to recovery.conf API

От
Stephen Frost
Дата:
Greetings,

* Sergei Kornilov (sk@zsrv.org) wrote:
> > Why don't we put recovery entries into postgresql.recovery.conf or such?
> > And overwrite rather than append?
> Appending to postgressql.auto.conf instead of writing new hardcoded file was proposed here:
https://www.postgresql.org/message-id/CAB7nPqQ-Grduc-0WSsK4-VXo%2B1CB%3DyDS9NMU0NEp%2Bc%2Bffnz0UQ%40mail.gmail.com

Thanks, yes, I saw that but I don't think it really contemplated the
issues that arise from that approach, as I outlined.

Thanks again!

Stephen

Вложения

Re: Continue work on changes to recovery.conf API

От
Michael Paquier
Дата:
On Sun, Nov 25, 2018 at 04:56:13PM +0100, Peter Eisentraut wrote:
> Committed with that addition.

Thanks for adding that!
--
Michael

Вложения

Re: Continue work on changes to recovery.conf API

От
Peter Eisentraut
Дата:
On 25/11/2018 21:39, Andres Freund wrote:
> On 2018-11-25 13:24:15 -0500, Stephen Frost wrote:
>> - User performs a backup with pg_basebackup -R
>> - Replica is then promoted to be a primary
>> - User performs a backup with pg_basebackup -R on the new primary
>> - Duplicate entries end up in postgresql.auto.conf.  Ew.
> Why don't we put recovery entries into postgresql.recovery.conf or such?
> And overwrite rather than append?

Adding more such sub-configuration files would be easy to do and has
some merit, but the devil is in the details.  In what order would those
files be read?  Who is supposed to write to it, is it reserved for
pg_basebackup use only?  If you choose to use this particular name, is
it not used when not in recovery?  Does this file belong in the data
directory or the configuration directory?  Which choice will offend
packagers the least?  Etc.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Continue work on changes to recovery.conf API

От
David Steele
Дата:
On 11/27/18 3:59 AM, Peter Eisentraut wrote:
> On 25/11/2018 21:39, Andres Freund wrote:
>> On 2018-11-25 13:24:15 -0500, Stephen Frost wrote:
>>> - User performs a backup with pg_basebackup -R
>>> - Replica is then promoted to be a primary
>>> - User performs a backup with pg_basebackup -R on the new primary
>>> - Duplicate entries end up in postgresql.auto.conf.  Ew.
>> Why don't we put recovery entries into postgresql.recovery.conf or such?
>> And overwrite rather than append?
> 
> Adding more such sub-configuration files would be easy to do and has
> some merit, but the devil is in the details.  In what order would those
> files be read?  Who is supposed to write to it, is it reserved for
> pg_basebackup use only?  If you choose to use this particular name, is
> it not used when not in recovery?  Does this file belong in the data
> directory or the configuration directory?  Which choice will offend
> packagers the least?  Etc.

I would prefer a specific file that will be auto-included into
postgresql.conf when present but be ignored when not present.  Some
settings are generally ephemeral (recovery_target_time) and it would be
nice for them to go away.  When recovery is complete the file would be
renamed to .done just as recovery.conf is now.

I had been thinking about keeping the recovery.conf file name but on
reflection is seems best to make a clean break.  I like Andres'
suggestion of postgresql.recovery.conf.

I would not be in favor of this file being for pg_basebackup only.  If
it has documented and consistent behavior then any restore solution
should be able to use it.

To be clear, the aim is to give tools a place to write recovery GUCs
that are considered temporary and/or replaceable.  Users are still free
to write permanent recovery GUCs into whatever file they would like,
e.g. restore_command.

Regards,
-- 
-David
david@pgmasters.net


Re: Continue work on changes to recovery.conf API

От
Peter Eisentraut
Дата:
On 27/11/2018 13:21, David Steele wrote:
> I would prefer a specific file that will be auto-included into
> postgresql.conf when present but be ignored when not present.  Some
> settings are generally ephemeral (recovery_target_time) and it would be
> nice for them to go away.  When recovery is complete the file would be
> renamed to .done just as recovery.conf is now.

That might be a useful facility, but it wouldn't really address the
pg_basebackup -R issue, because that creates settings that you don't
want going away in this manner.  You'd then need two separate such
files, one for targeted recovery that goes away when recovery ends, and
one that is automatically included that pg_basebackup can overwrite at will.

-- 
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: Continue work on changes to recovery.conf API

От
Stephen Frost
Дата:
Greetings,

* Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
> On 27/11/2018 13:21, David Steele wrote:
> > I would prefer a specific file that will be auto-included into
> > postgresql.conf when present but be ignored when not present.  Some
> > settings are generally ephemeral (recovery_target_time) and it would be
> > nice for them to go away.  When recovery is complete the file would be
> > renamed to .done just as recovery.conf is now.
>
> That might be a useful facility, but it wouldn't really address the
> pg_basebackup -R issue, because that creates settings that you don't
> want going away in this manner.  You'd then need two separate such
> files, one for targeted recovery that goes away when recovery ends, and
> one that is automatically included that pg_basebackup can overwrite at will.

I've been thinking about this also and I agree that there's some
challenges when it comes to having another file- what happens if someone
does an ALTER SYSTEM on primary_conninfo while it's in the
'recovery.auto.conf' (or whatever) file?  Does that go into
postgresql.auto.conf, or somewhere else?  What if they do a RESET?

Then there's the other side of things- do we really want things like
recovery_target_time being kept around in postgresql.auto.conf after a
promotoion?  Do we want to keep appending primary_conninfo into
postgresql.auto.conf?  I haven't looked but I'm also concerned that
something like ALTER SYSTEM RESET isn't really prepared to find
duplicates in postgresql.auto.conf...

Maybe thinking through what we want to have happen in each scenario
would be good.  If you perform a pg_basebackup -R and there's already
something set in postgresql.auto.conf for primary conninfo- what should
happen?  If we reach the end of recovery and promote while
postgresql.auto.conf has recovery_target_time set, what should happen?
If third-party tools want to do what pg_basebackup -R does and modify
things in postgresql.auto.conf, how should they do that?

Here's my thoughts on answers to the above:

- pg_basebackup -R should INSERT .. ON CONFLICT UPDATE the settings that
  it wants to set in postgresql.auto.conf

- When we reach the end of recovery and promote, we should DELETE from
  the postgresql.auto.conf the recovery target settings.

- External tools should either be told that they can modify
  postgresql.auto.conf and given guideance on how to do so, or we should
  provide a tool which allows them to do so (or maybe both).

As we already have a section for recovery target settings that clearly
has them as independent, hopefully this will make sense to users.  Open
to other ideas too, of course, but I don't think we can continue to just
append things to the end of postgresql.auto.conf when pg_basebackup is
run with -R.

Thanks!

Stephen

Вложения

Re: Continue work on changes to recovery.conf API

От
Andres Freund
Дата:
Hi,

On 2018-11-27 15:36:59 +0100, Peter Eisentraut wrote:
> That might be a useful facility, but it wouldn't really address the
> pg_basebackup -R issue, because that creates settings that you don't
> want going away in this manner.

Why is that / which are those?  It's not like it worked like that <
2dedf4d9.

Greetings,

Andres Freund


Re: Continue work on changes to recovery.conf API

От
Stephen Frost
Дата:
Greetings,

* Andres Freund (andres@anarazel.de) wrote:
> On 2018-11-27 15:36:59 +0100, Peter Eisentraut wrote:
> > That might be a useful facility, but it wouldn't really address the
> > pg_basebackup -R issue, because that creates settings that you don't
> > want going away in this manner.
>
> Why is that / which are those?  It's not like it worked like that <
> 2dedf4d9.

primary_conninfo and restore_command are the main ones, and the idea is
that you'd like those to be specified even when you aren't in recovery,
so that you can have a symmetric config for when a given system does
become a replica.

Thanks!

Stephen

Вложения

Re: Re: Continue work on changes to recovery.conf API

От
David Steele
Дата:
On 11/27/18 4:36 PM, Peter Eisentraut wrote:
> On 27/11/2018 13:21, David Steele wrote:
>> I would prefer a specific file that will be auto-included into
>> postgresql.conf when present but be ignored when not present.  Some
>> settings are generally ephemeral (recovery_target_time) and it would be
>> nice for them to go away.  When recovery is complete the file would be
>> renamed to .done just as recovery.conf is now.
> 
> That might be a useful facility, but it wouldn't really address the
> pg_basebackup -R issue, because that creates settings that you don't
> want going away in this manner.  You'd then need two separate such
> files, one for targeted recovery that goes away when recovery ends, and
> one that is automatically included that pg_basebackup can overwrite at will.

I'm not sure why that's true.  Some settings you might prefer to be 
persistent and others not.  If pg_basebackup -R is treating them the 
same then that seems like a limitation.

Endlessly appending settings into postgresql.auto.conf seems confusing 
even if it makes sense to postgres (i.e. last setting wins).

Regards,
-- 
-David
david@pgmasters.net


Re: Re: Continue work on changes to recovery.conf API

От
David Steele
Дата:
Hi Peter,

On 11/27/18 5:34 PM, Stephen Frost wrote:
> Greetings,
> 
> * Peter Eisentraut (peter.eisentraut@2ndquadrant.com) wrote:
>> On 27/11/2018 13:21, David Steele wrote:
>>> I would prefer a specific file that will be auto-included into
>>> postgresql.conf when present but be ignored when not present.  Some
>>> settings are generally ephemeral (recovery_target_time) and it would be
>>> nice for them to go away.  When recovery is complete the file would be
>>> renamed to .done just as recovery.conf is now.
>>
>> That might be a useful facility, but it wouldn't really address the
>> pg_basebackup -R issue, because that creates settings that you don't
>> want going away in this manner.  You'd then need two separate such
>> files, one for targeted recovery that goes away when recovery ends, and
>> one that is automatically included that pg_basebackup can overwrite at will.
> 
> I've been thinking about this also and I agree that there's some
> challenges when it comes to having another file- what happens if someone
> does an ALTER SYSTEM on primary_conninfo while it's in the
> 'recovery.auto.conf' (or whatever) file?  Does that go into
> postgresql.auto.conf, or somewhere else?  What if they do a RESET?
> 
> Then there's the other side of things- do we really want things like
> recovery_target_time being kept around in postgresql.auto.conf after a
> promotoion?  Do we want to keep appending primary_conninfo into
> postgresql.auto.conf?  I haven't looked but I'm also concerned that
> something like ALTER SYSTEM RESET isn't really prepared to find
> duplicates in postgresql.auto.conf...
> 
> Maybe thinking through what we want to have happen in each scenario
> would be good.  If you perform a pg_basebackup -R and there's already
> something set in postgresql.auto.conf for primary conninfo- what should
> happen?  If we reach the end of recovery and promote while
> postgresql.auto.conf has recovery_target_time set, what should happen?
> If third-party tools want to do what pg_basebackup -R does and modify
> things in postgresql.auto.conf, how should they do that?
> 
> Here's my thoughts on answers to the above:
> 
> - pg_basebackup -R should INSERT .. ON CONFLICT UPDATE the settings that
>    it wants to set in postgresql.auto.conf
> 
> - When we reach the end of recovery and promote, we should DELETE from
>    the postgresql.auto.conf the recovery target settings.
> 
> - External tools should either be told that they can modify
>    postgresql.auto.conf and given guideance on how to do so, or we should
>    provide a tool which allows them to do so (or maybe both).
> 
> As we already have a section for recovery target settings that clearly
> has them as independent, hopefully this will make sense to users.  Open
> to other ideas too, of course, but I don't think we can continue to just
> append things to the end of postgresql.auto.conf when pg_basebackup is
> run with -R.

I'd be interested to get your take to these questions.

Just about every third-party backup and HA tool out there writes 
recovery.conf files and automates restores.  This is a huge change.

I personally would prefer to have something like 
postgresql.recovery.conf file that is automatically included if it is 
present.  This simplifies the issue of how to maintain recovery settings 
in postgresql.auto.conf.  The file could be renamed to 
postgresql.recovery.conf.done similar to how recovery.conf is now.

Of course, some settings like primary_conninfo could just stay in 
postgresql.conf or postgresql.auto.conf since they are far less subject 
to change.  Or they could be in postgresql.recovery.conf for HA 
environments.

I don't see why this won't work with pg_basebackup -R -- it just may be 
the case that some settings get overridden.  In HA scenarios it's hard 
to see how pg_basebackup would have the correct info for something like 
primary_conninfo anyway.

I like the flexibility of recovery options being available as GUCs but 
I'm not sure the ramifications of these changes have been completely 
thought through.

Regards,
-- 
-David
david@pgmasters.net