Обсуждение: recovery_target_action=pause with confusing hint

Поиск
Список
Период
Сортировка

recovery_target_action=pause with confusing hint

От
Sergei Kornilov
Дата:
Hello

Currently during point-in-time recovery with recovery_target_action = 'pause' we print log lines:

> LOG: recovery has paused
> HINT: Execute pg_wal_replay_resume() to continue.

My colleague told me that this is a terrible moment: to continue what exactly? It sounds like "to continue replay",
similarto normal pg_wal_replay_pause/pg_wal_replay_resume behavior. We have just small note in documentation:
 

> The paused state can be resumed by using pg_wal_replay_resume() (see Table 9.81), which then causes recovery to end.

But I think this is important place and can be improved.

Also the database does not respond to the promote signals at this stage. Attached patch 0001 with the test will fail.

0002 patch contains my proposed ideas:
- introduce separate message for pause due pg_wal_replay_pause call and for recovery_target_action.
- check for standby triggers only for recovery_target_action - I am not sure this would be safe for
pg_wal_replay_pause()call case
 

Maybe more verbose hint would be appropriate:

> Execute pg_promote() to end recovery or shut down the server, change the recovery target settings to a later target
andrestart to continue recovery
 

Thoughts?

regards, Sergei
Вложения

Re: recovery_target_action=pause with confusing hint

От
Fujii Masao
Дата:

On 2020/01/30 20:00, Sergei Kornilov wrote:
> Hello
> 
> Currently during point-in-time recovery with recovery_target_action = 'pause' we print log lines:
> 
>> LOG: recovery has paused
>> HINT: Execute pg_wal_replay_resume() to continue.
> 
> My colleague told me that this is a terrible moment: to continue what exactly? It sounds like "to continue replay",
similarto normal pg_wal_replay_pause/pg_wal_replay_resume behavior. We have just small note in documentation:
 
> 
>> The paused state can be resumed by using pg_wal_replay_resume() (see Table 9.81), which then causes recovery to
end.
> 
> But I think this is important place and can be improved.
> 
> Also the database does not respond to the promote signals at this stage. Attached patch 0001 with the test will
fail.
> 
> 0002 patch contains my proposed ideas:
> - introduce separate message for pause due pg_wal_replay_pause call and for recovery_target_action.

+1

> - check for standby triggers only for recovery_target_action - I am not sure this would be safe for
pg_wal_replay_pause()call case
 

Agreed. Basically I think that recoveryPausesHere() should the promotion
trigger whether recovery target is reached or not. But one question is;
how should the recovery behave if recovery target is reached with
recovery_target_action=pause after the promotion is requested?
It should pause? Or promote?

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters



Re: recovery_target_action=pause with confusing hint

От
David Steele
Дата:
Hi Sergei,

On 1/30/20 12:00 PM, Fujii Masao wrote:
> 
>> - check for standby triggers only for recovery_target_action - I am 
>> not sure this would be safe for pg_wal_replay_pause() call case
> 
> Agreed. Basically I think that recoveryPausesHere() should the promotion
> trigger whether recovery target is reached or not. But one question is;
> how should the recovery behave if recovery target is reached with
> recovery_target_action=pause after the promotion is requested?
> It should pause? Or promote?

Do you have thoughts on Fujii's comments?

Regards,
-- 
-David
david@pgmasters.net



Re: recovery_target_action=pause with confusing hint

От
Fujii Masao
Дата:

On 2020/03/09 21:30, David Steele wrote:
> Hi Sergei,
> 
> On 1/30/20 12:00 PM, Fujii Masao wrote:
>>
>>> - check for standby triggers only for recovery_target_action - I am not sure this would be safe for
pg_wal_replay_pause()call case
 
>>
>> Agreed. Basically I think that recoveryPausesHere() should the promotion
>> trigger whether recovery target is reached or not. But one question is;
>> how should the recovery behave if recovery target is reached with
>> recovery_target_action=pause after the promotion is requested?
>> It should pause? Or promote?
> 
> Do you have thoughts on Fujii's comments?

Thanks for the ping! And sorry for not reporting the current status.

I started the discussion about the topic very related to this.
I'm thinking to apply the change that Sergei proposes after applying
the patch I attached in this thread.
https://postgr.es/m/00c194b2-dbbb-2e8a-5b39-13f14048ef0a@oss.nttdata.com

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters



Re: recovery_target_action=pause with confusing hint

От
Robert Haas
Дата:
On Mon, Mar 9, 2020 at 12:03 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> I started the discussion about the topic very related to this.
> I'm thinking to apply the change that Sergei proposes after applying
> the patch I attached in this thread.
> https://postgr.es/m/00c194b2-dbbb-2e8a-5b39-13f14048ef0a@oss.nttdata.com

I think it would be good to change the primary message, not just the hint.  e.g.

LOG: pausing at end of recovery
HINT: Execute pg_wal_replay_resume() to promote.

vs.

LOG: recovery has paused
HINT: Execute pg_wal_replay_resume() to continue.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: recovery_target_action=pause with confusing hint

От
Fujii Masao
Дата:

On 2020/03/10 2:27, Robert Haas wrote:
> On Mon, Mar 9, 2020 at 12:03 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>> I started the discussion about the topic very related to this.
>> I'm thinking to apply the change that Sergei proposes after applying
>> the patch I attached in this thread.
>> https://postgr.es/m/00c194b2-dbbb-2e8a-5b39-13f14048ef0a@oss.nttdata.com
> 
> I think it would be good to change the primary message, not just the hint.  e.g.
> 
> LOG: pausing at end of recovery
> HINT: Execute pg_wal_replay_resume() to promote.
> 
> vs.
> 
> LOG: recovery has paused
> HINT: Execute pg_wal_replay_resume() to continue.

Looks good to me.
Attached is the updated version of the patch.
This is based on Sergei's patch.

Regards,


-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Вложения

Re: recovery_target_action=pause with confusing hint

От
movead li
Дата:
Hello,

When I test the patch, I find an issue: I start a stream with 'promote_trigger_file'
GUC valid, and exec pg_wal_replay_pause() during recovery and as below it
shows success to pause at the first time. I think it use a initialize
'SharedPromoteIsTriggered' value first time I exec the pg_wal_replay_pause().

#####################################
postgres=# select pg_wal_replay_pause();
 pg_wal_replay_pause 
---------------------
 
(1 row)

postgres=# select pg_wal_replay_pause();
ERROR:  standby promotion is ongoing
HINT:  pg_wal_replay_pause() cannot be executed after promotion is triggered.
postgres=# select pg_wal_replay_pause();
ERROR:  recovery is not in progress
HINT:  Recovery control functions can only be executed during recovery.
postgres=#
##############################################################

The new status of this patch is: Waiting on Author

Re: recovery_target_action=pause with confusing hint

От
Sergei Kornilov
Дата:
Hello

> When I test the patch, I find an issue: I start a stream with 'promote_trigger_file'
> GUC valid, and exec pg_wal_replay_pause() during recovery and as below it
> shows success to pause at the first time. I think it use a initialize
> 'SharedPromoteIsTriggered' value first time I exec the pg_wal_replay_pause().

hm. Are you sure this is related to this patch? Could you explain the exact timing? I mean log_statement = all and
relevantlogs.
 
Most likely this is expected change by
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=496ee647ecd2917369ffcf1eaa0b2cdca07c8730

My proposal does not change the behavior after this commit, only changing the lines in the logs.

regards, Sergei



Re: recovery_target_action=pause with confusing hint

От
Fujii Masao
Дата:

On 2020/03/31 18:59, Sergei Kornilov wrote:
> Hello
> 
>> When I test the patch, I find an issue: I start a stream with 'promote_trigger_file'
>> GUC valid, and exec pg_wal_replay_pause() during recovery and as below it
>> shows success to pause at the first time. I think it use a initialize
>> 'SharedPromoteIsTriggered' value first time I exec the pg_wal_replay_pause().
> 
> hm. Are you sure this is related to this patch? Could you explain the exact timing? I mean log_statement = all and
relevantlogs.
 
> Most likely this is expected change by
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=496ee647ecd2917369ffcf1eaa0b2cdca07c8730

Yeah, if the problem exists there, that would be my fault in the commit.
movead li, could you share the detail procedure to reproduce the issue?
I'm happy to investigate it.

> My proposal does not change the behavior after this commit, only changing the lines in the logs.

Yes. What's your opinion about the latest patch based on Robert's idea?
Barring any ojection, I'd like to commit that.

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters



Re: recovery_target_action=pause with confusing hint

От
Sergei Kornilov
Дата:
Hello

>>  My proposal does not change the behavior after this commit, only changing the lines in the logs.
>
> Yes. What's your opinion about the latest patch based on Robert's idea?
> Barring any ojection, I'd like to commit that.

Oh, sorry. Looks good and solves my issue

regards, Sergei



Re: recovery_target_action=pause with confusing hint

От
Fujii Masao
Дата:

On 2020/03/31 19:33, Sergei Kornilov wrote:
> Hello
> 
>>>   My proposal does not change the behavior after this commit, only changing the lines in the logs.
>>
>> Yes. What's your opinion about the latest patch based on Robert's idea?
>> Barring any ojection, I'd like to commit that.
> 
> Oh, sorry. Looks good and solves my issue

Thanks for reviewing the patch! Pushed!

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters



Re: recovery_target_action=pause with confusing hint

От
Sergei Kornilov
Дата:
>>>>    My proposal does not change the behavior after this commit, only changing the lines in the logs.
>>>
>>>  Yes. What's your opinion about the latest patch based on Robert's idea?
>>>  Barring any ojection, I'd like to commit that.
>>
>>  Oh, sorry. Looks good and solves my issue
>
> Thanks for reviewing the patch! Pushed!

Thank you!

regards, Sergei



Re: recovery_target_action=pause with confusing hint

От
"movead.li@highgo.ca"
Дата:

>> When I test the patch, I find an issue: I start a stream with 'promote_trigger_file'
>> GUC valid, and exec pg_wal_replay_pause() during recovery and as below it
>> shows success to pause at the first time. I think it use a initialize
>> 'SharedPromoteIsTriggered' value first time I exec the pg_wal_replay_pause().
 
>hm. Are you sure this is related to this patch? Could you explain the exact timing? I mean log_statement = all and relevant logs.
>Most likely this is expected change by https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=496ee647ecd2917369ffcf1eaa0b2cdca07c8730 
>My proposal does not change the behavior after this commit, only changing the lines in the logs.
 
I test it again with (92d31085e926253aa650b9d1e1f2f09934d0ddfc), and the
issue appeared again. Here is my test method which quite simple:
1. Setup a base backup by pg_basebackup.
2. Insert lots of data in master for the purpose I have enough time to exec
   pg_wal_replay_pause() when startup the replication.
3. Configure the 'promote_trigger_file' GUC and create the trigger file.
4. Start the backup(standby), connect it immediately, and exec pg_wal_replay_pause()
Then it appears, and a test log attached.

I means when I exec the pg_wal_replay_pause() first time, nobody has check the trigger state
by CheckForStandbyTrigger(), it use a Initialized 'SharedPromoteIsTriggered' value. 
And patch attached can solve the issue.





Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Вложения

Re: recovery_target_action=pause with confusing hint

От
Fujii Masao
Дата:

On 2020/04/01 11:42, movead.li@highgo.ca wrote:
> 
>>> When I test the patch, I find an issue: I start a stream with 'promote_trigger_file'
>> > GUC valid, and exec pg_wal_replay_pause() during recovery and as below it
>  >> shows success to pause at the first time. I think it use a initialize
>  >> 'SharedPromoteIsTriggered' value first time I exec the pg_wal_replay_pause().
>>hm. Are you sure this is related to this patch? Could you explain the exact timing? I mean log_statement = all and
relevantlogs.
 
>>Most likely this is expected change by
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=496ee647ecd2917369ffcf1eaa0b2cdca07c8730
>>My proposal does not change the behavior after this commit, only changing the lines in the logs.
> I test it again with (92d31085e926253aa650b9d1e1f2f09934d0ddfc), and the
> issue appeared again. Here is my test method which quite simple:
> 1. Setup a base backup by pg_basebackup.
> 2. Insert lots of data in master for the purpose I have enough time to exec
>     pg_wal_replay_pause() when startup the replication.
> 3. Configure the 'promote_trigger_file' GUC and create the trigger file.
> 4. Start the backup(standby), connect it immediately, and exec pg_wal_replay_pause()
> Then it appears, and a test log attached.
> 
> I means when I exec the pg_wal_replay_pause() first time, nobody has check the trigger state
> by CheckForStandbyTrigger(), it use a Initialized 'SharedPromoteIsTriggered' value.
> And patch attached can solve the issue.

Thanks for the explanation!

But, sorry,,, I failed to understand the issue that you reported, yet...
You mean that the first call of pg_wal_replay_pause() in the step #2
should check whether the trigger file exists or not? If so, could you
tell me why we should do that?

BTW, right now only the startup process is allowed to call
CheckForStandbyTrigger(). So the backend process calling
pg_wal_replay_pause() and PromoteIsTriggered() is not allowed to call
CheckForStandbyTrigger(). The current logic is that the startup process
is responsible for checking the trigger file and set the flag in the shmem
if promotion is triggered. Then other processes like backend know
whether promotion is ongoing or not from the shmem. So basically
the backend doesn't need to check the trigger file itself.

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters



Re: Re: recovery_target_action=pause with confusing hint

От
"movead.li@highgo.ca"
Дата:

>But, sorry,,, I failed to understand the issue that you reported, yet....
>You mean that the first call of pg_wal_replay_pause() in the step #2
>should check whether the trigger file exists or not? If so, could you
>tell me why we should do that?
Sorry about my pool english.  The 'pg_wal_replay_pause()' is executed
in step 4. I mention it in step 2 is for explain why it need lots of insert
data. 

>BTW, right now only the startup process is allowed to call
>CheckForStandbyTrigger(). So the backend process calling
>pg_wal_replay_pause() and PromoteIsTriggered() is not allowed to call
>CheckForStandbyTrigger(). The current logic is that the startup process
>is responsible for checking the trigger file and set the flag in the shmem
It's here, startup process does not call CheckForStandbyTrigger() to check
the trigger file until a pg_wal_replay_pause() or PromoteIsTriggered() comes,
so first time to call the pg_wal_replay_pause(), it use a wrong 
'SharedPromoteIsTriggered' value.


>if promotion is triggered. Then other processes like backend know
>whether promotion is ongoing or not from the shmem. So basically
>the backend doesn't need to check the trigger file itself.
If backend is not allowed to call CheckForStandbyTrigger(), then you should
find a way to hold it.
In another word, during the recovery if I add the trigger file, the starup process
do not know it at all until after a pg_wal_replay_pause() come.

In addition, although the first time I exec 'pg_wal_replay_pause' it shows success,
the startup process is keeping redo(no stop). 



Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Re: recovery_target_action=pause with confusing hint

От
Fujii Masao
Дата:

On 2020/04/01 16:53, movead.li@highgo.ca wrote:
> 
>>But, sorry,,, I failed to understand the issue that you reported, yet....
>>You mean that the first call of pg_wal_replay_pause() in the step #2
>>should check whether the trigger file exists or not? If so, could you
>>tell me why we should do that?
> Sorry about my pool english.  The 'pg_wal_replay_pause()' is executed
> in step 4. I mention it in step 2 is for explain why it need lots of insert
> data.
> 
>>BTW, right now only the startup process is allowed to call
>>CheckForStandbyTrigger(). So the backend process calling
>>pg_wal_replay_pause() and PromoteIsTriggered() is not allowed to call
>>CheckForStandbyTrigger(). The current logic is that the startup process
>>is responsible for checking the trigger file and set the flag in the shmem
> It's here, startup process does not call CheckForStandbyTrigger() to check
> the trigger file until a pg_wal_replay_pause() or PromoteIsTriggered() comes,
> so first time to call the pg_wal_replay_pause(), it use a wrong
> 'SharedPromoteIsTriggered' value.
> 
> 
>>if promotion is triggered. Then other processes like backend know
>>whether promotion is ongoing or not from the shmem. So basically
>>the backend doesn't need to check the trigger file itself.
> If backend is not allowed to call CheckForStandbyTrigger(), then you should
> find a way to hold it.
> In another word, during the recovery if I add the trigger file, the starup process
> do not know it at all until after a pg_wal_replay_pause() come.

Thanks for the explanation again! Maybe I understand your point.

As far as I read the code, in the standby mode, the startup process
periodically checks the trigger file in WaitForWALToBecomeAvailable().
No?

There can be small delay between the creation of the trigger file
and the periodic call to CheckForStandbyTrigger() by the startup process.
If you execute pg_wal_replay_pause() during that delay, it would suceed.

But you'd like to get rid of that delay completely? In other words,
both the startup process and the backend calling pg_wal_replay_pause()
should detect the existence of the trigger file immdiately after
it's created?

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters



Re: recovery_target_action=pause with confusing hint

От
"movead.li@highgo.ca"
Дата:
>Thanks for the explanation again! Maybe I understand your point.
Great.

>As far as I read the code, in the standby mode, the startup process
>periodically checks the trigger file in WaitForWALToBecomeAvailable().
>No?
Yes it is. 

>There can be small delay between the creation of the trigger file
>and the periodic call to CheckForStandbyTrigger() by the startup process.
>If you execute pg_wal_replay_pause() during that delay, it would suceed.
If there be a huge number of wal segments need a standby to rewind,
then it can not be a 'small delay'.

>But you'd like to get rid of that delay completely? In other words,
>both the startup process and the backend calling pg_wal_replay_pause()
>should detect the existence of the trigger file immdiately after
>it's created?
I want to point out the thing, the pg_wal_replay_pause() shows success but
the startup process keeping redo, it may cause something confused. So it's
better to solve the issue.




Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Re: recovery_target_action=pause with confusing hint

От
Fujii Masao
Дата:

On 2020/04/01 17:58, movead.li@highgo.ca wrote:
>>Thanks for the explanation again! Maybe I understand your point.
> Great.
> 
>>As far as I read the code, in the standby mode, the startup process
>>periodically checks the trigger file in WaitForWALToBecomeAvailable().
>>No?
> Yes it is.
> 
>>There can be small delay between the creation of the trigger file
>>and the periodic call to CheckForStandbyTrigger() by the startup process.
>>If you execute pg_wal_replay_pause() during that delay, it would suceed.
> If there be a huge number of wal segments need a standby to rewind,
> then it can not be a 'small delay'.

Yeah, that's true.

>>But you'd like to get rid of that delay completely? In other words,
>>both the startup process and the backend calling pg_wal_replay_pause()
>>should detect the existence of the trigger file immdiately after
>>it's created?
> I want to point out the thing, the pg_wal_replay_pause() shows success but
> the startup process keeping redo, it may cause something confused. So it's
> better to solve the issue.

This happens because the startup process detects the trigger file
after pg_wal_replay_pause() succeeds, and then make the recovery
get out of the paused state. It might be problematic to end the paused
state silently? So, to make the situation less confusing, what about
emitting a log message when ending the paused state because of
the promotion?

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters



Re: recovery_target_action=pause with confusing hint

От
"movead.li@highgo.ca"
Дата:

>This happens because the startup process detects the trigger file
>after pg_wal_replay_pause() succeeds, and then make the recovery
>get out of the paused state. 
Yes that is.

>It might be problematic to end the paused
>state silently? So, to make the situation less confusing, what about
>emitting a log message when ending the paused state because of
>the promotion?
But where to emit it? I think it not so good by emitting to log file,
because the user will not check it everytime. BTW, why
CheckForStandbyTrigger() can not be called in backend.


Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Re: recovery_target_action=pause with confusing hint

От
Fujii Masao
Дата:

On 2020/04/01 18:56, movead.li@highgo.ca wrote:
> 
>  >This happens because the startup process detects the trigger file
>>after pg_wal_replay_pause() succeeds, and then make the recovery
>>get out of the paused state.
> Yes that is.
> 
>  >It might be problematic to end the paused
>>state silently? So, to make the situation less confusing, what about
>>emitting a log message when ending the paused state because of
>>the promotion?
> But where to emit it? I think it not so good by emitting to log file,
> because the user will not check it everytime.

Yeah, I'm thinking to emit the message to log file, like the startup process
does in other places :)

> BTW, why
> CheckForStandbyTrigger() can not be called in backend.

Because

1) promote_signaled flag that IsPromoteSignaled() sees is set
      only in the startup process
2) the trigger file can be removed or promote_trigger_file can be
      changed after the backend detects it but before the startup process
      does. That is, the result of the trigger file detection can be different
      between the processes.

Of course we can change CheckForStandbyTrigger() so that it can
be called by backends, but I'm not sure if it's worth doing that.

Or another idea to reduce the delay between the request for
the promotion and the detection of it is to make the startup process
call CheckForStandbyTrigger() more frequently. Calling that every
replay of WAL record would be overkill and decrease the recovery
performance. Calling thst every WAL file might be ok. I'm not sure
if this is really improvement or not, though...

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters



Re: recovery_target_action=pause with confusing hint

От
Fujii Masao
Дата:

On 2020/04/01 19:37, Fujii Masao wrote:
> 
> 
> On 2020/04/01 18:56, movead.li@highgo.ca wrote:
>>
>>  >This happens because the startup process detects the trigger file
>>> after pg_wal_replay_pause() succeeds, and then make the recovery
>>> get out of the paused state.
>> Yes that is.
>>
>>  >It might be problematic to end the paused
>>> state silently? So, to make the situation less confusing, what about
>>> emitting a log message when ending the paused state because of
>>> the promotion?
>> But where to emit it? I think it not so good by emitting to log file,
>> because the user will not check it everytime.
> 
> Yeah, I'm thinking to emit the message to log file, like the startup process
> does in other places :)

So I'd like to propose the attached patch. The patch changes the message
logged when a promotion is requested, based on whether the recovery is
in paused state or not.

Regards,

-- 
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Вложения

Re: recovery_target_action=pause with confusing hint

От
"movead.li@highgo.ca"
Дата:

>So I'd like to propose the attached patch. The patch changes the message
>logged when a promotion is requested, based on whether the recovery is
>in paused state or not.
It is a compromise, we should notice it in document I think.



Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Re: recovery_target_action=pause with confusing hint

От
Fujii Masao
Дата:

On 2020/04/02 10:41, movead.li@highgo.ca wrote:
> 
>  >So I'd like to propose the attached patch. The patch changes the message
>>logged when a promotion is requested, based on whether the recovery is
>>in paused state or not.
> It is a compromise,

Ok, so barring any objection, I will commit the patch.

> we should notice it in document I think.

There is the following explation about the relationship the recovery
pause and the promotion, in the document. You may want to add more
descriptions into the docs?

------------------------------
If a promotion is triggered while recovery is paused, the paused
state ends and a promotion continues.
------------------------------

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: recovery_target_action=pause with confusing hint

От
"movead.li@highgo.ca"
Дата:

>> we should notice it in document I think. 
>There is the following explation about the relationship the recovery
>pause and the promotion, in the document. You may want to add more
>descriptions into the docs?
>------------------------------
>If a promotion is triggered while recovery is paused, the paused
>state ends and a promotion continues.
>------------------------------

For example we can add this words:
First-time pg_wal_replay_pause() called during recovery which triggered
as promotion, pg_wal_replay_pause() show success but it did not really
pause the recovery.



Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca

Re: recovery_target_action=pause with confusing hint

От
Fujii Masao
Дата:

On 2020/04/08 16:41, movead.li@highgo.ca wrote:
> 
>  >> we should notice it in document I think.
>>There is the following explation about the relationship the recovery
>>pause and the promotion, in the document. You may want to add more
>>descriptions into the docs?
>  >------------------------------
>>If a promotion is triggered while recovery is paused, the paused
>>state ends and a promotion continues.
>>------------------------------
> 
> For example we can add this words:
> First-time pg_wal_replay_pause() called during recovery which triggered
> as promotion, pg_wal_replay_pause() show success but it did not really
> pause the recovery.

I think this is not true. In your case, pg_wal_replay_pause() succeeded
because the startup process had not detected the promotion request yet
at that moment.

To cover your case, what about adding the following description?

-----------------
There can be delay between a promotion request by users and the trigger of
a promotion in the server. Note that pg_wal_replay_pause() succeeds
during that delay, i.e., until a promotion is actually triggered.
-----------------

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



Re: recovery_target_action=pause with confusing hint

От
"movead.li@highgo.ca"
Дата:

>To cover your case, what about adding the following description?
 
>-----------------
>There can be delay between a promotion request by users and the trigger of
>a promotion in the server. Note that pg_wal_replay_pause() succeeds
>during that delay, i.e., until a promotion is actually triggered.
>-----------------
Yes that's it.



Regards,
Highgo Software (Canada/China/Pakistan)
URL : www.highgo.ca
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca