Re: Changes to recovery_min_apply_delay are ignored while waiting for delay
От | Soumyadeep Chakraborty |
---|---|
Тема | Re: Changes to recovery_min_apply_delay are ignored while waiting for delay |
Дата | |
Msg-id | CAE-ML+8mB1R2_8ZVOFYROLFd=SdQXDtr-6aSxghUPkH+sUgkbw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Changes to recovery_min_apply_delay are ignored while waiting for delay (Michael Paquier <michael@paquier.xyz>) |
Ответы |
Re: Changes to recovery_min_apply_delay are ignored while waiting for delay
|
Список | pgsql-hackers |
Hey Michael, Really appreciate the review! On Wed, Aug 11, 2021 at 12:40 AM Michael Paquier <michael@paquier.xyz> wrote: > Agreed that the current behavior is confusing. As you are using the > commit timestamp for the comparison, this is right. One small-ish > comment I have about the code is that we should mention > recovery_min_apply_delay for HandleStartupProcInterrupts(), and not > only the trigger file location. Cool, updated. > +# First, set the delay for the next commit to some obscenely high value. > It has no need to be obscenely high, just high enough to give the time > to slow machines to catch the wait event lookup done. So this could > use better words to explain this choice. Sounds good. Done. > Anyway, it seems to me that something is incorrect in this new test > (manual tests pass here, the TAP test is off). One thing we need to > make sure for any new test added is that it correctly breaks if the > fix proposed is *not* in place. And as far as I can see, the test > passes even if the recalculation of delayUntil is done within the loop > that reloads the configuration. The thing may be a bit tricky to make > reliable as the parameter reloading may cause wait_event to not be > RecoveryApplyDelay, so we should have at least a check based on a scan > of pg_stat_replication.replay_lsn on the primary. Perhaps you have > an other idea? Hmm, please see attached v4 which just shifts the test to the middle, like v1. When I run the test without the code change, the test hangs as expected in: # Now the standby should catch up. $node_primary->wait_for_catchup('standby', 'replay'); and passes with the code change, as expected. I can't explain why the test doesn't freeze up in v3 in wait_for_catchup() at the end. As for checking on the wait event, since we only signal the standby after performing the check, that should be fine. Nothing else would be sending a SIGHUP before the check. Is that assumption correct? > When using wait_for_catchup(), I would recommend to list "replay" for > this test, even if that's the default mode, to make clear what the > intention is. Done. Regards, Soumyadeep (VMware)
Вложения
В списке pgsql-hackers по дате отправления: