Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master
From | Fujii Masao |
---|---|
Subject | Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master |
Date | |
Msg-id | CAHGQGwFFn_xmvP5bXpVYU363a=wG2GRt5o25VQ5AbiHqnPJrdw@mail.gmail.com |
In reply to | Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master (Michael Paquier <michael.paquier@gmail.com>) |
Responses | Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master |
List | pgsql-bugs |

On Mon, Jun 1, 2015 at 5:19 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, May 28, 2015 at 7:07 PM, <feikesteenbergen@gmail.com> wrote:
>> The following bug has been logged on the website:
>>
>> Bug reference:      13368
>> Logged by:          Feike Steenbergen
>> Email address:      feikesteenbergen@gmail.com
>> PostgreSQL version: 9.4.2
>> Operating system:   Debian 8.0 x86_64
>> Description:
>>
>> We sometimes see a standby server promoting itself to master immediately.
>>
>> Analysis shows us that the master still has a promote file in the PGDATA
>> directory. We assume the presence of the promote file (which is copied
>> by pg_basebackup) is triggering the promotion.
>
> If there is a promote file in PGDATA when a standby starts up,
> promotion will be triggered.
>
>> The master itself previously was a standby server. The promotion was done
>> using pg_ctl promote. Analysis of our logs shows that we sent pg_ctl promote
>> twice to this cluster; this is also reflected in the server log, where
>> "server promoting" shows up twice.
>
> In this case promotion is triggered by CheckForStandbyTrigger(), where
> the promote file is unlinked.
>
>> Some testing shows us that in some cases, when pg_ctl promote is called
>> multiple times, a promote file is left in the PGDATA directory, even
>> though the cluster has been successfully promoted and is accepting
>> read/write queries.
>
> This is not surprising. pg_ctl decides whether a node needs to be
> promoted based on whether recovery.conf exists, and there is an interval
> of time between when recovery.conf is removed and when the promotion is
> actually triggered, so you can create a promote file even after
> sending SIGUSR1 to the standby's postmaster.
>
>> We will try to work around this issue by ensuring we do not send multiple
>> promote requests using pg_ctl to the same cluster.
>
> Well, we could for example have the server switch promote to
> promote_done in CheckForStandbyTrigger() and then unlink it when
> recovery.conf is switched to .done. Opinions are welcome on the
> matter.

Or we can just always remove the signal file at the end of recovery.
That filename switch seems unnecessary.

In addition to that change, should we make pg_basebackup skip the
signal file?

Regards,

-- 
Fujii Masao
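
The two ideas above (always remove the signal file at the end of recovery, and have the base backup skip it) can be illustrated with a small sketch. This is not PostgreSQL code: it is a Python simulation on an ordinary directory, and the names `PROMOTE_SIGNAL_FILE`, `finish_recovery`, and `copy_base_backup` are hypothetical, introduced only for illustration. The only detail taken from the thread is that the signal file is a file named `promote` at the top level of PGDATA.

```python
import os
import shutil

# Assumption for illustration: the signal file is named "promote" and
# lives at the top level of the data directory, as described in the thread.
PROMOTE_SIGNAL_FILE = "promote"


def finish_recovery(pgdata):
    """Hypothetical end-of-recovery step: always remove the signal file.

    Removing it unconditionally means a stale file left behind by a
    second promote request cannot survive in the data directory.
    """
    path = os.path.join(pgdata, PROMOTE_SIGNAL_FILE)
    try:
        os.unlink(path)
    except FileNotFoundError:
        # Nothing left over; this is the normal case.
        pass


def copy_base_backup(src, dst):
    """Hypothetical backup copy that skips the top-level signal file.

    Files with the same name in subdirectories are copied as usual.
    """
    top = os.path.abspath(src)

    def ignore(directory, names):
        # Only filter entries that sit directly in the data directory.
        if os.path.abspath(directory) == top:
            return [n for n in names if n == PROMOTE_SIGNAL_FILE]
        return []

    shutil.copytree(src, dst, ignore=ignore)
```

Either measure alone would avoid the reported symptom in this toy model; doing both covers the case where a backup is taken before the old master has cleaned up the file.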