Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master
From | Michael Paquier |
---|---|
Subject | Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master |
Date | |
Msg-id | CAB7nPqTzr3fySUdTNmcOUQxAJk7m7V9eOXqfcvBYvYoGiErsUg@mail.gmail.com |
In response to | BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master (feikesteenbergen@gmail.com) |
Responses | Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master |
List | pgsql-bugs |
On Thu, May 28, 2015 at 7:07 PM, <feikesteenbergen@gmail.com> wrote:
> The following bug has been logged on the website:
>
> Bug reference: 13368
> Logged by: Feike Steenbergen
> Email address: feikesteenbergen@gmail.com
> PostgreSQL version: 9.4.2
> Operating system: Debian 8.0 x86_64
> Description:
>
> We sometimes see a standby server promoting itself to master immediately.
>
> Analysis shows us that the master still has a promote file in the PGDATA
> directory. We assume the presence of the promote file (which is copied
> by pg_basebackup) is triggering the promotion.

If there is a promote file in PGDATA when a standby starts up, promotion will be triggered.

> The master itself previously was a standby server. The promotion was done
> using pg_ctl promote. Analysis of our logs show that we sent pg_ctl promote
> twice to this cluster, this also is reflected in the server log,
> "server promoting" shows up twice.

In this case promotion is triggered by CheckForStandbyTrigger(), where the promote file is unlinked.

> Some testing shows us that in some cases, when pg_ctl promote is called
> multiple times, a promote file is left in the PGDATA directory, even though
> the cluster has been successfully promoted and is accepting read/write
> queries.

This is not surprising. pg_ctl decides whether a node can be promoted based on whether recovery.conf exists, and there is an interval of time between the moment promotion is actually triggered and the moment recovery.conf is removed. During that interval you can still create a promote file, even after SIGUSR1 has already been sent to the standby's postmaster.

> We will try to workaround this issue by ensuring we do not send multiple
> promote request using pg_ctl to the same cluster.

Well, we could for example have the server rename promote to promote_done in CheckForStandbyTrigger() and then unlink it when recovery.conf is switched to .done. Opinions are welcome on the matter.
--
Michael
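
To illustrate the window described above, here is a minimal Python sketch that replays the sequence on a temporary directory. It is only a model, not PostgreSQL code: the functions `pg_ctl_promote`, `check_for_standby_trigger`, `finish_recovery` and `simulate_double_promote` are hypothetical stand-ins, and the sketch assumes that the only check made before creating the promote file is the existence of recovery.conf.

```python
import os
import tempfile


def pg_ctl_promote(pgdata):
    """Hypothetical stand-in for 'pg_ctl promote'.

    Assumption: the request is accepted only while recovery.conf exists,
    and accepting it means creating the promote file.
    """
    if not os.path.exists(os.path.join(pgdata, "recovery.conf")):
        return False
    open(os.path.join(pgdata, "promote"), "w").close()
    return True  # the real tool would signal the postmaster at this point


def check_for_standby_trigger(pgdata):
    """Hypothetical stand-in for the server side: consume the promote file."""
    path = os.path.join(pgdata, "promote")
    if os.path.exists(path):
        os.unlink(path)
        return True
    return False


def finish_recovery(pgdata):
    """Model the end of recovery: recovery.conf becomes recovery.done."""
    os.rename(os.path.join(pgdata, "recovery.conf"),
              os.path.join(pgdata, "recovery.done"))


def simulate_double_promote():
    with tempfile.TemporaryDirectory() as pgdata:
        open(os.path.join(pgdata, "recovery.conf"), "w").close()

        first = pg_ctl_promote(pgdata)                 # first request
        triggered = check_for_standby_trigger(pgdata)  # promote file unlinked
        second = pg_ctl_promote(pgdata)                # recovery.conf still there
        finish_recovery(pgdata)

        # Nothing consumes the second promote file, so it stays behind
        # and would be copied by a later base backup.
        stale = os.path.exists(os.path.join(pgdata, "promote"))
        third = pg_ctl_promote(pgdata)                 # refused after recovery
        return first, triggered, second, stale, third


if __name__ == "__main__":
    print(simulate_double_promote())
```

In this model the second request is accepted because recovery.conf has not yet been renamed, and the promote file it creates is never removed. Renaming promote to promote_done on the server side and unlinking it when recovery.conf is switched to .done, as suggested above, would change only the first step; it is not modeled here.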