Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master

Поиск
Список
Период
Сортировка
От Fujii Masao
Тема Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master
Дата
Msg-id CAHGQGwH0qnvHqxSF6LUXmMCLQxT8NtZHsJN7ij0sK=AW1DC3Aw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master  (Michael Paquier <michael.paquier@gmail.com>)
Ответы Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master  (Fujii Masao <masao.fujii@gmail.com>)
Список pgsql-bugs
On Fri, Jul 3, 2015 at 4:03 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Thu, Jul 2, 2015 at 10:00 PM, Fujii Masao wrote:
>>> At the beginning of
>>> StartupXLOG() before parsing recovery.conf we could check for the
>>> existence of promotion trigger files and unlink them unconditionally.
>>
>> There seems to be still race condition: postmaster can receive SIGUSR1
>> before the startup process removes the promotion trigger file. Then
>> the variable promote_triggered can be set to true unexpectedly.
>
> Ah, yes. That's indeed possible.
>
>> So, what about making postmaster remove the trigger file unconditionally
>> before the startup process is forked, instead? For example, just after
>> PostmasterMain() calls RemovePgTempFiles()?
>
> Sounds fine, no other processes are running at this point.

Thanks for updating the patch!

+                * Skip promote signal files. Those files may have
been created by
+                * pg_ctl to trigger a promotion but they do not
belong to a base
+                * backup.
+                */
+               if (strcmp(de->d_name, PROMOTE_SIGNAL_FILE) == 0 ||
+                       strcmp(de->d_name, FALLBACK_PROMOTE_SIGNAL_FILE) == 0)
+                       continue;

Do we really want this? This is not required because the files are
always removed at the server startup. Also the backup performance
gain by skipping them would be almost zero since their size is zero.
That is, no need to add this for performance. OTOH, it's harmless
to add this. But I don't want to enlarge the "skip file list" for
pg_basebackup beyond necessary. So I removed this from the patch.

BTW, if we really want this, not only pg_basebackup but also pg_rewind
would need to skip the files.

+        * Remove any trigger file able to trigger promotion, this is safe from
+        * race conditions involving signals as no other processes are running
+        * yet.
+        */
+       RemoveStandbyTrigger();

I changed the function name to RemovePromoteSignalFiles().

I added more detail comment about the race condition that we're
talking about, into here.

Updated version of the patch attached.

Regards,

--
Fujii Masao

Вложения

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Michael Paquier
Дата:
Сообщение: Re: BUG #13368: standby cluster immediately promotes after pg_basebackup from previously promoted master
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: BUG #13126: table constraint loses its comment