Re: run pg_rewind on an uncleanly shut down cluster.

From	Michael Paquier
Subject	Re: run pg_rewind on an uncleanly shut down cluster.
Date	October 6, 2015 09:32:21
Msg-id	CAB7nPqTfgJmRREeHWJ1e9+YG9F2SR-VwGfSrspzY053bm1kvHQ@mail.gmail.com
In reply to	Re: run pg_rewind on an uncleanly shut down cluster. (Oleksii Kliukin <alexk@hintbits.com>)
List	pgsql-hackers

On Tue, Oct 6, 2015 at 6:04 PM, Oleksii Kliukin <alexk@hintbits.com> wrote:
> Does pg_rewind actually rely on the cluster being rewound to finish
> recovery?

That's not mandatory AFAIK. I think that Heikki has just implemented
it in the safest way possible for a first shot. That's something we
could relax in the future.

> If not, then it would be a good idea to add a --force flag to make
> pg_rewind ignore the state check, as you suggested in this thread:
>
> http://www.postgresql.org/message-id/flat/CAF8Q-Gw1HBKzpSEVtotLg=DR+Ee-6q59qQfhY5tor3FYAenyrA@mail.gmail.com#CAF8Q-Gw1HBKzpSEVtotLg=DR+Ee-6q59qQfhY5tor3FYAenyrA@mail.gmail.com

Another option would be to replace this check of pg_control with
something closer to what pg_ctl status does with postmaster.pid, for
example, and perhaps to add a safeguard that prevents a concurrent user
from starting the target node once the pg_rewind run has begun.
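
To make the postmaster.pid idea concrete, here is a minimal sketch in
Python (not pg_rewind code, and not how pg_ctl is actually implemented).
It assumes that the first line of postmaster.pid holds the server PID and
that, as far as I know, a negative value denotes a single-user backend.
The function names are hypothetical, and the liveness probe is POSIX-only.

```python
import os
from pathlib import Path


def postmaster_pid(data_dir):
    """Return the PID recorded on the first line of postmaster.pid, or None."""
    pid_file = Path(data_dir) / "postmaster.pid"
    try:
        first_line = pid_file.read_text().splitlines()[0]
        return int(first_line.strip())
    except (FileNotFoundError, IndexError, ValueError):
        # Missing, empty, or unparsable file: treat as "no PID recorded".
        return None


def process_exists(pid):
    """Probe a PID with signal 0 (POSIX); no signal is actually delivered."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def target_looks_running(data_dir):
    """True if postmaster.pid names a process that is currently alive."""
    pid = postmaster_pid(data_dir)
    if pid is None or pid == 0:
        return False
    # Assumption: a negative PID marks a single-user backend, so probe abs(pid).
    return process_exists(abs(pid))
```

A check like this only narrows the window. It does not by itself stop
someone from starting the target node after the check has passed, which is
why a separate safeguard would still be needed.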

> Well, checking the source node looks like an option that does not require
> the DBA to provide any additional information, as the connection string or
> the path to the data dir is already there. It would be nice if pg_rewind
> could fetch WAL using a given restore_command though, or even use the
> command already there in recovery.conf (if the node being recovered is a
> replica, which I guess is a pretty common case).

Kind of. Except that, for more flexibility, we would want a user to be
able to pass a custom restore_command that pg_rewind itself would use.
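
As a rough illustration of what a user-supplied restore_command would
mean for a tool, here is a sketch in Python of expanding the usual
placeholders (%f for the WAL file name, %p for the destination path, %%
for a literal percent sign) and running the result through the shell.
The function names are hypothetical and this is not pg_rewind's behaviour
today; %r is deliberately left unhandled.

```python
import subprocess


def expand_restore_command(template, wal_file, dest_path):
    """Expand %f, %p and %% in a restore_command-style template."""
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template):
            nxt = template[i + 1]
            if nxt == "f":
                out.append(wal_file)
                i += 2
                continue
            if nxt == "p":
                out.append(dest_path)
                i += 2
                continue
            if nxt == "%":
                out.append("%")
                i += 2
                continue
        # Ordinary character, or an escape this sketch does not handle.
        out.append(ch)
        i += 1
    return "".join(out)


def fetch_segment(template, wal_file, dest_path):
    """Run the expanded command; a zero exit status counts as success."""
    command = expand_restore_command(template, wal_file, dest_path)
    return subprocess.run(command, shell=True).returncode == 0
```

For example, the template "cp /archive/%f %p" with the segment name
000000010000000000000003 and the path pg_xlog/RECOVERYXLOG expands to
"cp /archive/000000010000000000000003 pg_xlog/RECOVERYXLOG" (the archive
location and destination path here are made up for the example).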

> Anyway, thank you for describing the issue. In my case, it seems I solved it
> by removing the files from the archive_status directory of the former master
> (the node being rewound). This makes PostgreSQL forget that it has to remove
> an already archived (but still required for pg_rewind) segment (I guess it
> does that during stop, when the checkpoint is issued). Afterwards, postgres
> is started in single-user mode with archive_command=false and
> archive_mode=on, to make sure no segments are archived/removed, and stopped
> right afterwards with:

Interesting. That's one way to go.
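
For readers who want to try the first step of that workaround, here is a
cautious sketch in Python of clearing archive_status. It assumes the
directory lives at pg_xlog/archive_status under the data directory, as in
the quoted text. The function name and the dry_run switch are hypothetical,
and the single-user start and stop that follow are not covered. It deletes
files, so it should only ever be pointed at a node that is about to be
rewound.

```python
from pathlib import Path


def clear_archive_status(data_dir, dry_run=True):
    """List, and unless dry_run is set remove, the files in archive_status.

    Returns the names of the files that were (or would be) removed.
    """
    status_dir = Path(data_dir) / "pg_xlog" / "archive_status"
    affected = []
    for entry in sorted(status_dir.iterdir()):
        if not entry.is_file():
            continue  # leave anything that is not a plain file alone
        if not dry_run:
            entry.unlink()
        affected.append(entry.name)
    return affected
```

Calling it with the default dry_run=True only reports what would go away,
which makes it easy to review the list before running it for real.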

> Afterwards, pg_rewind runs on the cluster without any noticeable issues.
> Since the node is not going to continue as a master and the contents of
> pg_xlog/archive_status are changed by pg_rewind anyway, I don’t think any
> data is lost by the initial removal of the archive_status files.

Yep. Its content is replaced by everything from the source node.
--
Michael
