Re: run pg_rewind on an uncleanly shut down cluster.
From | Michael Paquier |
---|---|
Subject | Re: run pg_rewind on an uncleanly shut down cluster. |
Date | |
Message-ID | CAB7nPqTfgJmRREeHWJ1e9+YG9F2SR-VwGfSrspzY053bm1kvHQ@mail.gmail.com |
In reply to | Re: run pg_rewind on an uncleanly shut down cluster. (Oleksii Kliukin <alexk@hintbits.com>) |
List | pgsql-hackers |
On Tue, Oct 6, 2015 at 6:04 PM, Oleksii Kliukin <alexk@hintbits.com> wrote:
> Does pg_rewind actually rely on the cluster being rewound to finish
> recovery?

That's not mandatory AFAIK. I think that Heikki has just implemented it in the safest way possible for a first shot. That's something we could relax in the future.

> If not, then it would be a good idea to add a --force flag to force
> pg_rewind to ignore the state check, as you suggested in this thread:
> http://www.postgresql.org/message-id/flat/CAF8Q-Gw1HBKzpSEVtotLg=DR+Ee-6q59qQfhY5tor3FYAenyrA@mail.gmail.com#CAF8Q-Gw1HBKzpSEVtotLg=DR+Ee-6q59qQfhY5tor3FYAenyrA@mail.gmail.com

Another option would be to replace this check of pg_control with something closer to what pg_ctl status does with postmaster.pid, for example. And perhaps to add a safeguard that prevents a concurrent user from starting the target node once the pg_rewind run begins.

> Well, checking the source node looks like an option that does not require
> providing any additional information by DBA, as the connection string or the
> path to the data dir is already there. It would be nice if pg_rewind could
> fetch WAL from the given restore_command though, or even use the command
> already there in recovery.conf (if the node being recovered is a replica,
> which I guess is a pretty common case).

Kind of. Except that, for more flexibility, we would want the user to be able to pass a custom restore_command that pg_rewind itself would use.

> Anyway, thank you for describing the issue. In my case, it seems I solved it
> by removing the files from the archive_status directory of the former master
> (the node being rewound). This makes PostgreSQL forget that it has to remove
> an already archived (but still required for pg_rewind) segment (I guess it
> does it during stop when the checkpoint is issued). Afterwards, postgres
> is started in single-user mode with archive_command=false and
> archive_mode=on, to make sure no segments are archived/removed, and stopped
> right afterwards with:

Interesting. That's one way to go.

> Afterwards, pg_rewind runs on the cluster without any noticeable issues.
> Since the node is not going to continue as a master and the contents of
> pg_xlog/archive_status is changed after pg_rewind anyway, I don’t think any
> data is lost after initial removal of archive_status files.

Yep. Its content is replaced by everything from the source node.

--
Michael
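
For illustration only, the postmaster.pid idea mentioned above could look roughly like the following Python sketch. It is not pg_rewind or pg_ctl code. It assumes a POSIX system, that the first line of postmaster.pid holds the server PID, and that a negative value marks a single-user backend (as pg_ctl treats it); the function name `postmaster_is_running` is made up for this example.

```python
import os
from pathlib import Path


def postmaster_is_running(data_dir):
    """Return True if postmaster.pid in data_dir names a live process.

    Hypothetical helper: a rough analogue of the liveness test behind
    `pg_ctl status`, not the actual implementation.
    """
    pid_file = Path(data_dir) / "postmaster.pid"
    try:
        # Assumption: the first line of postmaster.pid is the PID.
        pid = int(pid_file.read_text().splitlines()[0].strip())
    except (FileNotFoundError, IndexError, ValueError):
        # Missing, empty, or unparsable pid file: treat as not running.
        return False
    if pid == 0:
        return False
    # Assumption: a negative PID denotes a single-user (standalone) backend.
    pid = abs(pid)
    try:
        os.kill(pid, 0)  # signal 0 only checks that the process exists
    except ProcessLookupError:
        return False  # stale pid file, e.g. left by an unclean shutdown
    except PermissionError:
        return True  # process exists but is owned by another user
    return True
```

Calling `postmaster_is_running("/path/to/target/pgdata")` before a rewind would tell a stale pid file apart from a live server, which the pg_control state alone cannot do after a crash. The check is racy by nature: it does nothing to stop someone from starting the node a moment later, which is why the additional safeguard would still be needed.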