Re: Concurrency issue in pg_rewind
From | Heikki Linnakangas |
---|---|
Subject | Re: Concurrency issue in pg_rewind |
Date | |
Msg-id | ac2f431b-40dc-ca2b-b8c1-deb3c621b3f6@iki.fi |
In reply to | Re: Concurrency issue in pg_rewind (Alexander Kukushkin <cyberdemn@gmail.com>) |
Replies | Re: Concurrency issue in pg_rewind |
List | pgsql-hackers |
On 18/09/2020 10:17, Alexander Kukushkin wrote:
> At the same time, pg_rewind due to such "fatal" error leaves PGDATA in
> an inconsistent state with empty pg_control file, this is totally bad
> and easily fixable. We want the specific file to be absent and it is
> already absent, why should it be a fatal error and not warning?

Whenever pg_rewind runs into something unexpected, it fails loudly, so that the administrator can re-initialize from a base backup. That's the general rule. If a file goes missing while pg_rewind is running, that is unexpected. It could be a sign that the server was started concurrently, or another pg_rewind was started against it, for example.

I feel that we could make an exception of some sort here, but I'm not sure what exactly. I don't feel comfortable just downgrading the unexpected ENOENT on unlink() to a warning in all cases. Besides, scary warnings that you routinely ignore are not good either.

I have a hard time coming up with a general rule and justification that's not just "do X because WAL-G does Y". pg_rewind failing because WAL-G removed a file unexpectedly is one problem, but another is that the restore_command might get confused if pg_rewind removes a file that restore_command needs. This is hard when restore_command does things in the background, and there's no communication between the background process and pg_rewind.

The general principle is that pg_rewind is equivalent to overwriting the target with the source, only faster. Perhaps pg_wal should be an exception, and pg_rewind should leave alone any files under pg_wal that it doesn't recognize.

- Heikki
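
For illustration only, here is a minimal sketch of one shape a narrow exception could take: an ENOENT from unlink() is tolerated only for paths under pg_wal, and stays fatal everywhere else. This is not pg_rewind's code and not a patch from this thread. The names `remove_result`, `is_under_pg_wal` and `remove_file_sketch` are hypothetical, and the policy itself is an assumption. The message above floats a different variant, namely not touching unrecognized files under pg_wal at all.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch; not part of pg_rewind. */
typedef enum
{
    REMOVE_OK,            /* file existed and was removed */
    REMOVE_ALREADY_GONE,  /* file was missing, and that is tolerated here */
    REMOVE_FATAL          /* anything else: caller should fail loudly */
} remove_result;

/* True if relpath (relative to the data directory) lies under pg_wal/. */
static bool
is_under_pg_wal(const char *relpath)
{
    assert(relpath != NULL);
    return strncmp(relpath, "pg_wal/", strlen("pg_wal/")) == 0;
}

/*
 * Remove datadir/relpath. A missing file is tolerated only under pg_wal/
 * (assumed policy for this sketch); every other failure is reported as fatal.
 */
static remove_result
remove_file_sketch(const char *datadir, const char *relpath)
{
    char fullpath[4096];
    int  n = snprintf(fullpath, sizeof(fullpath), "%s/%s", datadir, relpath);

    /* Treat an overlong path as fatal rather than unlinking a truncated one. */
    if (n < 0 || (size_t) n >= sizeof(fullpath))
        return REMOVE_FATAL;

    if (unlink(fullpath) == 0)
        return REMOVE_OK;

    /* Someone else removed it first: acceptable only inside pg_wal/. */
    if (errno == ENOENT && is_under_pg_wal(relpath))
        return REMOVE_ALREADY_GONE;

    return REMOVE_FATAL;
}
```

With this policy, a missing WAL segment would be reported as `REMOVE_ALREADY_GONE`, while a missing file under `global/` or `base/` would still abort the run. That keeps the "fail loudly on anything unexpected" rule intact outside pg_wal.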