Re: BUG #14999: pg_rewind corrupts control file global/pg_control

From	Tom Lane
Subject	Re: BUG #14999: pg_rewind corrupts control file global/pg_control
Date	April 4, 2018, 21:50:12
Msg-id	22961.1522867812@sss.pgh.pa.us
In reply to	Re: BUG #14999: pg_rewind corrupts control file global/pg_control (Michael Paquier <michael@paquier.xyz>)
Responses	Re: BUG #14999: pg_rewind corrupts control file global/pg_control
List	pgsql-bugs

Michael Paquier <michael@paquier.xyz> writes:
> So after that I fell back to your patch and began testing it, which is
> where I noticed that we can *never* give the insurance to recover a data
> folder on which an error has happened in the middle of a pg_rewind.  The
> reason for that is quite simple: even if the truncation has been moved
> down to the moment where the first chunk of a file is received, you may
> have already done work on some relation files.  Particularly, some of
> them may have been truncated down to a given size without a new range of
> blocks fetched from the source.  So the data folder would be in an
> inconsistent state if trying to rewind it again.

Yes, we certainly cannot guarantee that failure partway through pg_rewind
leaves a consistent state of the target data directory.  It is likely
worth pointing that out in the documentation.  Whether we can or should
do anything about it is a different question.

When I first started looking at this thread, I wondered if maybe somebody
had had in mind to create an active defense against starting a postmaster
in an inconsistent target cluster, by dint of intentionally truncating
pg_control before the transfer starts and not making it valid again till
the very end.  It's now clear from looking at the code that that's not
what's going on :-(.  But I wonder how hard it would be to make it so,
and whether that'd be worth doing if it's not too hard.

Actually, probably a safer way to attack that would be to remove or
rename the topmost PG_VERSION file, and then put it back afterwards.
That'd be far easier to recover from manually, if need be, than
clobbering pg_control.
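
A minimal sketch of the rename idea, written in Python for brevity rather than in pg_rewind's C. The helper name `version_file_hidden` and the `.rewind-in-progress` suffix are hypothetical and are not part of pg_rewind. The sketch also omits the fsync calls a real implementation would need to make the renames durable.

```python
import os
from contextlib import contextmanager


@contextmanager
def version_file_hidden(data_dir, suffix=".rewind-in-progress"):
    """Hide the top-level PG_VERSION while the body runs.

    The file is renamed back only if the body finishes without raising.
    On failure it stays under the hidden name, so the directory keeps
    looking invalid until someone renames it back by hand.
    """
    version_path = os.path.join(data_dir, "PG_VERSION")
    hidden_path = version_path + suffix

    # Step 1: make the data directory look invalid before touching anything.
    os.rename(version_path, hidden_path)

    # Step 2: run the file transfer (the body of the `with` block).
    # No try/finally here on purpose: an exception must skip step 3.
    yield hidden_path

    # Step 3: reached only on success, so the directory becomes valid again.
    os.rename(hidden_path, version_path)
```

Under these assumptions, the transfer would run inside `with version_file_hidden(target_dir): ...`. If it fails partway through, manual recovery is a single rename of the hidden file back to `PG_VERSION`, which is the property that makes this easier to undo than overwriting pg_control.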

In any case, that seems separate from the question of what to do with
read-only files in the data directory.  Should we push forward with
committing Michael's previous patch, and leave that issue for later?

            regards, tom lane
