Re: Changeset Extraction v7.0 (was logical changeset generation)

From	Andres Freund
Subject	Re: Changeset Extraction v7.0 (was logical changeset generation)
Date	23 January 2014, 12:05:10
Msg-id	20140123120503.GB7182@awork2.anarazel.de
In reply to	Re: Changeset Extraction v7.0 (was logical changeset generation) (Robert Haas <robertmhaas@gmail.com>)
Responses	Re: Changeset Extraction v7.0 (was logical changeset generation)
List	pgsql-hackers


Hi,

On 2014-01-22 13:00:44 -0500, Robert Haas wrote:
> Well, apparently, one is going to PANIC and reinitialize the system.
> I presume that upon reinitialization we'll decide that the slot is
> gone, and thus won't recreate it in shared memory.

Yes, and if it's half-gone we'll continue the deletion. And since
yesterday evening we even fsync things during startup to handle
scenarios similar to the one in 20140122162115.GL21170@alap3.anarazel.de.

> Of course, if the entire system suffers a hard power failure after
> that and before the directory is successfully fsync'd, then the slot
> could reappear on the
> next startup. Which is also exactly what would happen if we removed
> the slot from shared memory after doing the unlink, and then the
> system suffered a hard power failure before the directory contents
> made it to disk.  Except that we also panicked.

Yes, but in that case no relevant data can have been lost, since we
hold the relevant locks while doing this.

> In the case of shared buffers, the way we handle fsync failures is by
> not allowing the system to checkpoint until all of the fsyncs succeed.

I don't think shared buffers fsyncs are the apt comparison. It's more
something like UpdateControlFile(). Which PANICs.

I really don't get why you fight PANICs in general that much. There are
some nasty PANICs in postgres which can happen in legitimate situations,
which should be made to fail more gracefully, but this surely isn't one
of them. We're doing rename(), unlink() and rmdir(). That's it.
We should concentrate on the ones that legitimately can happen, not the
ones created by an admin running a chmod -R 000 . ; rm -rf $PGDATA or
mount -o remount,ro /. We don't increase reliability one bit by adding
code paths that will never get tested.
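
For readers following the thread, the sequence under discussion (rename(),
unlink(), rmdir(), plus directory fsyncs and a startup pass that continues a
half-finished deletion) can be sketched roughly as below. This is only an
illustration, not code from the patch: the function names, the ".tmp" suffix
and the die() helper standing in for PANIC are all assumptions, and the sketch
assumes a flat slot directory containing only regular files.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define PATH_BUF 1024

/* Hypothetical stand-in for PANIC: report the failure and abort. */
static void
die(const char *what, const char *path)
{
    fprintf(stderr, "PANIC: could not %s \"%s\": %s\n", what, path, strerror(errno));
    abort();
}

/* Build "dir/name<suffix>"; paths in this sketch are assumed to fit the buffer. */
static void
join_path(char *buf, size_t size, const char *dir, const char *name, const char *suffix)
{
    int n = snprintf(buf, size, "%s/%s%s", dir, name, suffix);
    assert(n > 0 && (size_t) n < size);
}

/* fsync a directory so that renames and removals inside it reach disk. */
void
fsync_dir(const char *dir)
{
    int fd = open(dir, O_RDONLY);
    if (fd < 0) die("open", dir);
    if (fsync(fd) != 0) die("fsync", dir);
    close(fd);
}

/* Finish removing a directory that has already been renamed to "<name>.tmp". */
void
finish_drop(const char *parent, const char *tmpdir)
{
    DIR *d = opendir(tmpdir);
    struct dirent *de;
    char path[PATH_BUF];

    if (d == NULL) die("open directory", tmpdir);
    while ((de = readdir(d)) != NULL)
    {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        join_path(path, sizeof(path), tmpdir, de->d_name, "");
        if (unlink(path) != 0) die("unlink", path);
    }
    closedir(d);
    if (rmdir(tmpdir) != 0) die("rmdir", tmpdir);
    fsync_dir(parent);
}

/* Drop a slot directory: the rename marks it as "being deleted". */
void
drop_slot_dir(const char *parent, const char *name)
{
    char from[PATH_BUF], to[PATH_BUF];

    join_path(from, sizeof(from), parent, name, "");
    join_path(to, sizeof(to), parent, name, ".tmp");
    if (rename(from, to) != 0) die("rename", from);
    fsync_dir(parent);
    finish_drop(parent, to);
}

/* Startup pass: continue any deletion that was interrupted after the rename. */
int
startup_cleanup(const char *parent)
{
    DIR *d = opendir(parent);
    struct dirent *de;
    int finished = 0;

    if (d == NULL) die("open directory", parent);
    while ((de = readdir(d)) != NULL)
    {
        size_t len = strlen(de->d_name);
        char tmpdir[PATH_BUF];

        if (len <= 4 || strcmp(de->d_name + len - 4, ".tmp") != 0)
            continue;
        join_path(tmpdir, sizeof(tmpdir), parent, de->d_name, "");
        finish_drop(parent, tmpdir);
        finished++;
        rewinddir(d);           /* rescan, since the directory just changed */
    }
    closedir(d);
    return finished;
}
```

In this sketch every failure is fatal, which mirrors the argument above: the
only operations involved are rename(), unlink(), rmdir() and fsync(), and a
restart simply picks up at whatever "*.tmp" directory was left behind.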

> If there's an OS-level reset before that happens, WAL replay will
> perform the same buffer modifications over again and the next
> checkpoint will again try to flush them to disk and will not complete
> unless it does.  That forms a closed system where we never advance the
> redo pointer over the covering WAL record until the changes it covers
> are on the disk.  But I don't think this code has any similar
> interlock; if it does, I missed it.

No, it doesn't (until the first rename() at least), but the number of
failure scenarios is far smaller.

Greetings,

Andres Freund

-- 
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
