Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby

From	Andres Freund
Subject	Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
Date	9 December 2013, 18:27:07
Msg-id	20131209182701.GD9519@awork2.anarazel.de
In reply to	Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby (Serge Negodyuck <petr@petrovich.kiev.ua>)
Responses	Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby (two replies)
List	pgsql-bugs

Hi,

On 2013-12-09 17:49:34 +0200, Serge Negodyuck wrote:
> On master there are files from 0000 to 14078
>
> On slave the files from A1xx to FFFF were absent.
> They were the oldest ones. (October, November)

Some analysis later, I am pretty sure that the origin is a longstanding
problem and not connected to 9.3.[01] vs 9.3.2.

The file 14078 referenced above is exactly the last segment before a
members wraparound:
(gdb) p/x (1L<<32)/(MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT)
$10 = 0x14078
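
For readers without a debugger at hand, the same arithmetic can be
reproduced outside gdb. This is only a sketch: the constant values below
are assumptions (a default 9.3 build with 8 kB blocks, 4-byte
TransactionIds, 32 pages per SLRU segment, and members stored in groups of
four xids plus four flag bytes); they are not stated in this message, and
last_members_segment is a name made up for the example.

```python
# Assumed constants for a default PostgreSQL 9.3 build (not from the message).
BLCKSZ = 8192                  # block size in bytes
SLRU_PAGES_PER_SEGMENT = 32    # pages per SLRU segment file
XID_SIZE = 4                   # sizeof(TransactionId)
FLAG_BYTES_PER_GROUP = 4       # one flag byte per member, four members per group

MEMBERS_PER_GROUP = FLAG_BYTES_PER_GROUP
GROUP_SIZE = XID_SIZE * MEMBERS_PER_GROUP + FLAG_BYTES_PER_GROUP  # 20 bytes
GROUPS_PER_PAGE = BLCKSZ // GROUP_SIZE                            # 409 groups
MULTIXACT_MEMBERS_PER_PAGE = GROUPS_PER_PAGE * MEMBERS_PER_GROUP  # 1636 members


def last_members_segment() -> int:
    """Segment number holding the last member offset before the 32-bit wrap."""
    members_per_segment = MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT
    return (1 << 32) // members_per_segment


if __name__ == "__main__":
    # Same expression as the gdb session, printed in hex like a segment file name.
    print(format(last_members_segment(), "X"))  # prints 14078
```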

So, what happened is that enough multixacts were created that the
members slru wrapped around. It's not unreasonable for the members slru
to wrap around faster than the offsets one - after all we create at
least two entries in members for every offset entry. Also, in 9.3+
more entries fit on an offsets page than on a members page.
When truncating, we first read the offset, to know where we currently
are in members, and then truncate both from their respective
point. Since we've wrapped around in members we very well might remove
content we actually need.
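
To put rough numbers on why members can wrap before offsets, here is a
back-of-the-envelope sketch. It assumes both the multixact id space and the
members offset space are 32 bits wide, and it uses the same assumed page
layout as a default 9.3 build (8 kB blocks, 4-byte offsets, 1636 members
per page); the function name and the average-members parameter are made up
for illustration and do not come from the PostgreSQL source.

```python
# Assumptions, not taken from the message: 32-bit spaces and default page layout.
MULTIXACT_ID_SPACE = 1 << 32       # distinct multixact ids (one offsets entry each)
MEMBER_OFFSET_SPACE = 1 << 32      # distinct member slots
OFFSETS_PER_PAGE = 8192 // 4       # 2048 four-byte offsets per 8 kB page
MEMBERS_PER_PAGE = 1636            # xids plus flag bytes per 8 kB page


def multixacts_until_members_wrap(avg_members: int) -> int:
    """How many multixacts can be created before member slots run out."""
    if avg_members < 2:
        raise ValueError("a multixact has at least two members")
    return MEMBER_OFFSET_SPACE // avg_members


if __name__ == "__main__":
    for avg in (2, 5, 10):
        n = multixacts_until_members_wrap(avg)
        share = n / MULTIXACT_ID_SPACE
        print(f"avg {avg} members: members wrap after {n} multixacts "
              f"({share:.0%} of the multixact id space)")
    # An offsets page also holds more entries than a members page.
    print(OFFSETS_PER_PAGE, ">", MEMBERS_PER_PAGE)
```

Even at the minimum of two members per multixact, the members space is used
up after half of the multixact id space; with larger multixacts it happens
correspondingly sooner.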

I've recently remarked that I find it dangerous that we only do
anti-wraparound stuff for pg_multixact/offsets, not for /members. So,
here we have the proof that that's bad.

This is an issue in <9.3 as well. It might, in some sense, even be worse
there, because we never vacuum old multis away. But on the other hand,
the growth of multis is slower there and we look into old multis less
frequently.

The only reason that you saw the issue on the standby first is that the
truncation code is called more frequently there. Afaics it will happen,
sometime in the future, on the master as well.

I think problems should be preventable if you issue a systemwide VACUUM
FREEZE, but please let others chime in before you execute it.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
