Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
From | Andres Freund |
---|---|
Subject | Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby |
Date | |
Msg-id | 20131209182701.GD9519@awork2.anarazel.de |
In reply to | Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby (Serge Negodyuck <petr@petrovich.kiev.ua>) |
Responses |
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby
Re: BUG #8673: Could not open file "pg_multixact/members/xxxx" on slave during hot_standby |
List | pgsql-bugs |
Hi,

On 2013-12-09 17:49:34 +0200, Serge Negodyuck wrote:
> On master there are files from 0000 to 14078
>
> On slave there were absent files from A1xx to FFFF
> They were the oldest ones. (October, November)

Some analysis later, I am pretty sure that the origin is a longstanding problem and not connected to 9.3.[01] vs 9.3.2.

The above referenced 14078 file is exactly the last page before a members wraparound:

    (gdb) p/x (1L<<32)/(MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT)
    $10 = 0x14078

So, what happened is that enough multixacts were created that the members slru wrapped around. It's not unreasonable for the members slru to wrap around faster than the offsets one - after all we create at least two entries into members for every offset entry. Also, in 9.3+ more xids fit on an offsets page than on a members page.

When truncating, we first read the offset, to know where we currently are in members, and then truncate both from their respective point. Since we've wrapped around in members we very well might remove content we actually need.

I've recently remarked that I find it dangerous that we only do anti-wraparound stuff for pg_multixact/offsets, not for /members. So, here we have the proof that that's bad.

This is an issue in <9.3 as well. It might, in some sense, even be worse there, because we never vacuum old multis away. But on the other hand, the growth of multis is slower there and we look into old multis less frequently.

The only reason that you saw the issue on the standby first is that the truncation code is called more frequently there. Afaics it will happen, sometime in the future, on the master as well.

I think problems should be preventable if you issue a systemwide VACUUM FREEZE, but please let others chime in before you execute it.

Greetings,

Andres Freund

--
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
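
For readers who want to check the arithmetic behind the gdb expression in the message above, here is a minimal sketch in Python. The constant values are assumptions and are not stated in the message: a default 8 kB block size, 32 pages per SLRU segment, and a 9.3-style members page made of groups holding 4 flag bytes plus four 4-byte xids. Under those assumptions the result matches the 0x14078 quoted above. The helper names `last_members_segment` and `multixacts_until_members_wrap` are hypothetical and exist only for this illustration.

```python
# Assumed values (not given in the message): default block size and the
# 9.3-style pg_multixact/members page layout.
BLCKSZ = 8192
SLRU_PAGES_PER_SEGMENT = 32
XID_SIZE = 4                 # bytes per transaction id
MEMBERS_PER_GROUP = 4        # xids stored per member group
FLAG_BYTES_PER_GROUP = 4     # one flag byte per member in the group

GROUP_SIZE = FLAG_BYTES_PER_GROUP + XID_SIZE * MEMBERS_PER_GROUP  # 20 bytes
GROUPS_PER_PAGE = BLCKSZ // GROUP_SIZE                            # 409
MULTIXACT_MEMBERS_PER_PAGE = GROUPS_PER_PAGE * MEMBERS_PER_GROUP  # 1636

# Member offsets are 32-bit, so the members address space holds 2**32 entries.
MEMBER_OFFSET_SPACE = 1 << 32


def last_members_segment() -> int:
    """Segment number that holds the final entries before members wraps."""
    members_per_segment = MULTIXACT_MEMBERS_PER_PAGE * SLRU_PAGES_PER_SEGMENT
    return MEMBER_OFFSET_SPACE // members_per_segment


def multixacts_until_members_wrap(members_per_multixact: int) -> int:
    """Multixacts that can be created before the members space wraps,
    if every multixact has the given number of members."""
    return MEMBER_OFFSET_SPACE // members_per_multixact


if __name__ == "__main__":
    print(format(last_members_segment(), "X"))   # 14078
    # With the minimum of two members per multixact, members wraps after
    # 2**31 multixacts, i.e. before the 32-bit multixact id space is used up.
    print(multixacts_until_members_wrap(2) == 1 << 31)   # True
```

The second helper only restates the point made in the message: since each multixact contributes at least two member entries, the members space is used up at least twice as fast as the multixact id space that pg_multixact/offsets is indexed by.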