Re: BUG #11264: Auto vacuum wraparound job blocking everything - trying again
От | David Gould |
---|---|
Тема | Re: BUG #11264: Auto vacuum wraparound job blocking everything - trying again |
Дата | |
Msg-id | 20141001140945.71681b0f@jekyl.lan обсуждение исходный текст |
Ответ на | Re: BUG #11264: Auto vacuum wraparound job blocking everything (David Gould <daveg@sonic.net>) |
Список | pgsql-bugs |
I sent the following a couple days ago but it seems to have been overlooked. I think this issue may impact a lot of 9.3 sites that have upgraded from earlier versions. Please take a look at this. -dg On Tue, 30 Sep 2014 00:44:11 +0000 (UTC) David Gould <daveg@sonic.net> wrote: > Alvaro Herrera <alvherre <at> 2ndquadrant.com> writes: > > > Well, yes, 9.3.4 had a bug fixed by this commit: > > > > Author: Bruce Momjian <bruce <at> momjian.us> > > Branch: master [a61daa14d] 2014-07-02 15:29:38 -0400 > > Branch: REL9_4_STABLE [b446a384b] 2014-07-02 15:29:38 -0400 > > Branch: REL9_3_STABLE Release: REL9_3_5 [3d2e18510] 2014-07-02 15:29:38 > -0400 > > > > pg_upgrade: preserve database and relation minmxid values > > > > Also set these values for pre-9.3 old clusters that don't have values > to > > preserve. > > > > Analysis by Alvaro > > > > Backpatch through 9.3 > > > > > How do we fix the current issue with this one server? Is there an easy > fix? > > > Thanks. > > > > As far as I am aware, you should > > UPDATE pg_database SET datminmxid=20783 > > > > and that should fix it. The oldestMulti value in pg_control would get > > updated by itself some time later. If you experience stalls before > > oldestMulti fixes itself, you could stop the server (cleanly!) and then > > pg_resetxlog -m x,y where x is the correct nextMulti value from > > pg_controldata and y is 20783. > > We ran into this on two instances last night and I'm starting to worry that > many others may hit it as well. As far as I can tell any postgres instance > that was upgraded to 9.3 by a pre-9.3.5 pg_upgrade is at risk as they all > will have: > > Latest checkpoint's oldestMultiXid: 1 > > The workaround above is good if you still have the old cluster around from > before the upgrade. We did not, that was all cleaned up months ago. Which > raises the question: how do you fix a 9.3 instance that has oldestMultiXid = > 1 without access to the pre-upgrade instance. That is, where do you get the > correct value of oldestMultiXid to set into pg_database.datxminmxid? > > I took a guess that the oldest pg_class.relminmxid for all the tables in all > the databases would be ok and updated pg_database.datminmxid witt that. That > is, in each database I ran: > > select relname, relminmxid, age(relminmxid) > from pg_class > where relkind = 'r' > and age(relminmxid) < 2147483647 > order by 3 desc limit 1 ; > > And then used the oldest one to update pg_database.datminmxid. After a > checkpoint and shutdown/restart I see that it was written to pg_controldata > too. Afterwards I was able to run a vacuum freeze on the problem table. > > Questions: > > 1) Is the procedure above safe and effective for this, or did I just hose my > databases? > > 2) If the procedure above is not safe, what do we do instead? > > 3) Is this likely to affect all 9.3.x<5 upgraded databases? If so, how do we > fix it before too much damage happens in the wider world? \ -- David Gould daveg <at> sonic.net If simplicity worked, the world would be overrun with insects.
В списке pgsql-bugs по дате отправления: