Thread: ERROR: cannot GetMultiXactIdMembers() during recovery

ERROR: cannot GetMultiXactIdMembers() during recovery

From

Marko Tiikkaja

Date:

23 February 2015, 17:00:45

Hi,

Andres asked me on IRC to report this here.  Since we upgraded our
standby servers to 9.1.15 (though the master is still running 9.1.14),
we've seen the error in $SUBJECT a number of times.  I managed to
reproduce it today by running the same query over and over again, and
attached is the back trace.

Let me know if you need any additional information.


.m

Attachments

bt.txt

Re: ERROR: cannot GetMultiXactIdMembers() during recovery

From

Andres Freund

Date:

23 February 2015, 17:13:10

Hi,

On 2015-02-23 15:00:35 +0100, Marko Tiikkaja wrote:
> Andres asked me on IRC to report this here.  Since we upgraded our standby
> servers to 9.1.15 (though the master is still running 9.1.14), we've seen
> the error in $SUBJECT a number of times.

FWIW, I think this is just as borked in 9.1.14 and will likely affect
all of 9.0 - 9.2. The problem is that in those releases multixacts
aren't maintained on the standby in a way that allows access.

index_getnext() itself is actually pretty easy to fix, it already checks
whether the scan started while in recovery when using the result of the
error triggering HeapTupleSatisfiesVacuum(), just too late.  I don't
remember other HTSV callers that can run in recovery, given that DDL is
obviously impossible and we don't support serializable while in
recovery.
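
A minimal sketch of the reordering described above, assuming a simplified scan state: the recovery flag is tested before the visibility check runs, rather than only when its result is used. `ScanSketch`, `tuple_dead_to_all_stub()` and `maybe_kill_prior_tuple()` are hypothetical names for illustration and are not the actual index_getnext() code or the committed patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified scan state; not the real IndexScanDesc. */
typedef struct ScanSketch
{
    bool started_in_recovery;   /* did the scan start on a standby? */
    bool kill_prior_tuple;      /* hint: index entry may be marked dead */
} ScanSketch;

/* Counts how often the stubbed visibility check was reached. */
static int htsv_calls = 0;

/*
 * Stub standing in for a HeapTupleSatisfiesVacuum()-style check.
 * The assert models the error from this thread: the check must not
 * run for a scan that started in recovery.
 */
static bool
tuple_dead_to_all_stub(const ScanSketch *scan)
{
    assert(!scan->started_in_recovery);
    htsv_calls++;
    return true;                /* pretend the tuple is dead to everyone */
}

/*
 * Test the recovery flag first, so the visibility check is skipped
 * entirely when its result could not be used anyway.
 */
static void
maybe_kill_prior_tuple(ScanSketch *scan)
{
    scan->kill_prior_tuple = false;
    if (scan->started_in_recovery)
        return;
    if (tuple_dead_to_all_stub(scan))
        scan->kill_prior_tuple = true;
}
```

With this ordering a scan that started in recovery never reaches the stubbed check, so the failing code path is not exercised at all.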

Alternatively we could make MultiXactIdIsRunning() return false < 9.3
when in recovery. I think that'd end up fixing things, but it seems
awfully fragile to me.

I do see a HTSU in pgrowlocks.c - that's not really safe during recovery
< 9.3, given it accesses multixacts. I guess it needs to throw an error.

I wonder if we shouldn't put an Assert() in HTSV/HTSU to prevent such
problems.

Greetings,

Andres Freund
-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: ERROR: cannot GetMultiXactIdMembers() during recovery

From

Marko Tiikkaja

Date:

15 May 2015, 16:17:31

Hi hackers,

Any chance to get this fixed in time for 9.1.16?


.m

Re: ERROR: cannot GetMultiXactIdMembers() during recovery

From

Alvaro Herrera

Date:

15 May 2015, 21:03:07

Andres Freund wrote:

> Alternatively we could make MultiXactIdIsRunning() return false < 9.3
> when in recovery. I think that'd end up fixing things, but it seems
> awfully fragile to me.

Hm, why fragile?  It seems a pretty decent answer -- pre-9.3, it's not
possible for a tuple to be "locked" in recovery, is it?  I mean, in the
standby you can't lock it nor update it; the only thing you can do is
read (select), and that is not affected by whether there is a multixact
in it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
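
The guard discussed in this message can be sketched as follows, assuming a simplified model of the server state. `in_recovery`, `any_member_running_stub()` and `multixact_is_running_sketch()` are hypothetical stand-ins for illustration; this is not the patch attached later in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t MultiXactId;

/* Hypothetical stand-in for the server's "are we in recovery?" state. */
static bool in_recovery = false;

/* Counts how often the stubbed member lookup was reached. */
static int member_lookups = 0;

/*
 * Stub for the member lookup. The assert models the
 * "cannot GetMultiXactIdMembers() during recovery" error: the lookup
 * must never be reached on a standby. For the demo, odd multixact ids
 * are treated as having a running member.
 */
static bool
any_member_running_stub(MultiXactId multi)
{
    assert(!in_recovery);
    member_lookups++;
    return (multi % 2) == 1;
}

/*
 * Sketch of the guard: on a pre-9.3 standby a tuple cannot be locked
 * or updated, so report "not running" without looking at the members.
 */
static bool
multixact_is_running_sketch(MultiXactId multi)
{
    if (in_recovery)
        return false;
    return any_member_running_stub(multi);
}
```

The early return relies on the argument made above: a standby only reads, so the answer cannot be true there and no member lookup is needed.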

Re: ERROR: cannot GetMultiXactIdMembers() during recovery

From

Simon Riggs

Date:

15 May 2015, 21:19:41

On 15 May 2015 at 19:03, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

> Andres Freund wrote:
>
> > Alternatively we could make MultiXactIdIsRunning() return false < 9.3
> > when in recovery. I think that'd end up fixing things, but it seems
> > awfully fragile to me.
>
> Hm, why fragile? It seems a pretty decent answer -- pre-9.3, it's not
> possible for a tuple to be "locked" in recovery, is it? I mean, in the
> standby you can't lock it nor update it; the only thing you can do is
> read (select), and that is not affected by whether there is a multixact
> in it.

It can't return true and won't ever change for <9.3 so I don't see what the objection is.

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: ERROR: cannot GetMultiXactIdMembers() during recovery

From

Alvaro Herrera

Date:

18 May 2015, 19:47:29

Marko Tiikkaja wrote:
> Hi hackers,
>
> Any chance to get this fixed in time for 9.1.16?

I hope you had pinged some days earlier.  Here's a patch, but I will
wait until this week's releases have been tagged before pushing.

I checked 9.2, and it doesn't look like it's subject to the same
problem: instead of HeapTupleSatisfiesVacuum, it uses
HeapTupleIsSurelyDead in the equivalent place.  Still, I think it's
saner to apply the same fix because as Andres notes the problem might
still be present in pgrowlocks and who knows what else.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments

multixact-in-recovery.patch

Re: ERROR: cannot GetMultiXactIdMembers() during recovery

From

Tom Lane

Date:

18 May 2015, 19:59:59

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Marko Tiikkaja wrote:
>> Any chance to get this fixed in time for 9.1.16?

> I hope you had pinged some days earlier.  Here's a patch, but I will
> wait until this week's releases have been tagged before pushing.

Is this a recent regression, or has it been busted all along in those
branches?

If the former, maybe we should take the risk of fixing it today
(the patch certainly looks safe enough).  But if it's been this
way a long time and nobody noticed till now, I'd agree with waiting.
        regards, tom lane

Re: ERROR: cannot GetMultiXactIdMembers() during recovery

From

Andres Freund

Date:

18 May 2015, 20:02:46

On 2015-05-18 12:59:47 -0400, Tom Lane wrote:
> If the former, maybe we should take the risk of fixing it today
> (the patch certainly looks safe enough).  But if it's been this
> way a long time and nobody noticed till now, I'd agree with waiting.

It's an old regression, and nobody noticed it until Marko a couple months
back.

Greetings,

Andres Freund

Re: ERROR: cannot GetMultiXactIdMembers() during recovery

From

Simon Riggs

Date:

18 May 2015, 20:12:20

On 18 May 2015 at 12:59, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > Marko Tiikkaja wrote:
> >> Any chance to get this fixed in time for 9.1.16?
>
> > I hope you had pinged some days earlier. Here's a patch, but I will
> > wait until this week's releases have been tagged before pushing.
>
> Is this a recent regression, or has it been busted all along in those
> branches?
>
> If the former, maybe we should take the risk of fixing it today
> (the patch certainly looks safe enough). But if it's been this
> way a long time and nobody noticed till now, I'd agree with waiting.

That's a very low risk fix. It's more like a should-have-been-a-basic-check.

Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: ERROR: cannot GetMultiXactIdMembers() during recovery

From

Alvaro Herrera

Date:

18 May 2015, 20:13:19

Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > Marko Tiikkaja wrote:
> >> Any chance to get this fixed in time for 9.1.16?
> 
> > I hope you had pinged some days earlier.  Here's a patch, but I will
> > wait until this week's releases have been tagged before pushing.
> 
> Is this a recent regression, or has it been busted all along in those
> branches?
> 
> If the former, maybe we should take the risk of fixing it today
> (the patch certainly looks safe enough).  But if it's been this
> way a long time and nobody noticed till now, I'd agree with waiting.

Hmm, AFAICS the problematic check was introduced by this commit:

commit 9f1e051adefb2f29e757cf426b03db20d3f8a26d
Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
Date:   Fri Nov 29 11:26:41 2013 -0300

so it isn't hot off the oven, but it is a regression.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: ERROR: cannot GetMultiXactIdMembers() during recovery

From

Andres Freund

Date:

18 May 2015, 20:21:54

On 2015-05-18 14:13:51 -0300, Alvaro Herrera wrote:
> Hmm, AFAICS the problematic check was introduced by this commit:
> 
> commit 9f1e051adefb2f29e757cf426b03db20d3f8a26d
> Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
> Date:   Fri Nov 29 11:26:41 2013 -0300
> 
> so it isn't hot off the oven, but it is a regression.

Hasn't that just changed the symptoms? I don't recall exactly, but my
recollection is that the multixact code isn't ready at that point and
hasn't initialized a bunch of important variables yet. Leading to errors
in the SLRU etc.

Greetings,

Andres Freund

Re: ERROR: cannot GetMultiXactIdMembers() during recovery

From

Alvaro Herrera

Date:

18 May 2015, 20:35:54

Andres Freund wrote:
> On 2015-05-18 14:13:51 -0300, Alvaro Herrera wrote:
> > Hmm, AFAICS the problematic check was introduced by this commit:
> > 
> > commit 9f1e051adefb2f29e757cf426b03db20d3f8a26d
> > Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
> > Date:   Fri Nov 29 11:26:41 2013 -0300
> > 
> > so it isn't hot off the oven, but it is a regression.
> 
> Hasn't that just changed the symptoms? I don't recall exactly, but my
> recollection is that the multixact code isn't ready at that point and
> hasn't initialized a bunch of important variables yet. Leading to errors
> in the SLRU etc.

Not sure about that.  The page limits etc aren't set yet so you can't
create new multis, nor truncate appropriately, but just reading one
should have worked.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: ERROR: cannot GetMultiXactIdMembers() during recovery

From

Alvaro Herrera

Date:

18 May 2015, 23:45:15

Simon Riggs wrote:
> On 15 May 2015 at 19:03, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> 
> > Andres Freund wrote:
> >
> > > Alternatively we could make MultiXactIdIsRunning() return false < 9.3
> > > when in recovery. I think that'd end up fixing things, but it seems
> > > awfully fragile to me.
> >
> > Hm, why fragile?  It seems a pretty decent answer -- pre-9.3, it's not
> > possible for a tuple to be "locked" in recovery, is it?  I mean, in the
> > standby you can't lock it nor update it; the only thing you can do is
> > read (select), and that is not affected by whether there is a multixact
> > in it.
> 
> It can't return true and won't ever change for <9.3 so I don't see what the
> objection is.

Pushed.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: ERROR: cannot GetMultiXactIdMembers() during recovery

From

Tom Lane

Date:

21 May 2015, 00:48:06

Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Marko Tiikkaja wrote:
>> Any chance to get this fixed in time for 9.1.16?

> I hope you had pinged some days earlier.  Here's a patch, but I will
> wait until this week's releases have been tagged before pushing.

BTW, I meant to update this thread but forgot until now: these changes
did wind up included in the final tarballs for 9.2 and before, on account
of the re-wrap the next day.  In the rush to re-do the wrap, I forgot
that I should've added entries to the release notes for these commits :-(
So the documentation doesn't mention the fix, but it's there.
        regards, tom lane
