Обсуждение: how is pitr replay interruption time determined?

Поиск

Список

Период

Сортировка

how is pitr replay interruption time determined?

От

Robert Treat

Дата:

27 августа 2007 г., 21:43:31

Today I was running some tests on one of our warm standbys. The way this test
works is I took a zfs snapshot of the filesystem, induce failover, of the
pitr host, then log in and play around with the data. Once we're done with
this, I rolled back the zfs snapshot, restart the standby instance, and let
it start doing wal_replay again.  All of this seems to work fine, but when I
start the replay instance again I get the following line in my logfile:

LOG:  database system was interrupted while in recovery at log time 2007-07-30
19:17:37 EDT

I am curious how this date is determined by postgres/pitr?  The standby had
processed wal_logs right through today before having taken the zfs snapshot,
so I would think it would have been interrupted at a log time of sometime
today.  Also the initial replay setup occured sometime was long before 07-30,
so that doesn't correspond to that date either, so I'm wondering what it is
that determines the point in time that replay thinks it is interuppted at.

FWIW, this does have practical implecations, like how many wal files to keep
around, in my case I had all the files going back to that date, so I only had
to wait for replay to catch up.... 5 hours later :-).    But it would be good
to be able to predetermine that kind of thing.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

Re: how is pitr replay interruption time determined?

От

Tom Lane

Дата:

27 августа 2007 г., 23:17:09

Robert Treat <xzilla@users.sourceforge.net> writes:
> LOG:  database system was interrupted while in recovery at log time 2007-07-30
> 19:17:37 EDT

> I am curious how this date is determined by postgres/pitr?

It's the update timestamp in pg_control, which in essence is going to be
the completion time of the last checkpoint.  If the crash was during WAL
recovery, as it seems to be, it'd be the time the last "restart"
checkpoint was made.

I'm not sure why the phrase "at log time" is used.  It looks like
whoever wrote the message thought that the timestamp would be one taken
from the WAL data being replayed, but that's not what I see the code
doing.

            regards, tom lane

Re: how is pitr replay interruption time determined?

От

Robert Treat

Дата:

28 августа 2007 г., 21:40:05

On Monday 27 August 2007 19:17, Tom Lane wrote:
> Robert Treat <xzilla@users.sourceforge.net> writes:
> > LOG:  database system was interrupted while in recovery at log time
> > 2007-07-30 19:17:37 EDT
> >
> > I am curious how this date is determined by postgres/pitr?
>
> It's the update timestamp in pg_control, which in essence is going to be
> the completion time of the last checkpoint.  If the crash was during WAL
> recovery, as it seems to be, it'd be the time the last "restart"
> checkpoint was made.
>

Is there some way to force checkpoints on a db doing wal replay?

Also, I would think at a minimum that the last "restart" checkpoint would be
akin to when the db was last started, but looking at my replay instance I
see:

pg_control last modified:             Mon Aug 27 12:12:55 2007
Time of latest checkpoint:            Mon Jul 30 19:17:37 2007

The first date is around the time I restarted the instance, the second is
still a mystery to me. Perhaps I am misinterpreting your respone?

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

Re: how is pitr replay interruption time determined?

От

Tom Lane

Дата:

28 августа 2007 г., 21:59:49

Robert Treat <xzilla@users.sourceforge.net> writes:
> Is there some way to force checkpoints on a db doing wal replay?

No, it's hardwired to do it when it sees a checkpoint record in the WAL stream.

> pg_control last modified:             Mon Aug 27 12:12:55 2007
> Time of latest checkpoint:            Mon Jul 30 19:17:37 2007

After looking again at the code, the "last modified" time is the time
that a recovery checkpoint was last done, and the "latest checkpoint"
is the timestamp of the WAL-stream checkpoint record that triggered it.
In a situation where you're catching up on historical WAL they could be
far apart, but when a slave is just following the master there shouldn't
be a huge difference --- not more than the maximum time to fill a WAL
record and ship it over to the slave, for sure.

(BTW, I misread it before --- it looks like the "at log time" value
printed at startup *is* taken from the checkpoint record that it's
trying to roll forward from.)

Assuming that you're absorbing data from the master at a steady rate,
the only reason I can see for the timestamps to be so old is if the
"rm_safe_restartpoint" checks are always failing.  I seem to remember
that we found and fixed a bug that could cause something like that,
but I can't find any trace of it in the CVS logs.  Simon, do you
recall such a problem post-8.2.0?

            regards, tom lane

Re: how is pitr replay interruption time determined?

От

Tom Lane

Дата:

28 августа 2007 г., 22:50:23

Simon Riggs <simon@2ndquadrant.com> writes:
> Yeh, we traced a problem with GIN indexes to this cause in early June;
> Teodor fixed it quickly in REL8_2_STABLE, but that won't be available
> until 8.2.5.

Thanks, that's what I missed finding ... Robert, do you have any GIN
indexes in this DB?

> I'd be happier with a log message to say
>   ereport(DEBUG2,
>     (errmsg("RM %d not safe to record restart point at %X/%X",
>             rmid,
>             checkPoint->redo.xlogid,
>             checkPoint->redo.xrecoff)));
> to help trace such things in future.

As long as it's DEBUG2 or so, no objection here.

            regards, tom lane

Re: how is pitr replay interruption time determined?

От

Simon Riggs

Дата:

28 августа 2007 г., 23:05:59

On Tue, 2007-08-28 at 17:59 -0400, Tom Lane wrote:
> Robert Treat <xzilla@users.sourceforge.net> writes:
> > Is there some way to force checkpoints on a db doing wal replay?
>
> No, it's hardwired to do it when it sees a checkpoint record in the WAL stream.
>
> > pg_control last modified:             Mon Aug 27 12:12:55 2007
> > Time of latest checkpoint:            Mon Jul 30 19:17:37 2007
>
> After looking again at the code, the "last modified" time is the time
> that a recovery checkpoint was last done, and the "latest checkpoint"
> is the timestamp of the WAL-stream checkpoint record that triggered it.
> In a situation where you're catching up on historical WAL they could be
> far apart, but when a slave is just following the master there shouldn't
> be a huge difference --- not more than the maximum time to fill a WAL
> record and ship it over to the slave, for sure.
>
> (BTW, I misread it before --- it looks like the "at log time" value
> printed at startup *is* taken from the checkpoint record that it's
> trying to roll forward from.)

That's correct. Sorry for not replying earlier; just back from hols.

Jumping back to original thought: Robert, you should be using the last
checkpoint location, not the last time to decide which xlogs to remove.

> Assuming that you're absorbing data from the master at a steady rate,
> the only reason I can see for the timestamps to be so old is if the
> "rm_safe_restartpoint" checks are always failing.  I seem to remember
> that we found and fixed a bug that could cause something like that,
> but I can't find any trace of it in the CVS logs.  Simon, do you
> recall such a problem post-8.2.0?

Yeh, we traced a problem with GIN indexes to this cause in early June;
Teodor fixed it quickly in REL8_2_STABLE, but that won't be available
until 8.2.5.

I'd be happier with a log message to say

  ereport(DEBUG2,
    (errmsg("RM %d not safe to record restart point at %X/%X",
            rmid,
            checkPoint->redo.xlogid,
            checkPoint->redo.xrecoff)));

to help trace such things in future.

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

Re: how is pitr replay interruption time determined?

От

Tom Lane

Дата:

28 августа 2007 г., 23:20:58

Simon Riggs <simon@2ndquadrant.com> writes:
>> As long as it's DEBUG2 or so, no objection here.

> OK, I'll do up a patch.

No need, done already.

            regards, tom lane

Re: how is pitr replay interruption time determined?

От

Simon Riggs

Дата:

28 августа 2007 г., 23:31:28

On Tue, 2007-08-28 at 18:49 -0400, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > Yeh, we traced a problem with GIN indexes to this cause in early June;
> > Teodor fixed it quickly in REL8_2_STABLE, but that won't be available
> > until 8.2.5.
>
> Thanks, that's what I missed finding ... Robert, do you have any GIN
> indexes in this DB?
>
> > I'd be happier with a log message to say
> >   ereport(DEBUG2,
> >     (errmsg("RM %d not safe to record restart point at %X/%X",
> >             rmid,
> >             checkPoint->redo.xlogid,
> >             checkPoint->redo.xrecoff)));
> > to help trace such things in future.
>
> As long as it's DEBUG2 or so, no objection here.

OK, I'll do up a patch.

References:
- GIN error trace patch on pgsql-general
http://archives.postgresql.org/pgsql-general/2007-06/msg00072.php
- Committed fixes
http://archives.postgresql.org/pgsql-committers/2007-06/msg00061.php

--
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com

Re: how is pitr replay interruption time determined?

От

Robert Treat

Дата:

29 августа 2007 г., 14:46:42

On Tuesday 28 August 2007 18:49, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > Yeh, we traced a problem with GIN indexes to this cause in early June;
> > Teodor fixed it quickly in REL8_2_STABLE, but that won't be available
> > until 8.2.5.
>
> Thanks, that's what I missed finding ... Robert, do you have any GIN
> indexes in this DB?
>

Ah, I didn't even think about this, but yes, one of the dbs in the cluster (a
small, low traffic, and generally forgotten about one)  does have a couple of
GIN indexes in it.

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

Re: how is pitr replay interruption time determined?

От

Robert Treat

Дата:

29 августа 2007 г., 14:51:39

On Tuesday 28 August 2007 18:44, Simon Riggs wrote:
> On Tue, 2007-08-28 at 17:59 -0400, Tom Lane wrote:
> > Robert Treat <xzilla@users.sourceforge.net> writes:
> > > Is there some way to force checkpoints on a db doing wal replay?
> >
> > No, it's hardwired to do it when it sees a checkpoint record in the WAL
> > stream.
> >
> > > pg_control last modified:             Mon Aug 27 12:12:55 2007
> > > Time of latest checkpoint:            Mon Jul 30 19:17:37 2007
> >
> > After looking again at the code, the "last modified" time is the time
> > that a recovery checkpoint was last done, and the "latest checkpoint"
> > is the timestamp of the WAL-stream checkpoint record that triggered it.
> > In a situation where you're catching up on historical WAL they could be
> > far apart, but when a slave is just following the master there shouldn't
> > be a huge difference --- not more than the maximum time to fill a WAL
> > record and ship it over to the slave, for sure.
> >
> > (BTW, I misread it before --- it looks like the "at log time" value
> > printed at startup *is* taken from the checkpoint record that it's
> > trying to roll forward from.)
>
> That's correct. Sorry for not replying earlier; just back from hols.
>
> Jumping back to original thought: Robert, you should be using the last
> checkpoint location, not the last time to decide which xlogs to remove.
>

Looking at last checkpoint location in pg_control, I see:
Latest checkpoint location:           1C/8001E848

How does one translate that into an xlog file name?

--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

Re: how is pitr replay interruption time determined?

От

"Kevin Grittner"

Дата:

29 августа 2007 г., 16:02:21

>>> On Wed, Aug 29, 2007 at  9:48 AM, in message
<200708291048.54662.xzilla@users.sourceforge.net>, Robert Treat
<xzilla@users.sourceforge.net> wrote:
> Looking at last checkpoint location in pg_control, I see:
> Latest checkpoint location:           1C/8001E848
>
> How does one translate that into an xlog file name?

Unless you've gotten fancy in a PITR recovery, it should be:

000000010000001C00000080

01E848 is the offset into the file.

-Kevin

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: how is pitr replay interruption time determined?