Re: [BUG] pg_basebackup from disconnected standby fails

Поиск
Список
Период
Сортировка
От Kyotaro HORIGUCHI
Тема Re: [BUG] pg_basebackup from disconnected standby fails
Дата
Msg-id 20160615.121812.197115872.horiguchi.kyotaro@lab.ntt.co.jp
обсуждение исходный текст
Ответ на Re: [BUG] pg_basebackup from disconnected standby fails  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Список pgsql-hackers
Sorry, I'm confused about the minRecoveryPoint.

Reconsidered a bit.

At Tue, 14 Jun 2016 20:31:11 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in
<20160614.203111.229211034.horiguchi.kyotaro@lab.ntt.co.jp>
> > > After looking more closely, I found that the minRecoveryPoint
> > > tends to be too small as the backup end point, and up to the
> > > record at the lastReplayedRecPtr can affect the pages on disk and
> > > they can go into the backup just taken.
> > >
> > > My conclusion here is that do_pg_stop_backup should return
> > > lastReplayedRecPtr, not minRecoveryPoint.
> > 
> > I have been thinking quite a bit about this patch, and this logic
> > sounds quite right to me. When stopping the backup we need to let the
> > user know up to which point it needs to replay WAL, and relation pages
> > are touched up to lastReplayedEndRecPtr.
> 
> Yes, but by design, the changes in buffers don't go into disk
> until buffer replacing occurs, which updates minRecoveryPoint. My
> understanding is that the problem is that a restart point that is
> not accompanied with buffer updates advances only the redo point
> of the last checkpoint and doesn't update minRecoveryPoint, which
> may be behind the redo point at the time.
> 
> It seems to me that we could more agressively advance the
> minRecoveryPoint (but must not let it go too far..), but it is
> right for it to aim a bit smaller than the ideal location.

It's wrong. minRecoveryPoint should be greater than or equal to
the maximum buffer-touching LSN reached in previous recoveries,
and it can be the same to replayEndRecPtr but may be behind it if
no acutual modification on page files is done hereafter. xlog.c
works that way. The value of the minRecoveryPoint smaller than
the redo point of the last checkpoint with no buffer flush is
allowable from this point of view but it is not proper for the
end point of a backup.

If we skip recording the last checkpoint position when it
eventually causes no buffer flush, minRecoveryPoint is again
usable for the purpose. However, it causes repeated restartpoint
trial on the same (skipped) checkpoint record.

As the consequence, we can solve this problemn also by explicitly
updating the minRecoveryPoint for an executed restartpoint
without no buffer flush.

The attached patch performs this way and also solves the problem.

Which one do you think is more preferable? Or any other solution?

This patch updates minRecoeryPoint only for restartpoints that
caused no buffer flush but such restriction might not be
necessary.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tom Lane
Дата:
Сообщение: Re: increase message string buffer size of watch command of psql
Следующее
От: Amit Kapila
Дата:
Сообщение: Re: parallel.c is not marked as test covered