Discussion: Checkpoints vs restartpoints
Hi

Why do standby servers not simply treat every checkpoint as a restartpoint? As I understand it, setting checkpoint_timeout and checkpoint_segments higher on a standby server effectively instructs standby servers to skip some checkpoints. Even with the same settings on both servers, the server could still choose to skip a checkpoint near the checkpoint_timeout limit due to the vagaries of time keeping (though I suppose it's very unlikely). But what could the advantage of skipping checkpoints be? Do people deliberately set hot standby machines up like this to trade a longer crash recovery time for lower write IO?

I was wondering about this in the context of the recent multixact work, since such configurations could leave you with different SLRU files on disk, which in some versions might change the behaviour in interesting ways.

--
Thomas Munro
http://www.enterprisedb.com
On Tue, Jun 9, 2015 at 4:20 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:
> Hi
>
> Why do standby servers not simply treat every checkpoint as a
> restartpoint? As I understand it, setting checkpoint_timeout and
> checkpoint_segments higher on a standby server effectively instructs
> standby servers to skip some checkpoints. Even with the same settings
> on both servers, the server could still choose to skip a checkpoint
> near the checkpoint_timeout limit due to the vagaries of time keeping
> (though I suppose it's very unlikely). But what could the advantage
> of skipping checkpoints be? Do people deliberately set hot standby
> machines up like this to trade a longer crash recovery time for lower
> write IO?
When a hot standby server is initially being set up using a rather old base backup and an archive directory, it could be applying WAL at a very high rate such that it would replay master checkpoints multiple times a second (when the master has long periods with little write activity and has checkpoints driven by timeouts during those periods). Actually doing restartpoints that often could be annoying. Presumably there would be few dirty buffers to write out, since each checkpoint saw little activity, but you would still have to circle the shared_buffers twice, and fsync whichever files did happen to get some changes.
Cheers,
Jeff
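
The catch-up scenario above can be sketched with a deliberately simplified model. It assumes a restartpoint is taken only when a checkpoint record is replayed and at least checkpoint_timeout seconds of standby time have passed since the previous restartpoint. The function name and the timings are hypothetical, and the real checkpointer logic (including segment-based triggers) is more involved.

```python
def restartpoints_taken(replay_times, checkpoint_timeout):
    """Return the replay times at which this simplified model takes a restartpoint.

    replay_times: standby wall-clock times (seconds) at which master
    checkpoint records are replayed, in ascending order.
    """
    taken = []
    last_restartpoint = 0.0  # assumption: the standby started replaying at t = 0
    for t in replay_times:
        # Skip this checkpoint record unless the timeout has elapsed.
        if t - last_restartpoint >= checkpoint_timeout:
            taken.append(t)
            last_restartpoint = t
    return taken


# Catch-up phase: 50 master checkpoint records replayed within 5 seconds,
# followed by two records replayed at a steady-state pace (made-up numbers).
catch_up = [0.1 * i for i in range(1, 51)]
steady = [305.0, 610.0]
print(restartpoints_taken(catch_up + steady, checkpoint_timeout=300))
```

In this model none of the 50 checkpoint records replayed during catch-up leads to a restartpoint, so the standby avoids scanning shared_buffers and fsyncing files many times a second.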
On Tue, Jun 9, 2015 at 05:20:23PM -0700, Jeff Janes wrote:
> On Tue, Jun 9, 2015 at 4:20 PM, Thomas Munro <thomas.munro@enterprisedb.com>
> wrote:
>
> Hi
>
> Why do standby servers not simply treat every checkpoint as a
> restartpoint? As I understand it, setting checkpoint_timeout and
> checkpoint_segments higher on a standby server effectively instructs
> standby servers to skip some checkpoints. Even with the same settings
> on both servers, the server could still choose to skip a checkpoint
> near the checkpoint_timeout limit due to the vagaries of time keeping
> (though I suppose it's very unlikely). But what could the advantage
> of skipping checkpoints be? Do people deliberately set hot standby
> machines up like this to trade a longer crash recovery time for lower
> write IO?
>
>
> When a hot standby server is initially being set up using a rather old base
> backup and an archive directory, it could be applying WAL at a very high rate
> such that it would replay master checkpoints multiple times a second (when the
> master has long periods with little write activity and has checkpoints driven
> by timeouts during those periods). Actually doing restartpoints that often
> could be annoying. Presumably there would be few dirty buffers to write out,
> since each checkpoint saw little activity, but you would still have to circle
> the shared_buffers twice, and fsync whichever files did happen to get some
> changes.

Ah, so even though standbys don't have to write WAL, they are fsyncing shared buffers. Where is the restart point recorded, in pg_controldata?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ Everyone has their own god. +
On Wed, Jun 10, 2015 at 9:33 AM, Bruce Momjian wrote:
> Ah, so even though standbys don't have to write WAL, they are fsyncing
> shared buffers. Where is the restart point recorded, in pg_controldata?

Yep. Latest checkpoint's REDO location, or ControlFile->checkPointCopy.redo. During recovery, a copy is kept as well in XLogCtlData.lastCheckPoint.

--
Michael
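
As a small illustration of where that value can be read, the sketch below extracts the "Latest checkpoint's REDO location" field from pg_controldata output that has already been captured as text. The helper name and the sample LSN values are made up for the example; only the field labels are taken from pg_controldata's output.

```python
import re


def redo_location(controldata_text):
    """Return the 'Latest checkpoint's REDO location' value, or None if absent."""
    match = re.search(
        r"^Latest checkpoint's REDO location:\s*(\S+)\s*$",
        controldata_text,
        re.MULTILINE,
    )
    return match.group(1) if match else None


# Abbreviated sample output; the LSN values are invented for illustration.
sample = """\
Database cluster state:               in archive recovery
Latest checkpoint location:           0/3000060
Latest checkpoint's REDO location:    0/3000028
"""

print(redo_location(sample))
```

On a standby, this field advances each time a restartpoint completes, not each time a checkpoint record is replayed.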
On 2015-06-10 11:20:19 +1200, Thomas Munro wrote:
> I was wondering about this in the context of the recent multixact
> work, since such configurations could leave you with different SLRU
> files on disk which in some versions might change the behaviour in
> interesting ways.

Note that triggering a restartpoint every time a checkpoint is replayed wouldn't realistically fix this. Restartpoints are performed in the background (by the checkpointer), not in the startup process itself. Not doing that would be prohibitive performance-wise, because each checkpoint would stop replication progress for seconds to tens of minutes.

- Andres
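
The division of labour described here can be modelled roughly as follows. This is a toy sketch, not PostgreSQL's code: the class, its method names, and the location strings are hypothetical. It assumes the startup process only remembers the most recently replayed checkpoint record, while a separate background step later turns whatever was remembered last into the restartpoint.

```python
class RecoveryState:
    """Toy model of the handoff between replay and background restartpoints."""

    def __init__(self):
        self.last_checkpoint = None  # latest checkpoint record seen during replay
        self.restart_point = None    # where recovery would restart after a crash

    def replay_checkpoint_record(self, redo):
        # Startup process: cheap bookkeeping only, so replay is never blocked.
        self.last_checkpoint = redo

    def background_restartpoint(self):
        # Checkpointer: buffer flushing and fsync are omitted in this model.
        if self.last_checkpoint is None or self.last_checkpoint == self.restart_point:
            return False  # nothing new to base a restartpoint on
        self.restart_point = self.last_checkpoint
        return True


state = RecoveryState()
for redo in ["0/1000028", "0/2000028", "0/3000028"]:  # invented locations
    state.replay_checkpoint_record(redo)

state.background_restartpoint()
print(state.restart_point)
```

Three checkpoint records are replayed before the background step runs once, so the two earlier ones never become restartpoints. That is the skipping behaviour the thread is asking about.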