Re: [Skytools-users] WAL Shipping + checkpoint
From | Mark Kirkwood |
---|---|
Subject | Re: [Skytools-users] WAL Shipping + checkpoint |
Date | |
Msg-id | 4A95B499.6030803@catalyst.net.nz |
In reply to | Re: [Skytools-users] WAL Shipping + checkpoint (Sébastien Lardière <slardiere@hi-media.com>) |
Responses | Re: [Skytools-users] WAL Shipping + checkpoint |
List | pgsql-general |
Sébastien Lardière wrote:
> On 26/08/2009 04:46, Mark Kirkwood wrote:
>> Sébastien Lardière wrote:
>>> Hi All,
>>>
>>> I've a cluster ( Pg 8.3.7 ) with WAL Shipping, and a few hours ago,
>>> the master had to restart.
>>>
>>> I use walmgr from Skytools, which works very well.
>>>
>>> I have already restart the master without any problem, but today,
>>> the slave doesn't work like I want. The field "Time of latest
>>> checkpoint" from the pg_controldata on the slave keep the same
>>> values, but WAL File are processed correctly.
>>>
>>> I try to restart the slave, but, after processed again all the WAL
>>> between "Time of latest checkpoint" and, it does nothing else,
>>> latest checkpoint stay at the same value.
>>>
>>> I don't know if it's important ( i think so ), and I can't fix it.
>>>
>> It is normal for it to lag behind somewhat on the slave (depending on
>> what your checkpoint timeout etc settings are).
>>
>> However, I've noticed what you are seeing as well - particularly when
>> there are no actual data changes coming through in the logs - the
>> slave checkpoint time does not change even tho there have been
>> checkpoints on the master (I may have a look in the code to see what
>> the story really is...if I have time).
>>
>
> Yes, but the delay between the last checkpoint on the master and the
> slave is very high, now ( 100 000 sec ), because the last checkpoint
> on the slave was yesterday ( as far as pg_controldata is right )
>
> Here a graph from our munin plugin :
> http://seb.ouvaton.org/tmp/bdd-pg_walmgr-week.png
>
> The blue line represent an average between two WAL processed on the
> slave, and the green line, the delai between last checkpoint on the
> master and the slave.
>
> Maybe it's not some good indicator, but the green line let me think
> there is problem.
>
>

Do you have archive_timeout set? If so, then what *could* be happening
is this:

There are actually no "real" data changes being made on your master for
some reason.
So every time archive_timeout is reached, a log full of no changes is
shipped to your slave and applied - and no checkpoint times are changed,
for the reasons I mentioned above.

A way to test this would be to do something that makes real data changes
on the master. A good thing to try would be to:

- create a new database
- create tables and add some reasonable amount of data (e.g. initialize
  pgbench at scale 100).

Then see if your checkpoint time gets updated a few minutes or so later.
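
To compare the checkpoint times before and after such a test, something like the following rough sketch might help. It reads the "Time of latest checkpoint" field from pg_controldata output captured on the master and on the slave and reports the difference in seconds. The function names, the sample text and the timestamp format are assumptions made for illustration: the format shown is what a C locale would typically print, and other locales will need a different TIME_FORMAT.

```python
from datetime import datetime

# Assumption: pg_controldata was run under LC_ALL=C, so the timestamp looks
# like "Wed Aug 26 10:15:02 2009". Adjust this for other locales.
TIME_FORMAT = "%a %b %d %H:%M:%S %Y"
FIELD = "Time of latest checkpoint"


def latest_checkpoint(controldata_text):
    """Return the 'Time of latest checkpoint' value as a naive datetime."""
    for line in controldata_text.splitlines():
        # Split at the first colon only; the timestamp itself contains colons.
        name, sep, value = line.partition(":")
        if sep and name.strip() == FIELD:
            return datetime.strptime(value.strip(), TIME_FORMAT)
    raise ValueError("field not found: " + FIELD)


def checkpoint_lag_seconds(master_text, slave_text):
    """Seconds by which the slave's last checkpoint trails the master's.

    Assumes both hosts report local time in the same timezone.
    """
    master = latest_checkpoint(master_text)
    slave = latest_checkpoint(slave_text)
    return (master - slave).total_seconds()


# Invented, heavily trimmed sample output (not taken from a real cluster).
MASTER_SAMPLE = """\
Database cluster state:               in production
Time of latest checkpoint:            Wed Aug 26 10:15:02 2009
"""
SLAVE_SAMPLE = """\
Time of latest checkpoint:            Tue Aug 25 06:28:22 2009
"""

print(checkpoint_lag_seconds(MASTER_SAMPLE, SLAVE_SAMPLE))  # 100000.0
```

In practice the two text blocks would come from running pg_controldata against each data directory. If the slave's value moves forward a few minutes after the pgbench data has been loaded on the master, the unchanged checkpoint time was most likely just a side effect of empty logs.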