Обсуждение: data loss with pg_standby when doing a controlled failover

Поиск
Список
Период
Сортировка

data loss with pg_standby when doing a controlled failover

От
Andreas Pflug
Дата:
Running 8.3.7, I have a warm standby configuration with a
archive_timeout of 10min.

It's obvious that there's a 10min period where data could be lost if the
master fails and the warm standby server has to take over. What's not
obvious is that this is true even if the master server is shut down
regularly, because it will not write out a last log segment to the
archive. As a consequence, when doing a controlled failover (for
maintenance purposes or so) all data changed after the last archive copy
will be lost.
IMHO this should be mentioned in the docs explicitly (I find it quite
surprising that data can be lost even if the system is shutdown
correctly), or better when shutting down the postmaster should spit all
log segments containing all changes when archiving is on so the warm
standby server can catch up.

Regards.
Andreas

Re: data loss with pg_standby when doing a controlled failover

От
Scott Mead
Дата:
On Mon, Apr 6, 2009 at 7:37 AM, Andreas Pflug <pgadmin@pse-consulting.de>wrote:

> Running 8.3.7, I have a warm standby configuration with a
> archive_timeout of 10min.
>
> It's obvious that there's a 10min period where data could be lost if the
> master fails and the warm standby server has to take over. What's not
> obvious is that this is true even if the master server is shut down
> regularly, because it will not write out a last log segment to the
> archive. As a consequence, when doing a controlled failover (for
> maintenance purposes or so) all data changed after the last archive copy
> will be lost.
> IMHO this should be mentioned in the docs explicitly (I find it quite
> surprising that data can be lost even if the system is shutdown
> correctly), or better when shutting down the postmaster should spit all
> log segments containing all changes when archiving is on so the warm
> standby server can catch up.



You make an excellent point.  If you're looking for a way to mitigate this
risk, run:

    select pg_switch_xlog() ;

   Before shutting down.

--Scott

>
>
> Regards.
> Andreas
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>

Re: data loss with pg_standby when doing a controlled failover

От
Guillaume Smet
Дата:
On Mon, Apr 6, 2009 at 1:37 PM, Andreas Pflug <pgadmin@pse-consulting.de> wrote:
> IMHO this should be mentioned in the docs explicitly (I find it quite
> surprising that data can be lost even if the system is shutdown
> correctly), or better when shutting down the postmaster should spit all
> log segments containing all changes when archiving is on so the warm
> standby server can catch up.

See also this thread which might be interesting for you:
http://archives.postgresql.org/message-id/3f0b79eb0903242329j12865d55s348f5c873a956e71@mail.gmail.com

--
Guillaume

Re: data loss with pg_standby when doing a controlled failover

От
Andreas Pflug
Дата:
Scott Mead wrote:
>
>
>
> On Mon, Apr 6, 2009 at 7:37 AM, Andreas Pflug
> <pgadmin@pse-consulting.de <mailto:pgadmin@pse-consulting.de>> wrote:
>
>     Running 8.3.7, I have a warm standby configuration with a
>     archive_timeout of 10min.
>
>     It's obvious that there's a 10min period where data could be lost
>     if the
>     master fails and the warm standby server has to take over. What's not
>     obvious is that this is true even if the master server is shut down
>     regularly, because it will not write out a last log segment to the
>     archive. As a consequence, when doing a controlled failover (for
>     maintenance purposes or so) all data changed after the last
>     archive copy
>     will be lost.
>     IMHO this should be mentioned in the docs explicitly (I find it quite
>     surprising that data can be lost even if the system is shutdown
>     correctly), or better when shutting down the postmaster should
>     spit all
>     log segments containing all changes when archiving is on so the warm
>     standby server can catch up.
>
>
>
> You make an excellent point.  If you're looking for a way to mitigate
> this risk, run:
>
>     select pg_switch_xlog() ;
>
>    Before shutting down.
Sort of, unless some other user succeeds to commit a transaction after
pg_switch_xlog, and before the database ceases operation.

My "graceful failover" procedure now includes this workaround:
- shutdown server
- restart server with --listen_addresses='' to prevent other users to
connect (there are no local users on the server machine)
- pg_switch_xlog()
- shutdown finally
- let the warm server continue

Regards,
Andreas

Re: data loss with pg_standby when doing a controlled failover

От
Andreas Pflug
Дата:
Guillaume Smet wrote:
> On Mon, Apr 6, 2009 at 1:37 PM, Andreas Pflug <pgadmin@pse-consulting.de> wrote:
>
>> IMHO this should be mentioned in the docs explicitly (I find it quite
>> surprising that data can be lost even if the system is shutdown
>> correctly), or better when shutting down the postmaster should spit all
>> log segments containing all changes when archiving is on so the warm
>> standby server can catch up.
>>
>
> See also this thread which might be interesting for you:
> http://archives.postgresql.org/message-id/3f0b79eb0903242329j12865d55s348f5c873a956e71@mail.gmail.com
>
It is, though not related to this problem. I'd expect pg_standby's
default behaviour to be like the pseudocode's in the warm-standby
documentation. To me, it's kind of unexpected that it won't continue
restoring if the trigger file is present (as Kevin said, what's the use
case for the current behaviour?).

Regards,
Andreas

Re: data loss with pg_standby when doing a controlled failover

От
Fujii Masao
Дата:
Hi,

On Mon, Apr 6, 2009 at 11:13 PM, Andreas Pflug
<pgadmin@pse-consulting.de> wrote:
> Scott Mead wrote:
>>
>>
>>
>> On Mon, Apr 6, 2009 at 7:37 AM, Andreas Pflug
>> <pgadmin@pse-consulting.de <mailto:pgadmin@pse-consulting.de>> wrote:
>>
>> =A0 =A0 Running 8.3.7, I have a warm standby configuration with a
>> =A0 =A0 archive_timeout of 10min.
>>
>> =A0 =A0 It's obvious that there's a 10min period where data could be lost
>> =A0 =A0 if the
>> =A0 =A0 master fails and the warm standby server has to take over. What'=
s not
>> =A0 =A0 obvious is that this is true even if the master server is shut d=
own
>> =A0 =A0 regularly, because it will not write out a last log segment to t=
he
>> =A0 =A0 archive. As a consequence, when doing a controlled failover (for
>> =A0 =A0 maintenance purposes or so) all data changed after the last
>> =A0 =A0 archive copy
>> =A0 =A0 will be lost.
>> =A0 =A0 IMHO this should be mentioned in the docs explicitly (I find it =
quite
>> =A0 =A0 surprising that data can be lost even if the system is shutdown
>> =A0 =A0 correctly), or better when shutting down the postmaster should
>> =A0 =A0 spit all
>> =A0 =A0 log segments containing all changes when archiving is on so the =
warm
>> =A0 =A0 standby server can catch up.
>>
>>
>>
>> You make an excellent point. =A0If you're looking for a way to mitigate
>> this risk, run:
>>
>> =A0 =A0 select pg_switch_xlog() ;
>>
>> =A0 =A0Before shutting down.
> Sort of, unless some other user succeeds to commit a transaction after
> pg_switch_xlog, and before the database ceases operation.
>
> My "graceful failover" procedure now includes this workaround:
> - shutdown server
> - restart server with --listen_addresses=3D'' to prevent other users to
> connect (there are no local users on the server machine)
> - pg_switch_xlog()
> - shutdown finally
> - let the warm server continue

What if new xlogs are generated by autovacuum or bgwriter
between pg_switch_xlog and final shutdown? Those xlogs
can be ignored?

Regards,

--=20
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center