Re: streaming replication breaks horribly if master crashes
From | Josh Berkus |
---|---|
Subject | Re: streaming replication breaks horribly if master crashes |
Date | |
Msg-id | 4C19309C.1090703@agliodbs.com |
In reply to | streaming replication breaks horribly if master crashes (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: streaming replication breaks horribly if master crashes |
List | pgsql-hackers |
> The first problem I noticed is that the slave never seems to realize
> that the master has gone away. Every time I crashed the master, I had
> to kill the wal receiver process on the slave to get it to reconnect;
> otherwise it just sat there waiting, either forever or at least for
> longer than I was willing to wait.

Yes, I've noticed this. That was the reason for forcing walreceiver to
shut down on a restart per prior discussion and patches. This needs to
be on the open items list ... possibly it'll be fixed by Simon's
keepalive patch? Or is it just a tcp_keepalive issue?

> More seriously, I was able to demonstrate that the problem linked in
> the thread above is real: if the master crashes after streaming WAL
> that it hasn't yet fsync'd, then on recovery the slave's xlog position
> is ahead of the master. So far I've only been able to reproduce this
> with fsync=off, but I believe it's possible anyway,

... and some users will turn fsync off. This is, in fact, one of the
primary uses for streaming replication: durability via replicas.

> and this just
> makes it more likely. After the most recent crash, the master thought
> pg_current_xlog_location() was 1/86CD4000; the slave thought
> pg_last_xlog_receive_location() was 1/8733C000. After reconnecting to
> the master, the slave then thought that
> pg_last_xlog_receive_location() was 1/87000000.

So, *in this case*, detecting out-of-sequence xlogs (and PANICing) would
have actually prevented the slave from being corrupted.

My question, though, is whether detecting out-of-sequence xlogs is
*enough*. Are there any crash conditions on the master which would cause
the master to reuse the same locations for different records, for
example? I don't think so, but I'd like to be certain.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
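
To make the "out-of-sequence" check concrete, here is a rough sketch of how the locations quoted above could be compared from a monitoring script. It assumes the `xlogid/xrecoff` text form returned by pg_current_xlog_location() and pg_last_xlog_receive_location(), with both halves in hexadecimal. The helper names `parse_xlog_location` and `standby_is_ahead` are made up for illustration and are not part of PostgreSQL. It only detects a position mismatch; it says nothing about whether the master might reuse a location for a different record.

```python
def parse_xlog_location(text):
    """Split an 'xlogid/xrecoff' string (both parts hex) into a tuple of ints."""
    xlogid, xrecoff = text.strip().split("/")
    return (int(xlogid, 16), int(xrecoff, 16))


def standby_is_ahead(master_location, standby_location):
    """Return True if the standby has received WAL past the master's position.

    Tuples compare element by element, so the log id is compared first
    and the offset within that log second.
    """
    return parse_xlog_location(standby_location) > parse_xlog_location(master_location)


# Values reported in this thread after the master crashed and recovered.
master = "1/86CD4000"             # pg_current_xlog_location() on the master
standby_before = "1/8733C000"     # pg_last_xlog_receive_location() before reconnect
standby_after = "1/87000000"      # same function after reconnecting

print(standby_is_ahead(master, standby_before))  # True
print(standby_is_ahead(master, standby_after))   # True
```

Both comparisons come out true for the numbers in this report, i.e. the slave was ahead of the recovered master before and after reconnecting, which is the condition a PANIC check would have to catch.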