On 11.02.2011 22:11, Robert Haas wrote:
> On Fri, Feb 11, 2011 at 2:02 PM, Daniel Farina <drfarina@acm.org> wrote:
>> I split this out of the synchronous replication patch for independent
>> review. I'm dashing out the door, so I haven't put it on the CF yet or
>> anything, but I just wanted to get it out there...I'll be around in
>> Not Too Long to finish any other details.
>
> This looks like a useful and separately committable change.

Hmm, so this patch implements a watchdog, where the master disconnects
the standby if the heartbeat from the standby stops for more than
'replication_[server]_timeout' seconds. The standby sends the heartbeat
every wal_receiver_status_interval seconds.

It would be nice if the master and standby could negotiate those
settings. As the patch stands, it's easy to have a pathological
configuration where replication_server_timeout <
wal_receiver_status_interval, so that the master repeatedly disconnects
the standby because it doesn't reply in time. Maybe the standby should
report how often it's going to send a heartbeat, and the master should
wait for that long plus some safety margin. Or maybe the master should
tell the standby how often it should send the heartbeat?
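
To make the failure mode concrete, here is a rough sketch in Python. It
is not code from the patch: the function names and the 5-second default
margin are made up for illustration, and only the two setting names come
from the discussion above. It shows the pathological check and the
"standby reports its interval, master adds a margin" variant.

```python
def is_pathological(replication_server_timeout: int,
                    wal_receiver_status_interval: int) -> bool:
    """Return True if the master's timeout expires before the standby's
    next heartbeat is even due (both values in seconds)."""
    return replication_server_timeout < wal_receiver_status_interval


def negotiated_timeout(standby_interval: int, safety_margin: int = 5) -> int:
    """Hypothetical negotiation: the standby reports its heartbeat
    interval and the master waits that long plus a safety margin.
    The default margin of 5 seconds is an arbitrary assumption."""
    return standby_interval + safety_margin


if __name__ == "__main__":
    # Master gives up after 5 s, standby only reports every 10 s:
    # the master would disconnect the standby over and over.
    print(is_pathological(5, 10))

    # With the negotiated value the timeout always exceeds the interval.
    timeout = negotiated_timeout(10)
    print(timeout, is_pathological(timeout, 10))
```

With any positive margin the negotiated timeout is larger than the
heartbeat interval, so the misconfiguration cannot occur.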
--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com