Weird spikes in delay for async streaming replication on 9.1
From | David F. Skoll |
---|---|
Subject | Weird spikes in delay for async streaming replication on 9.1 |
Date | |
Msg-id | 20150214110050.2f032f05@ollie.roaringpenguin.com |
Responses | Re: Weird spikes in delay for async streaming replication on 9.1 |
List | pgsql-admin |
Hi,

I have a two-database cluster. The machines are geographically separated, and the nature of my application is that many read-only queries can tolerate being "behind the times" by a few seconds. So machines near the hot standby connect to the hot standby for these delay-tolerant queries in order to reduce traffic over the relatively slow link between the two locations.

I have a monitoring script that measures the actual delay for a transaction on the master to appear on the hot standby. Every few minutes, my script runs an update on the master and then sits in a loop checking how long it takes to appear on the hot standby.

99% of the time, it's less than a second. But every once in a while, the time spikes dramatically, to hundreds or thousands of seconds, and that's too long... the delay-tolerant queries are not *that* delay-tolerant, so we switch to sending them all to the master.

See the graph: http://ibin.co/1rdm4ekiWmpM

I've tried to figure out what causes this, and the only events I can find that correlate are a pg_dump on the master and possibly some autovacuum jobs kicking off.

So my questions:

1) Can a long-running transaction on the master block subsequent transactions from being consumed on the hot standby, or am I totally out to lunch?

2) If (1) is correct, is it still true in 9.4?

3) If (1) is false, does anyone have plausible explanations for what I'm seeing? I don't think it's the link between the sites, because we also monitor that and it seems to be fine.

Regards,

David.
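
For anyone who wants to reproduce the measurement, here is a minimal sketch of the kind of monitoring loop described above. It is not the poster's actual script. The function name, the `write_marker` / `read_marker` callables, and the timeout and polling values are all assumptions made for illustration; in a real setup the first callable would run the update on the master and the second would query the hot standby.

```python
import time
from typing import Callable, Optional


def measure_replication_delay(
    write_marker: Callable[[str], None],      # hypothetical: runs the UPDATE on the master
    read_marker: Callable[[], Optional[str]],  # hypothetical: reads the value from the standby
    marker: str,
    timeout_s: float = 3600.0,                 # assumed upper bound on how long to wait
    poll_interval_s: float = 0.5,              # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Write `marker` on the master, then poll the standby until it appears.

    Returns the observed delay in seconds, or None if the marker did not
    show up on the standby within `timeout_s`.
    """
    write_marker(marker)
    start = clock()
    while True:
        # The marker is visible on the standby once the update has been replayed.
        if read_marker() == marker:
            return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(poll_interval_s)
```

The clock and sleep functions are passed in only so that the loop can be exercised without a real database or real waiting. The resolution of the measurement is bounded by the polling interval, so a reported delay of under a second really means "seen on the first or second poll".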