Re: streaming replication master can fail to shut down
From | Nick Cleaton |
---|---|
Subject | Re: streaming replication master can fail to shut down |
Date | |
Msg-id | CAFgz3ku0_B8g56kJ+NWQZsqcbP-+DKgAGH9WTjmUQT2BFMG2jQ@mail.gmail.com |
In response to | Re: streaming replication master can fail to shut down (Andres Freund <andres@anarazel.de>) |
Responses |
Re: streaming replication master can fail to shut down
|
List | pgsql-bugs |
On 29 April 2016 at 04:38, Andres Freund <andres@anarazel.de> wrote:

>> > I guess you have a fair amount of WAL traffic, and the receiver was
>> > behind a good bit?
>>
>> No, IIRC this was on the test cluster that I installed for the purpose
>> of replicating the problem under 9.5; it was essentially idle.
>
> The reason I'm asking is that I so far can't really replicate the issue
> so far. It's pretty clear that waiting_for_ping_response = true; is
> needed, but I'm suspicious that that's not all.
>
> Was your standby on a separate machine?

Yes, I've only seen it happen when the standby was on a machine with
slower CPU cores than the primary. All my attempts to replicate it on
a single machine by trying to slow down the wal receiver have failed.

I'm fairly convinced it's some sort of race that depends on wal sender
+ network being faster than wal receiver.

> What kind of latency?

1G switches.

root@XXX:~# ping XXX
PING XXX) 56(84) bytes of data.
64 bytes from XXX: icmp_seq=1 ttl=64 time=0.162 ms
64 bytes from XXX: icmp_seq=2 ttl=64 time=0.223 ms
64 bytes from XXX: icmp_seq=3 ttl=64 time=0.122 ms
64 bytes from XXX: icmp_seq=4 ttl=64 time=0.126 ms
64 bytes from XXX: icmp_seq=5 ttl=64 time=0.149 ms
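
For anyone trying to reproduce this, one way to check whether the receiver really is behind around shutdown is to compare the sent_location and flush_location columns of pg_stat_replication (the 9.5 column names) on the primary. The following is only a minimal sketch, assuming the two values were sampled by hand and pasted in as text; lsn_to_int and lag_bytes are hypothetical helper names for illustration, not part of PostgreSQL, and the sample LSNs are made up.

```python
def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to a byte position.

    The text form is two hexadecimal numbers separated by a slash:
    the high 32 bits, then the low 32 bits.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def lag_bytes(sent: str, flushed: str) -> int:
    """Bytes of WAL the sender has sent that the standby has not yet flushed."""
    return lsn_to_int(sent) - lsn_to_int(flushed)


if __name__ == "__main__":
    # Made-up sample values, as if copied from pg_stat_replication.
    print(lag_bytes("0/3000060", "0/3000000"))
```

A result that stays at or near zero on an idle cluster would be consistent with the receiver not being meaningfully behind, which would point away from plain backlog as the cause.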