Re: streaming replication master can fail to shut down

From Nick Cleaton
Subject Re: streaming replication master can fail to shut down
Date
Msg-id CAFgz3ku0_B8g56kJ+NWQZsqcbP-+DKgAGH9WTjmUQT2BFMG2jQ@mail.gmail.com
In response to Re: streaming replication master can fail to shut down  (Andres Freund <andres@anarazel.de>)
Responses Re: streaming replication master can fail to shut down  (Andres Freund <andres@anarazel.de>)
List pgsql-bugs
On 29 April 2016 at 04:38, Andres Freund <andres@anarazel.de> wrote:

>> > I guess you have a fair amount of WAL traffic, and the receiver was
>> > behind a good bit?
>>
>> No, IIRC this was on the test cluster that I installed for the purpose
>> of replicating the problem under 9.5; it was essentially idle.
>
> The reason I'm asking is that I can't really replicate the issue
> so far. It's pretty clear that waiting_for_ping_response = true; is
> needed, but I'm suspicious that that's not all.
>
> Was your standby on a separate machine?

Yes, I've only seen it happen when the standby was on a machine with
slower CPU cores than the primary. All my attempts to replicate it on
a single machine by trying to slow down the wal receiver have failed.
I'm fairly convinced it's some sort of race that depends on wal sender
+ network being faster than wal receiver.

> What kind of latency?

1G switches.

root@XXX:~# ping XXX
PING XXX (XXX) 56(84) bytes of data.
64 bytes from XXX: icmp_seq=1 ttl=64 time=0.162 ms
64 bytes from XXX: icmp_seq=2 ttl=64 time=0.223 ms
64 bytes from XXX: icmp_seq=3 ttl=64 time=0.122 ms
64 bytes from XXX: icmp_seq=4 ttl=64 time=0.126 ms
64 bytes from XXX: icmp_seq=5 ttl=64 time=0.149 ms
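
For a quick summary of the samples above, here is a minimal Python sketch that pulls the round-trip times out of the ping lines and reports min/avg/max. The helper name summarize_rtts is hypothetical and used only for illustration; the input is just the five lines quoted above, so this says nothing about latency under load.

```python
import re
import statistics

# The five reply lines from the ping run above (hostnames already redacted).
PING_OUTPUT = """\
64 bytes from XXX: icmp_seq=1 ttl=64 time=0.162 ms
64 bytes from XXX: icmp_seq=2 ttl=64 time=0.223 ms
64 bytes from XXX: icmp_seq=3 ttl=64 time=0.122 ms
64 bytes from XXX: icmp_seq=4 ttl=64 time=0.126 ms
64 bytes from XXX: icmp_seq=5 ttl=64 time=0.149 ms
"""


def summarize_rtts(text):
    """Return (min, mean, max) round-trip time in ms from ping reply lines."""
    rtts = [float(value) for value in re.findall(r"time=([\d.]+) ms", text)]
    if not rtts:
        raise ValueError("no 'time=... ms' samples found")
    return min(rtts), statistics.mean(rtts), max(rtts)


rtt_min, rtt_avg, rtt_max = summarize_rtts(PING_OUTPUT)
print(f"min/avg/max = {rtt_min:.3f}/{rtt_avg:.3f}/{rtt_max:.3f} ms")
```

All five samples are well under a millisecond, which is consistent with both machines sitting on the same 1G switched network.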
