Re: Replication server timeout patch

From	Robert Haas
Subject	Re: Replication server timeout patch
Date	17 February 2011, 23:11:06
Msg-id	AANLkTim-Q1VVr9DGPsMh=p4VApSZ3Y=1QQoSDcFjCyvU@mail.gmail.com
In reply to	Re: Replication server timeout patch (Fujii Masao <masao.fujii@gmail.com>)
Responses	Re: Replication server timeout patch
List	pgsql-hackers

On Thu, Feb 17, 2011 at 9:10 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Fri, Feb 18, 2011 at 7:55 AM, Josh Berkus <josh@agliodbs.com> wrote:
>>> So, in summary, the position is that we have a timeout, but that timeout
>>> doesn't work in all cases. But it does work in some, so that seems
>>> enough for me to say "let's commit". Not committing gives us nothing at
>>> all, which is as much use as a chocolate teapot.
>>
>> Can someone summarize the cases where it does and doesn't work?
>> There's been a longish gap in this thread.
>
> The timeout doesn't work when walsender gets blocked during sending the
> WAL because the send buffer has been filled up, I'm afraid. IOW, it doesn't
> work when the standby becomes unresponsive while WAL is generated on
> the master one after another. Since walsender tries to continue sending the
> WAL while the standby is unresponsive, the send buffer gets filled up and
> the blocking send function (e.g., pq_flush) blocks the walsender.
>
> OTOH, if the standby becomes unresponsive when there is no workload
> which causes WAL, the timeout would work.

IMHO, that's so broken as to be useless.

I would really like to have a solution to this problem, though.
Relying on TCP keepalives is weak.
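
To make the failure mode concrete, here is a minimal Python sketch of the
general technique of putting a deadline on a send instead of letting a
blocking send hang once the send buffer is full. It is not walsender code
and not the patch under discussion. The names send_with_timeout and
demo_unresponsive_peer are hypothetical, and the local socket pair merely
stands in for the master/standby connection.

```python
import select
import socket
import time


def send_with_timeout(sock, data, timeout):
    """Send all of data, or raise TimeoutError if the peer stalls.

    Hypothetical helper for illustration only. The deadline covers the
    whole send, not each individual write.
    """
    sock.setblocking(False)
    deadline = time.monotonic() + timeout
    view = memoryview(data)
    while len(view) > 0:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError("peer did not drain the send buffer in time")
        # Wait for room in the kernel send buffer, but never past the deadline.
        _, writable, _ = select.select([], [sock], [], remaining)
        if not writable:
            continue  # the loop re-checks the deadline
        try:
            sent = sock.send(view)
        except BlockingIOError:
            continue  # buffer filled up again between select() and send()
        view = view[sent:]


def demo_unresponsive_peer():
    """Simulate a standby that never reads; return seconds until timeout."""
    sender, standby = socket.socketpair()
    try:
        # Assumption: 32 MiB is far more than the socket buffers can hold.
        payload = b"x" * (32 * 1024 * 1024)
        start = time.monotonic()
        try:
            send_with_timeout(sender, payload, timeout=0.5)
        except TimeoutError:
            return time.monotonic() - start
        return None  # the send unexpectedly completed
    finally:
        sender.close()
        standby.close()
```

With a plain blocking send, the same call would never return while the
peer stays silent, which is the case described in the quoted text above.
Here the sender notices the stall after the deadline regardless of
whether new data keeps arriving.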

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
