Re: Replication server timeout patch
From | Robert Haas |
---|---|
Subject | Re: Replication server timeout patch |
Date | |
Msg-id | AANLkTim-Q1VVr9DGPsMh=p4VApSZ3Y=1QQoSDcFjCyvU@mail.gmail.com |
In response to | Re: Replication server timeout patch (Fujii Masao <masao.fujii@gmail.com>) |
Responses | Re: Replication server timeout patch |
List | pgsql-hackers |
On Thu, Feb 17, 2011 at 9:10 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Fri, Feb 18, 2011 at 7:55 AM, Josh Berkus <josh@agliodbs.com> wrote:
>>> So, in summary, the position is that we have a timeout, but that timeout
>>> doesn't work in all cases. But it does work in some, so that seems
>>> enough for me to say "let's commit". Not committing gives us nothing at
>>> all, which is as much use as a chocolate teapot.
>>
>> Can someone summarize the cases where it does and doesn't work?
>> There's been a longish gap in this thread.
>
> The timeout doesn't work when walsender gets blocked during sending the
> WAL because the send buffer has been filled up, I'm afraid. IOW, it doesn't
> work when the standby becomes unresponsive while WAL is generated on
> the master one after another. Since walsender tries to continue sending the
> WAL while the standby is unresponsive, the send buffer gets filled up and
> the blocking send function (e.g., pq_flush) blocks the walsender.
>
> OTOH, if the standby becomes unresponsive when there is no workload
> which causes WAL, the timeout would work.

IMHO, that's so broken as to be useless.

I would really like to have a solution to this problem, though.
Relying on TCP keepalives is weak.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
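
For readers following the thread: the failure mode described above (a blocking send that stalls once the send buffer is full, so the timeout check is never reached) can be illustrated with a small sketch. This is Python rather than PostgreSQL's C, and it is not the walsender code or the patch under discussion. The names `send_with_timeout` and `SendTimeout` are hypothetical, and the single overall deadline is an assumption made for brevity. The idea is only that a non-blocking socket plus a bounded wait lets the sender notice an unresponsive peer even while data is still queued.

```python
import select
import socket
import time


class SendTimeout(Exception):
    """Raised when the peer does not drain data before the deadline."""


def send_with_timeout(sock, data, timeout_s):
    """Send all of `data`, or raise SendTimeout once `timeout_s` has elapsed.

    Hypothetical helper: the socket is switched to non-blocking mode so that
    a full kernel send buffer can never stall the caller indefinitely.
    """
    sock.setblocking(False)
    deadline = time.monotonic() + timeout_s
    view = memoryview(data)
    sent = 0
    while sent < len(view):
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise SendTimeout(f"gave up after sending {sent} bytes")
        # Wait for room in the send buffer, but never past the deadline.
        _, writable, _ = select.select([], [sock], [], remaining)
        if not writable:
            raise SendTimeout(f"gave up after sending {sent} bytes")
        try:
            sent += sock.send(view[sent:])
        except BlockingIOError:
            # The buffer filled again between select() and send(); retry.
            continue
    return sent


def demo_unresponsive_peer():
    """Return True if a peer that never reads triggers SendTimeout."""
    sender, receiver = socket.socketpair()
    try:
        payload = b"x" * (16 * 1024 * 1024)  # far larger than the socket buffers
        try:
            send_with_timeout(sender, payload, 0.2)
        except SendTimeout:
            return True
        return False
    finally:
        sender.close()
        receiver.close()
```

With a blocking `sendall` in place of the loop, the same demo would hang for as long as the receiver stays silent, which is the behaviour reported for walsender when WAL keeps being generated.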