Re: Issues with Quorum Commit
From | Simon Riggs |
---|---|
Subject | Re: Issues with Quorum Commit |
Date | |
Msg-id | 1286571909.2304.1026.camel@ebony |
In response to | Re: Issues with Quorum Commit (Greg Smith <greg@2ndquadrant.com>) |
List | pgsql-hackers |
On Fri, 2010-10-08 at 16:34 -0400, Greg Smith wrote:
> Tom Lane wrote:
> > How are you going to "mark the standby as degraded"? The
> > standby can't keep that information, because it's not even connected
> > when the master makes the decision.
>
> From a high level, I'm assuming only that the master has a list in
> memory of the standby system(s) it believes are up to date, and that it
> is supposed to commit to synchronously. When I say mark as degraded, I
> mean that the master merely closes whatever communications channel it
> had open with that system and removes the standby from that list.

My current coding works with two sets of parameters:

The "master marks standby as degraded" part is handled by the TCP
keepalives. When the master notices no response, it kicks out the
standby. We already had this, so I never mentioned it before as being
part of the solution.

The second part is synchronous_replication_timeout, a user-settable
parameter defining how long the app is prepared to wait, which could be
more or less time than the keepalives.

> If that standby now reconnects again, I don't see how resolving what
> happens at that point is any different from when a standby is first
> started after both systems were turned off. If the standby is current
> with the data available on the master when it has an initial
> conversation, great; it's now available for synchronous commit too
> then. If it's not, it goes into a catchup mode first instead. When the
> master sees you're back to current again, if you're on the list of sync
> servers too you go back onto the list of active sync systems.
>
> There shouldn't be any state information to save here. If the master
> and standby can't figure out if they are in or out of sync with one
> another based on the conversation they have when they first connect to
> one another, that suggests to me there needs to be improvements made in
> the communications protocol they use to exchange messages.

Agreed.
-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services
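
To make the discussion above concrete, here is a rough sketch of the behaviour described in the message: a master that keeps only an in-memory list of active sync standbys, drops a standby when keepalives go unanswered, applies a separate application-level wait limit, and decides a reconnecting standby's state purely from the initial conversation. This is a toy model, not PostgreSQL code. The class and method names (Master, connect, expire_silent_standbys, ack_within_timeout) and the integer WAL positions are assumptions made for illustration; only synchronous_replication_timeout is a name taken from the message. What should happen to a commit once the wait expires is left open here, as it is in the message.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Master:
    """Toy model of the master's in-memory view of its standbys (hypothetical)."""
    keepalive_timeout: float               # silence (seconds) before a standby is kicked out
    synchronous_replication_timeout: float # how long the app is prepared to wait for an ack
    wal_position: int = 0                  # latest position written on the master
    configured_sync: set = field(default_factory=set)  # names allowed to be sync
    active_sync: dict = field(default_factory=dict)    # name -> time last heard from
    catching_up: set = field(default_factory=set)      # names still replaying

    def connect(self, name: str, standby_position: int, now: float) -> str:
        """Classify a (re)connecting standby from the initial conversation only."""
        if standby_position < self.wal_position:
            # Behind the master: catch up first, no saved state needed.
            self.catching_up.add(name)
            return "catchup"
        self.catching_up.discard(name)
        if name in self.configured_sync:
            # Current and on the sync list: back onto the active sync list.
            self.active_sync[name] = now
            return "sync"
        return "async"

    def expire_silent_standbys(self, now: float) -> list:
        """'Mark as degraded': drop standbys silent longer than the keepalive limit."""
        dropped = [n for n, seen in self.active_sync.items()
                   if now - seen > self.keepalive_timeout]
        for n in dropped:
            del self.active_sync[n]
        return dropped

    def ack_within_timeout(self, ack_delay: Optional[float]) -> bool:
        """True if an ack arrives within the app-level wait; None means no ack."""
        return (ack_delay is not None
                and ack_delay <= self.synchronous_replication_timeout)
```

The two limits are deliberately independent in this sketch: a commit can stop waiting well before the keepalive logic has noticed a dead standby, or the other way round, which matches the remark that the timeout "could be more or less time than the keepalives".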