Re: termination of backend waiting for sync rep generates a junk log message
From | Tom Lane |
---|---|
Subject | Re: termination of backend waiting for sync rep generates a junk log message |
Date | |
Msg-id | 27332.1319398399@sss.pgh.pa.us |
In response to | Re: termination of backend waiting for sync rep generates a junk log message (Robert Haas <robertmhaas@gmail.com>) |
List | pgsql-hackers |
Robert Haas <robertmhaas@gmail.com> writes:
> On Tue, Oct 18, 2011 at 11:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> One thing worth asking is why we're willing to violate half a dozen
>> different coding rules if we see ProcDiePending, yet we're perfectly
>> happy to rely on the client understanding a WARNING for the
>> QueryCancelPending case.  Another is whether this whole function isn't
>> complete BS in the first place, since it appears to be coded on the
>> obviously-false assumption that nothing it calls can throw elog(ERROR)
>> --- and of course, if any of those functions do throw ERROR, all the
>> argumentation here goes out the window.

> Well, there is a general problem that anything which throws an ERROR
> too late in the commit path is Evil; and sync rep makes that worse to
> the extent that it adds more stuff late in the commit path, but it
> didn't invent the problem. What it did do is add stuff late in the
> commit path that can block for a potentially unbounded period of time,
> and I don't see that there are any solutions to that problem that
> aren't somewhat grotty.

After further reflection, you're right that all sync rep is really doing is extending the time duration of the interval wherein clients will have a hard time telling whether the commit occurred or not. It's always been the case that if a cancel/die interrupt occurs during CommitTransaction, that will get serviced at the RESUME_INTERRUPTS call at the end, and the client will see an apparent failure even though the transaction was committed. Even without that, an interrupt occurring just after this code sequence, but before we reach the point of sending a command-complete response message, is going to result in client confusion, and there's very little we can do about that.

I think what we should do in SyncRepWaitForLSN is just send a warning and abandon waiting. Trying to fool with the interrupt response behavior beyond that is simply broken, and it doesn't help any that we chose to break it in two different, but equally indefensible, ways for cancel versus die interrupts.

It would help BTW for the warning to have its own SQLSTATE, if we're imagining that "some clients may be able to interpret" it.

Also, this code is supposing that it must be called within a HOLD_INTERRUPTS context, but it doesn't look to me like that is being done for the various calls from twophase.c.

			regards, tom lane
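Archive note: the behavior proposed in the message above (emit a warning with its own SQLSTATE, stop waiting, and leave the pending cancel/die flag for the normal interrupt machinery) can be modeled roughly as follows. This is a toy, standalone sketch and not PostgreSQL source: every `toy_` name, the lowercase flag variables, the `ack_at_poll` parameter, and the SQLSTATE value "XX9SR" are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-ins for backend interrupt state (hypothetical, not the real globals). */
static volatile bool proc_die_pending = false;
static volatile bool query_cancel_pending = false;
static int interrupt_holdoff_count = 0;
static int warnings_emitted = 0;

/* Hypothetical SQLSTATE; the message only asks that the warning have its own code. */
#define TOY_SQLSTATE_SYNC_WAIT_ABANDONED "XX9SR"

typedef enum { SYNC_WAIT_ACKED, SYNC_WAIT_ABANDONED } SyncWaitResult;

static void toy_hold_interrupts(void) { interrupt_holdoff_count++; }

static void toy_resume_interrupts(void)
{
    assert(interrupt_holdoff_count > 0);
    interrupt_holdoff_count--;
    /* A real backend would service any still-pending cancel/die around here. */
}

static void toy_warning(const char *sqlstate, const char *msg)
{
    warnings_emitted++;
    fprintf(stderr, "WARNING [%s]: %s\n", sqlstate, msg);
}

/*
 * Wait until the simulated standby acknowledges (after ack_at_poll polls),
 * or abandon the wait as soon as a cancel or die request is pending.
 * Both interrupt kinds are handled identically, and neither flag is cleared.
 */
static SyncWaitResult toy_sync_rep_wait(int ack_at_poll)
{
    /* The caller is expected to hold off interrupts for the whole wait. */
    assert(interrupt_holdoff_count > 0);

    for (int polls = 0;; polls++)
    {
        if (polls >= ack_at_poll)
            return SYNC_WAIT_ACKED;

        if (proc_die_pending || query_cancel_pending)
        {
            toy_warning(TOY_SQLSTATE_SYNC_WAIT_ABANDONED,
                        "abandoning wait for synchronous replication; "
                        "the transaction has already committed locally");
            return SYNC_WAIT_ABANDONED;
        }
        /* A real implementation would sleep on a latch here instead of spinning. */
    }
}
```

In this model a caller would bracket the wait with `toy_hold_interrupts()` and `toy_resume_interrupts()`. Because the pending flag is left untouched, the client can still see an apparent failure after the commit has happened, which is the unavoidable ambiguity the message describes.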