Re: why can the isolation tester handle only one waiting process?
From | Robert Haas |
---|---|
Subject | Re: why can the isolation tester handle only one waiting process? |
Date | |
Msg-id | CA+TgmoaeRPfXMRgZJO-pxa+-sggE-ofUCTxpNGcOz9ckE5KfGw@mail.gmail.com |
In response to | Re: why can the isolation tester handle only one waiting process? (Robert Haas <robertmhaas@gmail.com>) |
Responses | Re: why can the isolation tester handle only one waiting process? |
List | pgsql-hackers |
On Mon, Aug 17, 2015 at 5:40 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Good idea. Here's an updated patch series that takes that approach.
> It cancels any query after 60 seconds of waiting, and if the query
> doesn't respond to the cancel, then it bails out completely after 75
> seconds (i.e. 15 seconds after attempting the cancel).

Here's an updated patch series with some more improvements to the isolationtester code, and some better test cases. I now have a test for (a) a "simple" deadlock, involving a lock upgrade scenario where the process seeking the upgrade must jump the wait queue; (b) a hard deadlock; and (c) a soft deadlock that can be resolved by reordering the wait queue.

According to lcov this tests most of deadlock.c: 10 of 11 functions (not GetBlockingAutoVacuumPgproc), and 246 of 291 lines. That's clearly an improvement over the status quo, but I'm having a hard time feeling happy about it, because it's really only testing the easy cases. I can't construct a case where reversing any one single soft edge doesn't immediately resolve the deadlock (see end of TestConfigurationRecurse); the first one tried always works.

Moving a process that would otherwise deadlock ahead of conflicting waiters seems to be an extremely effective way of resolving deadlocks. For it to fail, reversing one of the edges in the waits-for graph must create a new cycle. But it seems to be quite hard for that to actually happen: the new edge that is created after the reversal points to the guy that got skipped ahead in the queue. For that reversed edge to be part of a cycle, the queue-skipping process has to be directly or indirectly waiting for some other process he jumped over. But, clearly, he's only waiting for processes that are *still ahead* of him in the queue, and he would have had to wait for those processes whether he'd skipped ahead in the queue or not. So perhaps a test case here would involve a process that skips forward in the lock queue, but not far enough? But I haven't been able to figure it out.

I also can't construct a test case where ExpandConstraints returns false (see TestConfiguration); the wait orderings it generates are always self-consistent.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
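The soft-edge argument above can be illustrated with a toy waits-for graph. This is only a sketch under stated assumptions, not the logic in deadlock.c: the process names A, B, C, D, the edge sets, and the helpers `find_cycle` and `reverse_edge` are all hypothetical. An edge `(waiter, blocker)` means the waiter waits for the blocker; the soft edge is the one that exists only because of queue order.

```python
from typing import Dict, List, Optional, Set, Tuple

Edge = Tuple[str, str]  # (waiter, blocker): waiter waits for blocker


def find_cycle(edges: Set[Edge]) -> Optional[List[str]]:
    """Return the nodes of one cycle in the waits-for graph, or None."""
    graph: Dict[str, List[str]] = {}
    for waiter, blocker in sorted(edges):
        graph.setdefault(waiter, []).append(blocker)
        graph.setdefault(blocker, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {node: WHITE for node in graph}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        path.append(node)
        for nxt in graph[node]:
            if color[nxt] == GREY:  # back edge: nxt is on the current path
                return path[path.index(nxt):]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in sorted(graph):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


def reverse_edge(edges: Set[Edge], edge: Edge) -> Set[Edge]:
    """Model a queue jump: the waiter moves ahead, so the edge flips."""
    waiter, blocker = edge
    return (edges - {edge}) | {(blocker, waiter)}


# Hypothetical soft deadlock: A waits for C and B waits for A (hard edges);
# C is queued behind B only because of queue order (soft edge).
hard = {("A", "C"), ("B", "A")}
soft_edge = ("C", "B")

print(find_cycle(hard | {soft_edge}))                           # a cycle
print(find_cycle(reverse_edge(hard | {soft_edge}, soft_edge)))  # None

# Make C still wait, indirectly, for the process it jumped over (C -> D -> B).
# The reversal now leaves a cycle, but the hard edges alone already form one.
hard2 = hard | {("C", "D"), ("D", "B")}
print(find_cycle(reverse_edge(hard2 | {soft_edge}, soft_edge)) is not None)
print(find_cycle(hard2) is not None)
```

In this toy construction, the only way found to make the reversal fail also produces an all-hard cycle, which is consistent with the difficulty described above; it does not show that a harder case cannot be built.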