Re: pg_terminate_backend() issues
From | Tom Lane |
---|---|
Subject | Re: pg_terminate_backend() issues |
Date | |
Msg-id | 10589.1208364398@sss.pgh.pa.us |
In response to | Re: pg_terminate_backend() issues (Magnus Hagander <magnus@hagander.net>) |
Responses | Re: pg_terminate_backend() issues |
List | pgsql-hackers |
Magnus Hagander <magnus@hagander.net> writes:
> Tom Lane wrote:
>> I'm willing to enable a SIGTERM-based pg_terminate_backend for 8.4
>> if there is some reasonable amount of testing done during this
>> development cycle to try to expose any problems.

> If someone can come up with an automated script to do this kind of
> testing, I can commit a VM or three to running this 24/7 for a month,
> easily... But I don't trust myself in coming up with a test-case that's
> good enough :-P

The closest thing I can think of to an automated test is to run
repeated sets of the parallel regression tests, and each time SIGTERM
a randomly chosen backend at a randomly chosen time. Then see if
anything "funny" happens. The hard part here is distinguishing
expected from unexpected regression outputs, especially in view of the
fact that some of the tests depend on database contents set up by
earlier tests.

I'm thinking that you could automatically discard the regression diff
for the specific test that got SIGTERM'd, as long as it looked like
the normal output up to the point where the "terminated by
administrator" error appears. Then what you'd have is the potential
for downstream failures due to things not being created, which
*should* fall into a fairly stylized set of possible diffs. So get the
script to throw away any diffs that exactly match ones seen
previously. Run it for awhile, and then hand-validate the set of diffs
that it's saved ... or if any of 'em look funny, report.

One gotcha I can think of is that killing the prepared_xacts test can
leave you with open 2PC transactions, which will interfere with
starting the next cycle of the tests (you have to kill them before you
can dropdb). But you could add a "rollback prepared" to the driver
script to clean out any uncommitted prepared xact.

Whether this is workable or not depends on the size of the set of
"expected" downstream-failure diffs. My gut feeling from many years of
watching regression test crashes is that it'd be large but not
completely impractical to look through by hand.

I haven't time to write something like that myself, but offhand it
seems like it could be done without more than a day or so's work,
especially if you start from the buildfarm infrastructure.

BTW, don't forget to include autovac workers in the set of SIGTERM
target candidates.

regards, tom lane
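
The diff-triage step described in the message could be sketched roughly as below. This is only an illustration under stated assumptions, not code from the thread: the names `killed_test_output_ok` and `DiffTriage` are hypothetical, the marker string is simply the phrase quoted in the message (the text a server actually emits may differ), and the diffs are assumed to arrive without timestamped file headers so that "exactly match" can be a plain text comparison.

```python
import hashlib

# Assumed marker: the phrase quoted in the message. Replace it with whatever
# error text the killed session really reports in the regression output.
TERMINATION_MARKER = "terminated by administrator"


def killed_test_output_ok(expected, actual, marker=TERMINATION_MARKER):
    """Return True if `actual` matches `expected` line for line up to the
    first line that contains the termination marker.

    Returns False if the marker never appears, since the test was then
    not visibly interrupted and its diff should not be discarded.
    """
    exp_lines = expected.splitlines()
    act_lines = actual.splitlines()
    for i, line in enumerate(act_lines):
        if marker in line:
            # Everything before the error must be the normal output.
            return act_lines[:i] == exp_lines[:i]
    return False


class DiffTriage:
    """Keep only diffs that have not been seen before (hypothetical helper)."""

    def __init__(self):
        self.seen = set()   # fingerprints of diffs already recorded
        self.saved = []     # (test_name, diff_text) pairs awaiting hand review

    def record(self, test_name, diff_text):
        """Save the diff for review unless an identical one was seen earlier.

        Returns True if the diff is new, False if it was thrown away.
        """
        key = hashlib.sha256(diff_text.encode("utf-8")).hexdigest()
        if key in self.seen:
            return False
        self.seen.add(key)
        self.saved.append((test_name, diff_text))
        return True


if __name__ == "__main__":
    expected = "CREATE TABLE\nINSERT 0 1\nSELECT 1\n"
    actual = "CREATE TABLE\nINSERT 0 1\nFATAL: terminated by administrator\n"
    print(killed_test_output_ok(expected, actual))

    triage = DiffTriage()
    triage.record("downstream_test", "- expected row\n+ ERROR: missing table\n")
    triage.record("downstream_test", "- expected row\n+ ERROR: missing table\n")
    print(len(triage.saved))
```

In a driver loop, the diff for the SIGTERM'd test would be dropped when `killed_test_output_ok` returns True, and every other diff would go through `DiffTriage.record`, leaving `saved` as the set to validate by hand.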