Discussion: reducing isolation tests runtime
On the subject of total test time, we could parallelize the isolation tests.  Right now "make check" in src/test/isolation takes 1:16 on my machine.  Test "timeouts" takes a full 40s of that, with nothing running in parallel -- the machine is completely idle.

Seems like we can win a lot of that time back just by changing the schedule to use multiple tests per line (in particular, putting the other slow tests together with timeouts), per the attached; with this new schedule, isolation takes 44 seconds on my machine -- a win of 32 seconds.  We can win a couple of additional seconds by grouping a few other lines, but this is the biggest win.

(This needs to be adjusted because some table names in the specs conflict.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
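[For readers who haven't looked at the schedule files: a minimal sketch of the kind of change being proposed, assuming the standard pg_regress schedule syntax.  The grouping shown is illustrative, not the exact grouping in the attached patch.  Each "test:" line is a group; all specs named on the same line run concurrently, and groups run one after another.]

    # before: every spec runs by itself, one group per line
    test: timeouts
    test: tuplelock-update
    test: deadlock-hard

    # after: the slow specs share one parallel group and overlap their sleeps
    test: timeouts tuplelock-update deadlock-hard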
On Wed, Jan 24, 2018 at 6:10 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> On the subject of total test time, we could parallelize the isolation tests.  Right now "make check" in src/test/isolation takes 1:16 on my machine.  Test "timeouts" takes a full 40s of that, with nothing running in parallel -- the machine is completely idle.
>
> Seems like we can win a lot of that time back just by changing the schedule to use multiple tests per line (in particular, putting the other slow tests together with timeouts), per the attached; with this new schedule, isolation takes 44 seconds on my machine -- a win of 32 seconds.  We can win a couple of additional seconds by grouping a few other lines, but this is the biggest win.
>
> (This needs to be adjusted because some table names in the specs conflict.)

Oh, cool.  Yes, the time the isolation tests take to run is quite annoying.  I didn't realize it would be so easy to run it in parallel.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes:
> On Wed, Jan 24, 2018 at 6:10 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>> On the subject of total test time, we could parallelize the isolation tests.

> Oh, cool.  Yes, the time the isolation tests take to run is quite annoying.  I didn't realize it would be so easy to run it in parallel.

+1 to both --- I hadn't realized we had enough infrastructure to do this in parallel, either.

			regards, tom lane
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> On the subject of total test time, we could parallelize the isolation tests.  Right now "make check" in src/test/isolation takes 1:16 on my machine.  Test "timeouts" takes a full 40s of that, with nothing running in parallel -- the machine is completely idle.

BTW, one small issue there is that the reason the timeouts test is so slow is that we have to use multi-second timeouts to be sure slower buildfarm critters (eg valgrind animals) will get the expected results.  So I'm worried that if the machine isn't otherwise idle, we will get random failures.

We could parallelize the rest of those tests and leave timeouts in its own group.  That cuts the payback a lot :-( but might still be worth doing.  Or maybe tweak things so that the buildfarm runs a serial schedule but manual testing doesn't.  Or we could debate how important the timeout tests really are ... or think harder about how to make them reproducible.

			regards, tom lane
Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > On the subject of total test time, we could parallelize the isolation tests.  Right now "make check" in src/test/isolation takes 1:16 on my machine.  Test "timeouts" takes a full 40s of that, with nothing running in parallel -- the machine is completely idle.
>
> BTW, one small issue there is that the reason the timeouts test is so slow is that we have to use multi-second timeouts to be sure slower buildfarm critters (eg valgrind animals) will get the expected results.  So I'm worried that if the machine isn't otherwise idle, we will get random failures.

I think we could solve this by putting in the same parallel group only slow tests that mostly sleep, i.e. nothing that would monopolize the CPU for long enough to cause a problem.  Concretely:

    test: timeouts tuplelock-update deadlock-hard deadlock-soft-2

All of these tests have lots of sleeps and don't go through a lot of data.  (Compared to the previous patch, I removed alter-table-1, which uses a thousand tuples, and multiple-row-versions, which uses 100k; I also removed receipt-report, which uses a large number of permutations.)

Timings:
    timeouts          40.3s
    tuplelock-update  10.5s
    deadlock-hard     10.9s
    deadlock-soft-2    5.4s

alter-table-1 takes 1.5s, receipt-report 1.2s, and there's nothing else that takes above 1s, so I think this is good enough -- we can still have the whole thing run in ~45 seconds without the hazard you describe.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Tom Lane wrote:
>> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
>>> On the subject of total test time, we could parallelize the isolation tests.

>> BTW, one small issue there is that the reason the timeouts test is so slow is that we have to use multi-second timeouts to be sure slower buildfarm critters (eg valgrind animals) will get the expected results.  So I'm worried that if the machine isn't otherwise idle, we will get random failures.

> I think we could solve this by putting in the same parallel group only slow tests that mostly sleep, i.e. nothing that would monopolize the CPU for long enough to cause a problem.  Concretely:
> test: timeouts tuplelock-update deadlock-hard deadlock-soft-2

OK, but there'd better be a comment there explaining the concern very precisely, or somebody will break it.

			regards, tom lane
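[To make "a comment there explaining the concern" concrete: a hypothetical example of the kind of warning comment that could sit above the group in isolation_schedule.  The wording is the editor's sketch, not the comment that was eventually committed.]

    # WARNING: the timeouts spec depends on multi-second timeouts firing
    # (or not firing) within their expected windows.  Group it only with
    # specs that mostly sleep; anything CPU-hungry in this group can delay
    # the timeouts enough to cause spurious failures on slow buildfarm
    # animals, e.g. valgrind ones.
    test: timeouts tuplelock-update deadlock-hard deadlock-soft-2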
Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > I think we could solve this by putting in the same parallel group only slow tests that mostly sleep, i.e. nothing that would monopolize the CPU for long enough to cause a problem.  Concretely:
> > test: timeouts tuplelock-update deadlock-hard deadlock-soft-2
>
> OK, but there'd better be a comment there explaining the concern very precisely, or somebody will break it.

Here's a concrete proposal.  Runtime is 45.7 seconds on my laptop.  It can be further reduced, but not by more than a second or two unless you get into the business of modifying other tests.  (I only modified deadlock-soft-2 because that saves 5 seconds.)

Admittedly the new isolation_schedule file is a bit ugly.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> Here's a concrete proposal.  Runtime is 45.7 seconds on my laptop.  It can be further reduced, but not by more than a second or two unless you get into the business of modifying other tests.  (I only modified deadlock-soft-2 because that saves 5 seconds.)

Looks reasonable to me, but do we want to set any particular convention about the max number of tests to run in parallel?  If so there should be at least a comment saying what.

> Admittedly the new isolation_schedule file is a bit ugly.

Meh, seems fine.  We won't know if this really works till it hits the buildfarm, I suppose.

			regards, tom lane
Tom Lane wrote:
> Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > Here's a concrete proposal.  Runtime is 45.7 seconds on my laptop.  It can be further reduced, but not by more than a second or two unless you get into the business of modifying other tests.  (I only modified deadlock-soft-2 because that saves 5 seconds.)
>
> Looks reasonable to me, but do we want to set any particular convention about the max number of tests to run in parallel?  If so there should be at least a comment saying what.

Hmm, I ran this with a limited number of connections and found that it fails with fewer than 27; and there's no MAX_CONNECTIONS like there is for pg_regress.  So I'll put this back on the drawing board until I'm back from vacation ...

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
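[For context: MAX_CONNECTIONS is the make variable the main regression suite already honors to cap how many tests run at once; at this point the isolation suite's Makefile had no equivalent knob.  Slow buildfarm hosts invoke the main tests roughly like this -- the value 10 is only an example.]

    # main regression tests: never run more than 10 test scripts concurrently
    make MAX_CONNECTIONS=10 check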
Hi,

On 2018-01-25 18:27:28 -0300, Alvaro Herrera wrote:
> Tom Lane wrote:
> > Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > > Here's a concrete proposal.  Runtime is 45.7 seconds on my laptop.  It can be further reduced, but not by more than a second or two unless you get into the business of modifying other tests.  (I only modified deadlock-soft-2 because that saves 5 seconds.)
> >
> > Looks reasonable to me, but do we want to set any particular convention about the max number of tests to run in parallel?  If so there should be at least a comment saying what.
>
> Hmm, I ran this with a limited number of connections and found that it fails with fewer than 27; and there's no MAX_CONNECTIONS like there is for pg_regress.  So I'll put this back on the drawing board until I'm back from vacation ...

I'd like to see this revived; I'm getting a bit tired of waiting longer and longer for isolationtester to complete.  Is it really a problem that we require a certain number of connections?  Something on the order of 30-50 connections ought not to be a real problem for realistic machines, and if it is a problem for some machine, it can use a serialized schedule?

Greetings,

Andres Freund
On 2018-Dec-04, Andres Freund wrote:
> Hi,
>
> I'd like to see this revived; I'm getting a bit tired of waiting longer and longer for isolationtester to complete.  Is it really a problem that we require a certain number of connections?  Something on the order of 30-50 connections ought not to be a real problem for realistic machines, and if it is a problem for some machine, it can use a serialized schedule?

Hello,

Yeah, me too.  Let me see about it.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andres Freund <andres@anarazel.de> writes:
> I'd like to see this revived; I'm getting a bit tired of waiting longer and longer for isolationtester to complete.  Is it really a problem that we require a certain number of connections?  Something on the order of 30-50 connections ought not to be a real problem for realistic machines, and if it is a problem for some machine, it can use a serialized schedule?

The longstanding convention in the main regression tests is 20 max.  Is there a reason to be different here?

			regards, tom lane
Hi,

On 2018-12-04 15:45:39 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I'd like to see this revived; I'm getting a bit tired of waiting longer and longer for isolationtester to complete.  Is it really a problem that we require a certain number of connections?  Something on the order of 30-50 connections ought not to be a real problem for realistic machines, and if it is a problem for some machine, it can use a serialized schedule?
>
> The longstanding convention in the main regression tests is 20 max.  Is there a reason to be different here?

It's a bit less obvious from the outside how many connections a test spawns -- IOW, it might be easier to maintain the schedule if the cap isn't as tight.  And I'm doubtful that there's a good reason for the 20 limit these days, so going a bit higher seems reasonable.

Greetings,

Andres Freund
Hi,

On 2018-01-25 17:34:15 -0300, Alvaro Herrera wrote:
> Tom Lane wrote:
> > Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> > > I think we could solve this by putting in the same parallel group only slow tests that mostly sleep, i.e. nothing that would monopolize the CPU for long enough to cause a problem.  Concretely:
> > > test: timeouts tuplelock-update deadlock-hard deadlock-soft-2
> >
> > OK, but there'd better be a comment there explaining the concern very precisely, or somebody will break it.
>
> Here's a concrete proposal.  Runtime is 45.7 seconds on my laptop.  It can be further reduced, but not by more than a second or two unless you get into the business of modifying other tests.  (I only modified deadlock-soft-2 because that saves 5 seconds.)

I'm working on an updated version of this.  Adding the new tests is a bit painful because conflicting table names make it harder than necessary to schedule tests.  While it's possible to work out a schedule that doesn't conflict, it's pretty annoying to do and, more importantly, seems fragile -- it's very easy to create schedules that succeed on one machine but not on another, depending on how slow particular tests are.

I'm more inclined to be a bit more aggressive in renaming tables -- there's not much point in having a lot of "foo"s around.  So I'm inclined to rename some of the names that are more likely to conflict.  If we agree on doing that, I'd like to do it first, and commit it more aggressively than the schedule itself.

An alternative approach would be to have isolationtester automatically create a schema named after the spec file and place it in the search path.  But that would make it impossible to use isolationtester against a standby -- which I think we currently don't do, but which would probably be a good idea.

With regard to the schedule, I'm inclined to order it so that faster test groups run earlier, just to make it more likely that the tests one is debugging are reached sooner.  Does that sound sane?

Do we want to maintain a serial version of the schedule too?  I'm wondering if we should just generate it for both the isolationtester and plain regression tests, either by adding an option to pg_regress that serializes test groups, or by generating the serial schedule file with a few lines of Perl.

Greetings,

Andres Freund
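[To make the schema-per-spec alternative concrete: a hypothetical sketch -- the idea was only floated here, not implemented -- of the SQL isolationtester would issue before running a spec named, say, deadlock-hard.spec, so that identically named tables in different specs stop colliding.]

    -- hypothetical per-spec setup; the schema name mirrors the spec file name
    CREATE SCHEMA "deadlock-hard";
    SET search_path = "deadlock-hard", public;

The standby objection follows from the first statement: it is a write, and a hot standby rejects writes, so isolationtester could not perform this setup when pointed at a read-only server.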
Andres Freund <andres@anarazel.de> writes:
> I'm working on an updated version of this.  Adding the new tests is a bit painful because conflicting table names make it harder than necessary to schedule tests.  While it's possible to work out a schedule that doesn't conflict, it's pretty annoying to do and, more importantly, seems fragile -- it's very easy to create schedules that succeed on one machine but not on another, depending on how slow particular tests are.

> I'm more inclined to be a bit more aggressive in renaming tables -- there's not much point in having a lot of "foo"s around.  So I'm inclined to rename some of the names that are more likely to conflict.  If we agree on doing that, I'd like to do it first, and commit it more aggressively than the schedule itself.

+1

> Do we want to maintain a serial version of the schedule too?

Some of the slower buildfarm critters use MAX_CONNECTIONS to limit the load on their hosts.  As long as the isolation tests honor that, I don't see a real need for a separate serial schedule.

(We've talked about retiring the serial schedule for the main regression tests, and while that trigger's not been pulled yet, I think it's just a matter of time.  So making the isolation tests follow that precedent seems wrong anyway.)

			regards, tom lane
On 2019-Feb-13, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > I'm working on an updated version of this.  Adding the new tests is a bit painful because conflicting table names make it harder than necessary to schedule tests.  While it's possible to work out a schedule that doesn't conflict, it's pretty annoying to do and, more importantly, seems fragile -- it's very easy to create schedules that succeed on one machine but not on another, depending on how slow particular tests are.
>
> > I'm more inclined to be a bit more aggressive in renaming tables -- there's not much point in having a lot of "foo"s around.  So I'm inclined to rename some of the names that are more likely to conflict.  If we agree on doing that, I'd like to do it first, and commit it more aggressively than the schedule itself.
>
> +1

+1

(Using separate schemas sounds like a useful idea if we accumulate dozens of tests, so I suggest that we do that for future tests; for the time being I wouldn't bother.)

> > Do we want to maintain a serial version of the schedule too?
>
> Some of the slower buildfarm critters use MAX_CONNECTIONS to limit the load on their hosts.  As long as the isolation tests honor that, I don't see a real need for a separate serial schedule.

MAX_CONNECTIONS was the only reason I didn't push this through.  Do you (Andres) have any solution for that?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> On 2019-Feb-13, Tom Lane wrote:
>> Some of the slower buildfarm critters use MAX_CONNECTIONS to limit the load on their hosts.  As long as the isolation tests honor that, I don't see a real need for a separate serial schedule.

> MAX_CONNECTIONS was the only reason I didn't push this through.  Do you (Andres) have any solution for that?

Doesn't the common pg_regress.c infrastructure handle that?  We might need to improve isolation_main.c and/or the isolation Makefile to make it accessible.

I suppose that in what I'm thinking about, MAX_CONNECTIONS would be interpreted as "max number of concurrent isolation scripts", which is not exactly the number of connections.  A quick and dirty answer would be to have isolation_main.c divide the limit by a factor of 4 or so.

			regards, tom lane
On 2019-02-13 10:58:50 -0500, Tom Lane wrote:
> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> > On 2019-Feb-13, Tom Lane wrote:
> >> Some of the slower buildfarm critters use MAX_CONNECTIONS to limit the load on their hosts.  As long as the isolation tests honor that, I don't see a real need for a separate serial schedule.
>
> > MAX_CONNECTIONS was the only reason I didn't push this through.  Do you (Andres) have any solution for that?
>
> Doesn't the common pg_regress.c infrastructure handle that?  We might need to improve isolation_main.c and/or the isolation Makefile to make it accessible.
>
> I suppose that in what I'm thinking about, MAX_CONNECTIONS would be interpreted as "max number of concurrent isolation scripts", which is not exactly the number of connections.  A quick and dirty answer would be to have isolation_main.c divide the limit by a factor of 4 or so.

I guess that could work, although it's certainly not too pretty.  Alternatively we could pre-parse the spec files, but that's a bit annoying given that isolationtester.c is a separate C file...

Do you have an idea why we have both max_concurrent_tests *and* max_connections in pg_regress?  ISTM the former isn't really useful given the latter?

Greetings,

Andres Freund
Andres Freund <andres@anarazel.de> writes:
> Do you have an idea why we have both max_concurrent_tests *and* max_connections in pg_regress?  ISTM the former isn't really useful given the latter?

No, the former is a static restriction on what the schedule file is allowed to contain; the latter is a dynamic restriction (that typically is unlimited anyway).

			regards, tom lane
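[Roughly, the distinction being drawn, shown as an illustrative pg_regress invocation.  The option spellings are recalled from pg_regress's command line rather than taken from this thread, and the numbers and trailing arguments are only placeholders.]

    # static check: refuse to run if any schedule line names more than 20 tests
    # dynamic limit: run at most 10 of a group's tests at any one time
    pg_regress --schedule=isolation_schedule \
               --max-concurrent-tests=20 --max-connections=10 ...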
Hi,

On 2019-02-13 12:41:41 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Do you have an idea why we have both max_concurrent_tests *and* max_connections in pg_regress?  ISTM the former isn't really useful given the latter?
>
> No, the former is a static restriction on what the schedule file is allowed to contain; the latter is a dynamic restriction (that typically is unlimited anyway).

Right, but why don't we allow more tests in a group, and then use a default max_connections to limit concurrency?  Having larger groups is advantageous wrt test runtime -- it reduces the number of artificial serialization points where the slowest test slows things down.  Obviously a few groups are still needed for test interdependency management, but that's comparatively rare.  We have plenty of groups that are broken up just to stay below max_concurrent_tests.

Greetings,

Andres Freund
Andres Freund <andres@anarazel.de> writes:
> Right, but why don't we allow more tests in a group, and then use a default max_connections to limit concurrency?  Having larger groups is advantageous wrt test runtime -- it reduces the number of artificial serialization points where the slowest test slows things down.  Obviously a few groups are still needed for test interdependency management, but that's comparatively rare.  We have plenty of groups that are broken up just to stay below max_concurrent_tests.

Meh.  That would also greatly increase the scope for hard-to-reproduce conflicts between concurrent tests.  I'm not especially excited about going there.

			regards, tom lane