Обсуждение: coverage increase for worker_spi
Tom pointed out that coverage for worker_spi is 0%. For a module that only exists to provide coverage, that's pretty stupid. This patch increases coverage to 90.9% line-wise and 100% function-wise, which seems like a sufficient starting point. How would people feel about me getting this in master at this point in the cycle, it being just some test code? We can easily revert if it seems too unstable. -- Álvaro Herrera
Вложения
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Tom pointed out that coverage for worker_spi is 0%. For a module that > only exists to provide coverage, that's pretty stupid. This patch > increases coverage to 90.9% line-wise and 100% function-wise, which > seems like a sufficient starting point. > How would people feel about me getting this in master at this point in > the cycle, it being just some test code? We can easily revert if > it seems too unstable. I'm not opposed to adding a new test case at this point in the cycle, but as written this one seems more or less guaranteed to fail under load. You can't just sleep for worker_spi.naptime and expect that the worker will certainly have run. Perhaps you could use a plpgsql DO block with a loop to wait up to X seconds until the expected state appears, for X around 120 to 180 seconds (compare poll_query_until in the TAP tests). regards, tom lane
On 2019-May-29, Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Tom pointed out that coverage for worker_spi is 0%. For a module that > > only exists to provide coverage, that's pretty stupid. This patch > > increases coverage to 90.9% line-wise and 100% function-wise, which > > seems like a sufficient starting point. > > > How would people feel about me getting this in master at this point in > > the cycle, it being just some test code? We can easily revert if > > it seems too unstable. > > I'm not opposed to adding a new test case at this point in the cycle, > but as written this one seems more or less guaranteed to fail under > load. True. Here's a version that should be more resilient. One thing I noticed while writing it, though, is that worker_spi uses the postgres database, instead of the contrib_regression database that was created for it. And we create a schema and a table there. This is going to get some eyebrows raised, I think, so I'll look into fixing that as a bugfix before getting this commit in. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > On 2019-May-29, Tom Lane wrote: >> I'm not opposed to adding a new test case at this point in the cycle, >> but as written this one seems more or less guaranteed to fail under >> load. > True. Here's a version that should be more resilient. Hm, I don't understand how this works at all: + PERFORM pg_sleep(CASE WHEN count(*) = 0 THEN 0 ELSE 0.1 END) + FROM schema1.counted WHERE type = 'delta'; + GET DIAGNOSTICS count = ROW_COUNT; Given that it uses an aggregate, the ROW_COUNT must always be 1, no? regards, tom lane
On 2019-May-30, Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > On 2019-May-29, Tom Lane wrote: > >> I'm not opposed to adding a new test case at this point in the cycle, > >> but as written this one seems more or less guaranteed to fail under > >> load. > > > True. Here's a version that should be more resilient. > > Hm, I don't understand how this works at all: > > + PERFORM pg_sleep(CASE WHEN count(*) = 0 THEN 0 ELSE 0.1 END) > + FROM schema1.counted WHERE type = 'delta'; > + GET DIAGNOSTICS count = ROW_COUNT; > > Given that it uses an aggregate, the ROW_COUNT must always be 1, no? Well, I was surprised to see the count(*) work there as an argument for pg_sleep there at all frankly (maybe we are sleeping 0.1s more than we really need, per your observation), but the row_count is concerned with rows that have type = 'delta', which are deleted by the bgworker. So the test script job is done when the bgworker has run once through its loop. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > On 2019-May-30, Tom Lane wrote: >> Hm, I don't understand how this works at all: >> >> + PERFORM pg_sleep(CASE WHEN count(*) = 0 THEN 0 ELSE 0.1 END) >> + FROM schema1.counted WHERE type = 'delta'; >> + GET DIAGNOSTICS count = ROW_COUNT; >> >> Given that it uses an aggregate, the ROW_COUNT must always be 1, no? > Well, I was surprised to see the count(*) work there as an argument for > pg_sleep there at all frankly (maybe we are sleeping 0.1s more than we > really need, per your observation), but the row_count is concerned with > rows that have type = 'delta', which are deleted by the bgworker. So > the test script job is done when the bgworker has run once through its > loop. No, the row_count is going to report the number of rows returned by the aggregate query, which is going to be one row, independently of how many rows went into the aggregate. regression=# do $$ declare c int; begin perform count(*) from tenk1; get diagnostics c = row_count; raise notice 'c = %', c; end$$; psql: NOTICE: c = 1 DO regression=# do $$ declare c int; begin perform count(*) from tenk1 where false; get diagnostics c = row_count; raise notice 'c = %', c; end$$; psql: NOTICE: c = 1 DO I think you want to capture the actual aggregate output rather than relying on row_count: regression=# do $$ declare c int; begin c := count(*) from tenk1; raise notice 'c = %', c; end$$; psql: NOTICE: c = 10000 DO regression=# do $$ declare c int; begin c := count(*) from tenk1 where false; raise notice 'c = %', c; end$$; psql: NOTICE: c = 0 DO regards, tom lane
On 2019-May-30, Alvaro Herrera wrote: > One thing I noticed while writing it, though, is that worker_spi uses > the postgres database, instead of the contrib_regression database that > was created for it. And we create a schema and a table there. This is > going to get some eyebrows raised, I think, so I'll look into fixing > that as a bugfix before getting this commit in. Another thing I noticed when fixing *this*, in turn, is that if you load worker_spi in shared_preload_libraries then the contrib_regression database doesn't exist by the point that runs, so those workers fail to start. The dynamic one does start in the configured database. I guess we could just ignore the failures and just rely on the dynamic worker. I ended up with these two patches. I'm not sure about pushing separately. It seems pointless to backport the "fix" to back branches anyway. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Вложения
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > I ended up with these two patches. I'm not sure about pushing > separately. It seems pointless to backport the "fix" to back branches > anyway. Patch passes the eyeball test, though I did not try to run it. I concur with squashing into one commit and applying to HEAD only. regards, tom lane
On 2019-Jun-01, Tom Lane wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > I ended up with these two patches. I'm not sure about pushing > > separately. It seems pointless to backport the "fix" to back branches > > anyway. > > Patch passes the eyeball test, though I did not try to run it. > I concur with squashing into one commit and applying to HEAD only. Okay, pushed. Let's see how it does, now. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services