To what extent should tests rely on VACUUM ANALYZE?
From | Alexander Lakhin |
---|---|
Subject | To what extent should tests rely on VACUUM ANALYZE? |
Date | |
Msg-id | 66eb9a6e-fc67-a230-c5b1-2a741e8b88c6@gmail.com |
Responses |
Re: To what extent should tests rely on VACUUM ANALYZE?
Re: To what extent should tests rely on VACUUM ANALYZE? |
List | pgsql-hackers |
Hello hackers,

When running multiple 027_stream_regress.pl test instances in parallel
(and with aggressive autovacuum) on a rather slow machine, I encountered
test failures caused by instability of the subselect test, just like the
following buildfarm failures:

1) https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grassquit&dt=2024-03-27%2010%3A16%3A12

--- /home/bf/bf-build/grassquit/HEAD/pgsql/src/test/regress/expected/subselect.out 2024-03-19 22:20:34.435867114 +0000
+++ /home/bf/bf-build/grassquit/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/subselect.out 2024-03-27 10:28:38.185776605 +0000
@@ -2067,16 +2067,16 @@
                    QUERY PLAN
 -------------------------------------------------
  Hash Join
-   Hash Cond: (c.odd = b.odd)
+   Hash Cond: (c.hundred = a.hundred)
    ->  Hash Join
-         Hash Cond: (a.hundred = c.hundred)
-         ->  Seq Scan on tenk1 a
+         Hash Cond: (b.odd = c.odd)
+         ->  Seq Scan on tenk2 b
          ->  Hash
                ->  HashAggregate
                      Group Key: c.odd, c.hundred
                      ->  Seq Scan on tenk2 c
    ->  Hash
-         ->  Seq Scan on tenk2 b
+         ->  Seq Scan on tenk1 a
 (11 rows)

2) https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon&dt=2024-03-27%2009%3A49%3A38

(That query was added recently (by 9f1337639 from 2023-02-15) and the
failure evidently depends on timing, so the number of failures I could
find on the buildfarm is moderate for now.)
With the subselect test modified as in the attached patch, I could see
what makes the plan change:

-                     ->  Seq Scan on public.tenk2 c  (cost=0.00..445.00 rows=10000 width=8)
+                     ->  Seq Scan on public.tenk2 c  (cost=0.00..444.95 rows=9995 width=8)

  relname | relpages | reltuples | autovacuum_count | autoanalyze_count
 ---------+----------+-----------+------------------+-------------------
- tenk2   |      345 |     10000 |                0 |                 0
+ tenk2   |      345 |      9995 |                0 |                 0

Using the trick Thomas proposed in [1] (see my modification attached),
I could reproduce the failure easily on my workstation with no specific
conditions:

2024-03-28 14:05:13.792 UTC client backend[2358012] pg_regress/test_setup LOG:  !!!ConditionalLockBufferForCleanup() returning false
2024-03-28 14:05:13.792 UTC client backend[2358012] pg_regress/test_setup CONTEXT:  while scanning block 29 of relation "public.tenk2"
2024-03-28 14:05:13.792 UTC client backend[2358012] pg_regress/test_setup STATEMENT:  VACUUM ANALYZE tenk2;
...
  relname | relpages | reltuples | autovacuum_count | autoanalyze_count
 ---------+----------+-----------+------------------+-------------------
- tenk2   |      345 |     10000 |                0 |                 0
+ tenk2   |      345 |      9996 |                0 |                 0
 (1 row)

So this looks to me like a possible cause of the failure, and I wonder
whether query plan checks should be immune to such changes, or whether
the results of VACUUM ANALYZE should be 100% stable?

[1] https://www.postgresql.org/message-id/CA%2BhUKGKYNHmL_DhmVRiidHv6YLAL8jViifwwn2ABY__Y3BCphg%40mail.gmail.com

Best regards,
Alexander
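For reference, the cost shift in the two "Seq Scan on public.tenk2 c" lines is consistent with the planner's sequential-scan cost estimate for an unfiltered scan, assuming the default settings seq_page_cost = 1.0 and cpu_tuple_cost = 0.01. A minimal sketch of that arithmetic follows; the helper name seq_scan_cost is purely illustrative and is not a PostgreSQL function:

```python
# Sketch of the sequential-scan cost estimate for a scan with no filter quals.
# Assumption: default planner constants seq_page_cost = 1.0, cpu_tuple_cost = 0.01.
# seq_scan_cost is a hypothetical helper for illustration, not a PostgreSQL API.

def seq_scan_cost(relpages, reltuples, seq_page_cost=1.0, cpu_tuple_cost=0.01):
    """Total cost = pages read * seq_page_cost + tuples processed * cpu_tuple_cost."""
    return relpages * seq_page_cost + reltuples * cpu_tuple_cost

# relpages = 345 in both cases; only reltuples differs (10000 vs 9995).
for reltuples in (10000, 9995):
    print(f"reltuples={reltuples}: cost={seq_scan_cost(345, reltuples):.2f}")
```

With relpages = 345, this gives 445.00 for reltuples = 10000 and 444.95 for reltuples = 9995, matching the two plan lines, so the 0.05 difference follows directly from reltuples dropping by five.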