Re: Recent 027_streaming_regress.pl hangs
From: Alexander Lakhin
Subject: Re: Recent 027_streaming_regress.pl hangs
Date:
Msg-id: 7e914807-eeaa-5c44-3543-bb2888030122@gmail.com
In reply to: Re: Recent 027_streaming_regress.pl hangs (Andres Freund <andres@anarazel.de>)
Responses: Re: Recent 027_streaming_regress.pl hangs (Andres Freund <andres@anarazel.de>)
List: pgsql-hackers
Hello Andres,

26.03.2024 10:59, Andres Freund wrote:
> Late, will try to look more in the next few days.

AFAICS, the last 027_streaming_regress.pl failures on calliphoridae, culicidae, and tamandua occurred before 2024-03-27:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=calliphoridae&dt=2024-03-26%2004%3A07%3A30
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2024-03-22%2013%3A26%3A21
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tamandua&dt=2024-03-24%2007%3A44%3A27

So it looks like that issue is resolved, but there is another, apparently performance-related, issue: deadlock-parallel test failures. A recent one:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=canebrake&dt=2024-04-02%2022%3A20%3A22
test deadlock-parallel ... FAILED 345099 ms
+isolationtester: canceling step d2a1 after 300 seconds
 step d2a1: <... completed>
-  sum
-------
-10000
-(1 row)
-
...

The server log shows:
2024-04-02 23:56:45.353 UTC [3583878][client backend][5/530:0] LOG: statement: SET force_parallel_mode = on;
    ...
    SELECT lock_share(3,x) FROM bigt LIMIT 1;
2024-04-02 23:56:45.364 UTC [3583876][client backend][3/2732:0] LOG: execute isolationtester_waiting: SELECT pg_catalog.pg_isolation_test_session_is_blocked($1, '{3583877,3583878,3583879,3583880}')
2024-04-02 23:56:45.364 UTC [3583876][client backend][3/2732:0] DETAIL: parameters: $1 = '3583878'
...
2024-04-02 23:57:28.967 UTC [3583876][client backend][3/5097:0] LOG: execute isolationtester_waiting: SELECT pg_catalog.pg_isolation_test_session_is_blocked($1, '{3583877,3583878,3583879,3583880}')
2024-04-02 23:57:28.967 UTC [3583876][client backend][3/5097:0] DETAIL: parameters: $1 = '3583877'
2024-04-02 23:57:29.016 UTC [3583877][client backend][4/530:0] LOG: statement: COMMIT;
2024-04-02 23:57:29.039 UTC [3583876][client backend][3/5098:0] LOG: execute isolationtester_waiting: SELECT pg_catalog.pg_isolation_test_session_is_blocked($1, '{3583877,3583878,3583879,3583880}')
2024-04-02 23:57:29.039 UTC [3583876][client backend][3/5098:0] DETAIL: parameters: $1 = '3583879'
...
2024-04-03 00:02:29.096 UTC [3583876][client backend][3/9472:0] LOG: execute isolationtester_waiting: SELECT pg_catalog.pg_isolation_test_session_is_blocked($1, '{3583877,3583878,3583879,3583880}')
2024-04-03 00:02:29.096 UTC [3583876][client backend][3/9472:0] DETAIL: parameters: $1 = '3583878'
2024-04-03 00:02:29.172 UTC [3905345][not initialized][:0] LOG: connection received: host=[local]
2024-04-03 00:02:29.240 UTC [3583878][client backend][5/530:0] ERROR: canceling statement due to user request

The duration of the last step is 00:02:29.096 - 23:57:29.039 ~ 300 seconds, which is the default max_step_wait for REL_15_STABLE and earlier (for REL_16_STABLE and later, the default was increased to 360 seconds by c99c67fc4).
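(Just to double-check the across-midnight arithmetic, the interval can be computed from the two log timestamps, e.g. with GNU date, ignoring the milliseconds:)

$ echo $(( $(date -u -d '2024-04-03 00:02:29' +%s) - $(date -u -d '2024-04-02 23:57:29' +%s) ))
300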
The average deadlock-parallel duration for REL_15_STABLE on canebrake is around 128 seconds (over the 140 runs I analyzed), but we can also find:
test deadlock-parallel ... ok 377895 ms
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=canebrake&dt=2024-03-27%2001%3A06%3A24&stg=isolation-check
test deadlock-parallel ... ok 302549 ms
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=canebrake&dt=2023-11-06%2012%3A47%3A01&stg=isolation-check
test deadlock-parallel ... ok 255045 ms
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=canebrake&dt=2023-11-09%2010%3A02%3A59&stg=isolation-check

There is a similar situation on phycodurus:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=phycodurus&dt=2024-02-11%2021:05:41
test deadlock-parallel ... FAILED 389381 ms

The average deadlock-parallel duration for REL_13_STABLE on phycodurus is around 78 seconds (over 138 recent runs), but there were also:
test deadlock-parallel ... ok 441736 ms
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=phycodurus&dt=2024-03-04%2015%3A19%3A04&stg=isolation-check
test deadlock-parallel ... ok 187844 ms
https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=phycodurus&dt=2023-11-03%2016%3A13%3A46&stg=isolation-check

And also on pogona, REL_14_STABLE:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=pogona&dt=2024-02-20%2003%3A50%3A49
test deadlock-parallel ... FAILED 425482 ms

(I could reach a similar duration on a slowed-down VM with JIT enabled, as on these animals.)

So maybe these machines require a larger PGISOLATIONTIMEOUT, or perhaps there is still some OS/environment issue there?
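(isolationtester reads that limit, in seconds, from the environment, so if raising it is the way to go, the owners could run the isolation tests with something like the following; 600 here is just an illustrative value, and on a buildfarm animal it would presumably be set via the animal's build_env:)

PGISOLATIONTIMEOUT=600 make -C src/test/isolation installcheck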
Best regards,
Alexander