Timeout control within tests
От | Noah Misch |
---|---|
Тема | Timeout control within tests |
Дата | |
Msg-id | 20220218052842.GA3627003@rfd.leadboat.com обсуждение исходный текст |
Ответы |
Re: Timeout control within tests
|
Список | pgsql-hackers |
On Fri, Feb 05, 2021 at 03:55:20PM -0500, Tom Lane wrote: > We have, almost invariably, regretted it when we tried to use short > timeouts in test cases. > More generally, sometimes people want to do things like run a test > under valgrind. So it's not just "underpowered machines" that may > need a generous timeout. Even if we did reduce the default, I'd > want a way (probably via an environment variable, cf PGCTLTIMEOUT) > to kick it back up. I have a few use cases for that: 1. My buildfarm members have more and more competition for CPU and I/O. Examples where I suspect animal slowness caused a 180s timeout: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=topminnow&dt=2021-04-11%2003%3A11%3A39 https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern&dt=2021-07-26%2004%3A38%3A29 2. When I'm developing a change and I locally break a test in a way that leads to a timeout, I like to be able to lower that timeout. 3. I want more tests to use the right timeout from the start. Low-timeout tests appear at least a few times per year: d03eeab Mon May 31 00:29:58 2021 -0700 Raise a timeout to 180s, in test 010_logical_decoding_timelines.pl. 388b959 Sat Feb 27 07:02:56 2021 -0800 Raise a timeout to 180s, in contrib/test_decoding. 08dde1b Tue Dec 22 11:10:12 2020 -0500 Increase timeout in 021_row_visibility.pl. 8961355 Sat Apr 25 18:45:27 2020 -0700 Raise a timeout to 180s, in test 003_recovery_targets.pl. 1db439a Mon Dec 10 20:15:42 2018 -0800 Raise some timeouts to 180s, in test code. I propose to have environment variable PG_TEST_TIMEOUT_DEFAULT control the timeout used in the places that currently hard-code 180s. TAP tests should retrieve the value via $PostgreSQL::Test::Utils::timeout_default. pg_regress tests should retrieve it via \getenv. I would like to back-patch the TAP part, for cause (1). (The pg_regress part hasn't been a buildfarm problem, and \getenv is new in v15.) Patches attached. I considered and excluded other changes, for now: a. I considered consolidating this with PGISOLATIONTIMEOUT (default 300). One could remove the older variable entirely or make isolationtester use the first-available of [PGISOLATIONTIMEOUT, 2 * PG_TEST_TIMEOUT_DEFAULT, 360]. Does anyone have an opinion on what, if anything, to do there? b. I briefly made stats.sql accept PG_TEST_TIMEOUT_DEFAULT to override its hard-coded 30s timeout. However, a higher timeout won't help when a UDP buffer fills. If the test were structured to observe evidence of a vacant UDP buffer before proceeding with the test stat messages, a higher timeout could make more sense. I added a comment. c. One could remove timeout-duration function arguments (e.g. from pg_recvlogical_upto) and just have the function consult timeout_default. This felt like highly-optional refactoring.
Вложения
В списке pgsql-hackers по дате отправления: