Fixing WAL instability in various TAP tests
From | Mark Dilger |
---|---|
Subject | Fixing WAL instability in various TAP tests |
Date | |
Msg-id | 32A1FDD1-9C7B-43B1-B3EE-49198DD3F887@enterprisedb.com |
Responses | Re: Fixing WAL instability in various TAP tests |
List | pgsql-hackers |
Hackers,

A few TAP tests in the project appear to be sensitive to reductions of the PostgresNode's max_wal_size setting, resulting in tests failing due to WAL files having been removed too soon. The failures in the logs typically are of the "requested WAL segment %s has already been removed" variety. I would expect tests which fail under legal alternate GUC settings to be hardened to explicitly set the GUCs as they need, rather than implicitly relying on the defaults. As far as missing WAL files go, I would expect the TAP test to prevent this with the use of replication slots or some other mechanism, and not simply to rely on checkpoints not happening too soon. I'm curious if others on this list disagree with that point of view.

Failures in src/test/recovery/t/015_promotion_pages.pl can be fixed by creating a physical replication slot on node "alpha" and using it from node "beta", a technique already used in other TAP tests and apparently merely overlooked in this one.

The first two tests in src/bin/pg_basebackup/t fail, and it's not clear that physical replication slots are the appropriate solution, since no replication is happening. It's not immediately obvious that the tests are at fault anyway. On casual inspection, it seems they might be detecting a live bug which simply doesn't manifest under larger values of max_wal_size. Test 010 appears to show a bug with `pg_basebackup -X`, and test 020 with `pg_receivewal`.

The test in contrib/bloom/t/ is deliberately disabled in contrib/bloom/Makefile with a comment that the test is unstable in the buildfarm, but I didn't find anything to explain what exactly those buildfarm failures might have been when I chased down the email thread that gave rise to the related commit. That test happens to be stable on my laptop until I change GUC settings to both reduce max_wal_size=32MB and to set wal_consistency_checking=all.

Thoughts?

--
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
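
For anyone wanting to reproduce the failures described in this message, the two settings mentioned for the bloom test can be written as a small configuration fragment. This is only a sketch: the file name is hypothetical, and it assumes the TAP tests in your tree pick up extra settings from a file named by the TEMP_CONFIG environment variable. If they do not, the same two lines could be appended to each test node's postgresql.conf instead.

```
# wal_instability.conf (hypothetical file name)
# Both values are taken from the message; all other GUCs stay at their defaults.
max_wal_size = 32MB
wal_consistency_checking = all
```

Both are ordinary, legal GUC values, which is the point of the message: a test that breaks under them is relying on the defaults rather than on an explicit guarantee that the WAL it needs is retained.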