>>>>> "hubert" == hubert depesz lubaczewski <depesz@depesz.com> writes:
hubert> PostgreSQL 9.5.15 on Ubuntu bionic.
[...]
hubert> tried to restart only to be greeted by:
hubert> 2020-04-07T15:13:49.729943+00:00 postgres[20491]: [7-1] db=,user= LOG: restored log file "000000030001779200000061" from archive
hubert> 2020-04-07T15:13:49.757222+00:00 postgres[20491]: [8-1] db=,user= FATAL: could not access status of transaction 4275781146
hubert> 2020-04-07T15:13:49.757314+00:00 postgres[20491]: [8-2] db=,user= DETAIL: Could not read from file "pg_commit_ts/27D4B" at offset 245760: Success.
hubert> 2020-04-07T15:13:49.757380+00:00 postgres[20491]: [8-3] db=,user= CONTEXT: xlog redo Transaction/COMMIT: 2020-04-07 02:40:10.065859+00
hubert> 2020-04-07T15:13:49.761239+00:00 postgres[20487]: [2-1] db=,user= LOG: startup process (PID 20491) exited with exit code 1
hubert> 2020-04-07T15:13:49.761387+00:00 postgres[20487]: [3-1] db=,user= LOG: terminating any other active server processes

So I've been assisting hubert with analysis of this on IRC, and what we
have found so far suggests:

1. the max_worker_processes thing is a red herring

2. It is virtually certain that the restart, in addition to changing
max_worker_processes on the master, also changed the master's setting of
track_commit_timestamp from off to on (which is clearly relevant to the
issue)

(We established #2 from the fact that we _do_ have the WAL files from
the failed recovery, and they don't contain any COMMIT_TS_ZEROPAGE
records despite covering many thousands of transactions.)
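
As a sanity check on the error itself, the file name and offset in the
DETAIL line can be derived from the xid in the FATAL line. The sketch
below assumes the default 8192-byte block size, 10-byte commit timestamp
entries (8-byte timestamp plus 2-byte origin id) and 32 pages per SLRU
segment; those constants and the function names are assumptions for
illustration, not values read from hubert's installation. The second
function counts how many xids in a range are the first entry on a page,
which is roughly how many zero-page records one would expect to see in
that range if commit timestamp tracking had been on throughout.

```python
# Assumed build constants (PostgreSQL defaults); not taken from the affected server.
BLCKSZ = 8192                          # bytes per SLRU page
ENTRY_SIZE = 10                        # 8-byte timestamp + 2-byte origin id
XACTS_PER_PAGE = BLCKSZ // ENTRY_SIZE  # 819 entries per page
PAGES_PER_SEGMENT = 32                 # SLRU pages per segment file


def locate_commit_ts(xid):
    """Return (segment file name, byte offset of the page, entry index on the page)."""
    page = xid // XACTS_PER_PAGE
    entry = xid % XACTS_PER_PAGE
    segment = page // PAGES_PER_SEGMENT
    offset = (page % PAGES_PER_SEGMENT) * BLCKSZ
    return format(segment, "04X"), offset, entry


def pages_started(first_xid, last_xid):
    """Count xids in [first_xid, last_xid] that are the first entry on a page."""
    first_page = -(-first_xid // XACTS_PER_PAGE)  # ceiling division
    last_page = last_xid // XACTS_PER_PAGE
    return max(0, last_page - first_page + 1)


print(locate_commit_ts(4275781146))  # -> ('27D4B', 245760, 0)
```

Under these assumptions the result matches the log (pg_commit_ts/27D4B,
offset 245760), and the xid happens to be entry 0 of its page, i.e. the
first xid that would need that page to exist.
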

I've suggested trying to reproduce the issue by changing this parameter
across a crash.

I did notice that 9.5.15 does have a fix for an issue in this area, but
I didn't see any more recent changes - did I miss anything?

--
Andrew (irc:RhodiumToad)