Re: Loss of replication after simple misconfiguration

From	Andrew Gierth
Subject	Re: Loss of replication after simple misconfiguration
Date	9 April 2020, 16:19:09
Msg-id	878sj4skmj.fsf@news-spur.riddles.org.uk
In reply to	Loss of replication after simple misconfiguration (hubert depesz lubaczewski <depesz@depesz.com>)
Responses	Re: Loss of replication after simple misconfiguration
List	pgsql-bugs

>>>>> "hubert" == hubert depesz lubaczewski <depesz@depesz.com> writes:

 hubert> PostgreSQL 9.5.15 on Ubuntu bionic.
 [...]
 hubert> tried to restart only to be greeted by:
 hubert> 2020-04-07T15:13:49.729943+00:00 postgres[20491]: [7-1] db=,user= LOG:  restored log file "000000030001779200000061" from archive
 hubert> 2020-04-07T15:13:49.757222+00:00 postgres[20491]: [8-1] db=,user= FATAL:  could not access status of transaction 4275781146
 hubert> 2020-04-07T15:13:49.757314+00:00 postgres[20491]: [8-2] db=,user= DETAIL:  Could not read from file "pg_commit_ts/27D4B" at offset 245760: Success.
 hubert> 2020-04-07T15:13:49.757380+00:00 postgres[20491]: [8-3] db=,user= CONTEXT:  xlog redo Transaction/COMMIT: 2020-04-07 02:40:10.065859+00
 hubert> 2020-04-07T15:13:49.761239+00:00 postgres[20487]: [2-1] db=,user= LOG:  startup process (PID 20491) exited with exit code 1
 hubert> 2020-04-07T15:13:49.761387+00:00 postgres[20487]: [3-1] db=,user= LOG:  terminating any other active server processes
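
The file name and offset in the DETAIL line can be cross-checked against
the transaction ID in the FATAL line. The sketch below is only an
illustration, not code from the server: the function name
commit_ts_location is made up, and the three constants are assumptions
(the default 8 kB block size, a 10-byte commit timestamp entry, and 32
pages per SLRU segment file). Under those assumptions the arithmetic
reproduces the logged file and offset, and puts the transaction in the
first slot of its page.

```python
BLOCK_SIZE = 8192          # assumption: default BLCKSZ
ENTRY_SIZE = 10            # assumption: 8-byte timestamp + 2-byte origin id
PAGES_PER_SEGMENT = 32     # assumption: SLRU pages per segment file


def commit_ts_location(xid):
    """Return (segment file name, byte offset of the page, slot on the page)."""
    entries_per_page = BLOCK_SIZE // ENTRY_SIZE   # 819 with the values above
    page, slot = divmod(xid, entries_per_page)
    segment, page_in_segment = divmod(page, PAGES_PER_SEGMENT)
    # Segment files are named with the segment number in uppercase hex.
    return format(segment, "04X"), page_in_segment * BLOCK_SIZE, slot


print(commit_ts_location(4275781146))  # ('27D4B', 245760, 0)
```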

So I've been assisting hubert with analysis of this on IRC, and what we
have found so far suggests:

1. the max_worker_processes thing is a red herring

2. It is virtually certain that the restart, in addition to changing
max_worker_processes on the master, also changed the master's setting of
track_commit_timestamp from off to on (which is clearly relevant to the
issue)

(We established #2 from the fact that we _do_ have the WAL files from
the failed recovery, and they don't contain any COMMIT_TS_ZEROPAGE
records despite covering many thousands of transactions.)
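
A check of this kind could be sketched roughly as follows. This is not
the procedure we actually used on IRC, only an illustration. It assumes
the WAL has been dumped to text with pg_xlogdump, that each record line
has the form "rmgr: <name> ... desc: <text>", and that commit timestamp
page-zeroing records appear under the rmgr name "CommitTs" with a
description starting with "ZEROPAGE". The function names are made up.

```python
from collections import Counter


def count_records(lines):
    """Count WAL records per (rmgr name, first word of desc) in dump text."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        # Skip anything that does not look like a record line.
        if len(parts) < 2 or parts[0] != "rmgr:" or "desc:" not in line:
            continue
        desc = line.split("desc:", 1)[1].split()
        counts[(parts[1], desc[0] if desc else "")] += 1
    return counts


def has_commit_ts_zeropage(counts):
    """True if at least one assumed CommitTs/ZEROPAGE record was counted."""
    return counts[("CommitTs", "ZEROPAGE")] > 0
```

Passing the lines of the dump output to count_records and then the
result to has_commit_ts_zeropage gives a yes/no answer, and the counter
itself shows how many transaction records the same WAL range covers.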

I've suggested trying to reproduce the issue by changing this parameter
across a crash.

I did notice that 9.5.15 does have a fix for an issue in this area, but
I didn't see any more recent changes - did I miss anything?

-- 
Andrew (irc:RhodiumToad)
