BUG #10142: Downstream standby indefinitely waits for an old WAL log in new timeline on WAL Cascading replicatio

From	skeefe@rdx.com
Subject	BUG #10142: Downstream standby indefinitely waits for an old WAL log in new timeline on WAL Cascading replicatio
Date	26 April 2014, 03:10:57
Msg-id	20140425174336.2721.61539@wrigleys.postgresql.org
Responses	Re: BUG #10142: Downstream standby indefinitely waits for an old WAL log in new timeline on WAL Cascading replicatio
List	pgsql-bugs

The following bug has been logged on the website:

Bug reference:      10142
Logged by:          Sean Keefe
Email address:      skeefe@rdx.com
PostgreSQL version: 9.2.8
Operating system:   Redhat 6.4
Description:

The issue we are experiencing is with cascading WAL replication in
Postgres 9.2.8. If the master goes down during a massive transaction and we
promote the first slave, then the next slave looks for a WAL log that never
existed: one named for the new timeline but covering a position before the
split of timelines. Below is how to recreate the issue:

1.    Create M using postgresql.conf_M. Start M.
CREATE TABLE t_test (id int4);

2.    Create S1 from M using postgresql.conf_S1 and recovery.conf_S1 (I used
rsync). Start S1.

3.    Create S2 from M using postgresql.conf_S2 and recovery.conf_S2 (I used
rsync). Start S2.

4.    Insert data into the t_test table on M.
INSERT INTO t_test SELECT * FROM generate_series(1, 250000) ;

5.    Important: Do not shut down M. If you want, you can crash M by killing
its pids. I just let it run and immediately proceeded to the next step. The
idea here is to promote S1 before M transmits the last WAL, which has the
COMMIT of the above INSERT.

6.    Promote S1. S1 will change its timeline.

7.    S2 will not recognize the new timeline of its master S1. PGSTOP S2 and
then PGSTART. S2 will now change its timeline. However, as you can see in the
pg_log, it will wait for a WAL that will never arrive: it looks for WALs
from the previous timeline under the new timeline's file naming format. E.g.,
it will wait for 0000000A00000026000000F1, while that log exists under the
name 0000000900000026000000F1. So it will wait forever, and if you try to
connect to S2 you will see the error "FATAL:  the database system is
starting up".
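
To make the two file names in step 7 easier to compare, here is a minimal
Python sketch. It relies on the WAL segment naming convention: 24 hex
digits, where the first 8 are the timeline ID and the remaining 16 identify
the segment's position in the WAL stream. The helper names (parse_wal_name,
same_position, on_timeline) are hypothetical and exist only for this
illustration; they are not part of PostgreSQL.

```python
import re
from typing import NamedTuple


class WalSegment(NamedTuple):
    timeline: int  # first 8 hex digits of the file name
    log: int       # middle 8 hex digits
    seg: int       # last 8 hex digits


def parse_wal_name(name: str) -> WalSegment:
    """Split a 24-hex-digit WAL segment file name into its three fields."""
    if not re.fullmatch(r"[0-9A-F]{24}", name):
        raise ValueError(f"not a WAL segment file name: {name!r}")
    return WalSegment(
        int(name[0:8], 16), int(name[8:16], 16), int(name[16:24], 16)
    )


def same_position(a: str, b: str) -> bool:
    """True if both names cover the same WAL position, ignoring timeline."""
    pa, pb = parse_wal_name(a), parse_wal_name(b)
    return (pa.log, pa.seg) == (pb.log, pb.seg)


def on_timeline(name: str, timeline: int) -> str:
    """Return the file name for the same WAL position on another timeline."""
    p = parse_wal_name(name)
    return f"{timeline:08X}{p.log:08X}{p.seg:08X}"


if __name__ == "__main__":
    wanted = "0000000A00000026000000F1"    # what S2 waits for
    existing = "0000000900000026000000F1"  # what actually exists
    print(parse_wal_name(wanted).timeline)    # timeline of the awaited file
    print(parse_wal_name(existing).timeline)  # timeline of the existing file
    print(same_position(wanted, existing))
```

With the names from step 7, the awaited file decodes to timeline 0x0A (10)
and the existing file to timeline 9, while both cover the same WAL position
(00000026 / 000000F1). That is the mismatch described above: S2 asks for
the segment under the new timeline's name, but it exists only under the old
timeline's name.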

recovery.conf for S1:
restore_command = '/data/postgres/rep_poc/restore_command.sh %f %p %r'
recovery_end_command = 'rm -f /data/postgres/rep_poc/trigger.cfg'

recovery_target_timeline = 'latest'

recovery.conf for S2:
restore_command = '/data/postgres/rep_poc/restore_command.sh %f %p %r'
recovery_end_command = 'rm -f /data/postgres/rep_poc/trigger.cfg'

recovery_target_timeline = 'latest'

If you need any of the other configuration files, let me know and I can send
them to you.
