Cascading replication and recovery_target_timeline='latest'
From | Heikki Linnakangas |
---|---|
Subject | Cascading replication and recovery_target_timeline='latest' |
Date | |
Msg-id | 50406FD6.8050903@iki.fi |
Responses | Re: Cascading replication and recovery_target_timeline='latest' |
List | pgsql-hackers |
When a cascading standby launches a new walsender, it fetches the current recovery timeline:

    /*
     * Use the recovery target timeline ID during recovery
     */
    if (am_cascading_walsender)
        ThisTimeLineID = GetRecoveryTargetTLI();

The comment in GetRecoveryTargetTLI() says:

    /* RecoveryTargetTLI doesn't change so we need no lock to copy it */
    return XLogCtl->RecoveryTargetTLI;

That comment is not true. RecoveryTargetTLI can change during recovery if you set recovery_target_timeline='latest'. In 'latest' mode, when the (apparent) end of WAL is reached, the archive is scanned for any new timeline history files that may have appeared. If a new timeline is found, RecoveryTargetTLI is updated, and recovery continues on the new timeline.

Aside from the missing locking, I wonder what that does to a cascaded standby. If there is an active walsender running while RecoveryTargetTLI is changed, I think the walsender will continue to stream WAL from the old timeline. Because the startup process is now actually replaying from a different timeline, the walsender will send bogus WAL to the standby.

When a standby ends recovery, creates a new timeline, and switches to normal operation, postmaster terminates all walsenders because of the timeline change. But don't we have a race condition there, with similar effect? It might take a while for a walsender to die, and in that window it might send bogus WAL to the cascaded standby.

- Heikki
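
The hazard described above can be illustrated with a minimal, self-contained C sketch. It is not PostgreSQL code: SharedRecoveryState, SenderState, and every function name below are hypothetical, and a C11 atomic stands in for whatever locking the real shared-memory field would need. The sketch assumes a sender that records the timeline it started streaming on and re-reads the shared value before each send, refusing to send once the two differ.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for shared recovery state (not PostgreSQL's XLogCtl). */
typedef struct {
    _Atomic uint32_t recovery_target_tli;
} SharedRecoveryState;

/* Hypothetical per-sender state. */
typedef struct {
    uint32_t streaming_tli; /* timeline captured when streaming started */
    int      chunks_sent;   /* number of chunks sent so far */
} SenderState;

/* Initialize the shared state with a starting timeline. */
void recovery_state_init(SharedRecoveryState *shared, uint32_t tli)
{
    atomic_init(&shared->recovery_target_tli, tli);
}

/* Read the target timeline atomically instead of assuming it never changes. */
uint32_t get_recovery_target_tli(SharedRecoveryState *shared)
{
    return atomic_load(&shared->recovery_target_tli);
}

/* Models the startup process switching to a newly discovered timeline. */
void set_recovery_target_tli(SharedRecoveryState *shared, uint32_t tli)
{
    atomic_store(&shared->recovery_target_tli, tli);
}

/* Capture the timeline once, as a new sender does at launch. */
void sender_start(SenderState *s, SharedRecoveryState *shared)
{
    s->streaming_tli = get_recovery_target_tli(shared);
    s->chunks_sent = 0;
}

/*
 * Send one chunk only if the timeline is still the one we started on.
 * Returns false when the timeline has changed and the sender must stop.
 */
bool sender_try_send(SenderState *s, SharedRecoveryState *shared)
{
    if (get_recovery_target_tli(shared) != s->streaming_tli)
        return false;
    s->chunks_sent++;
    return true;
}
```

A caller would invoke sender_start() once and then sender_try_send() in a loop, stopping on the first false. Note that this check only narrows the window: the timeline can still change between the comparison and the actual send, so it illustrates the problem rather than closing the race.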