[BUG] pg_basebackup from disconnected standby fails
| From | Kyotaro HORIGUCHI |
|---|---|
| Subject | [BUG] pg_basebackup from disconnected standby fails |
| Date | |
| Msg-id | 20160609.215558.118976703.horiguchi.kyotaro@lab.ntt.co.jp |
| Responses | Re: [BUG] pg_basebackup from disconnected standby fails |
| List | pgsql-hackers |
Hello,

I found that pg_basebackup from a replication standby fails after the following steps, on 9.3 and the master.

- start a replication master
- start a replication standby
- stop the master in a mode other than immediate

pg_basebackup against the standby then fails with the following error.

> pg_basebackup: could not get transaction log end position from
> server: ERROR: could not find any WAL files

The immediate cause is that do_pg_stop_backup returns an earlier LSN than do_pg_start_backup. The backup start point is the redo point of the last executed restart point, and the backup end point is the minRecoveryPoint at the time of the call. A restart point doesn't update the minRecoveryPoint when it is actually executed, even though ControlFile->checkPointCopy is updated to the redo point of the restart point just made. In this situation the minRecoveryPoint is too small to serve as the backup end point. That is, the end point can fall behind the start point.

This can be caused by the simple steps above, but it can also occur when pg_basebackup connects after the master has disconnected during a restart point (given some other timing-dependent conditions).

So the following comment in do_pg_stop_backup seems somewhat wrong.

> * We return the current minimum recovery point as the backup end
> * location. Note that it can be greater than the exact backup end
> * location if the minimum recovery point is updated after the backup of
> * pg_control. This is harmless for current uses.

After looking more closely, I found that the minRecoveryPoint tends to be too small as the backup end point: records up to lastReplayedRecPtr can affect the pages on disk, and those pages can go into the backup just taken.

My conclusion here is that do_pg_stop_backup should return lastReplayedRecPtr, not minRecoveryPoint. The attached small patch does this on the master. It fixes the first problem for me.

Any thoughts?

# Sorry, but I'll be offline 'til Monday.
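To make the ordering problem concrete, here is a minimal toy model in C. It is only a sketch under stated assumptions and is not PostgreSQL source: the `StandbyModel` struct and every `model_*` function are hypothetical names invented for illustration, and the model deliberately ignores the fact that flushing dirty pages can advance minRecoveryPoint. It shows only that a start point taken from the restart point's redo location can exceed an end point taken from minRecoveryPoint, while an end point taken from the last replayed position cannot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;    /* toy stand-in for a WAL position (LSN) */

/* Hypothetical, simplified view of the standby state discussed above. */
typedef struct StandbyModel
{
    XLogRecPtr checkpoint_redo;     /* redo point of the last restart point */
    XLogRecPtr min_recovery_point;  /* minRecoveryPoint as kept in pg_control */
    XLogRecPtr last_replayed;       /* end of the last replayed WAL record */
} StandbyModel;

/*
 * Replaying WAL advances only the replay position in this model.
 * Assumption: no page flush happens, so min_recovery_point stays put.
 */
static void
model_replay_to(StandbyModel *s, XLogRecPtr lsn)
{
    assert(lsn >= s->last_replayed);
    s->last_replayed = lsn;
}

/* A restart point moves the redo point but leaves min_recovery_point alone. */
static void
model_restartpoint(StandbyModel *s, XLogRecPtr redo)
{
    assert(redo <= s->last_replayed);
    s->checkpoint_redo = redo;
}

/* Backup start point: redo point of the last executed restart point. */
static XLogRecPtr
model_backup_start(const StandbyModel *s)
{
    return s->checkpoint_redo;
}

/* Backup end point as described for the current behavior. */
static XLogRecPtr
model_backup_stop_current(const StandbyModel *s)
{
    return s->min_recovery_point;
}

/* Backup end point as proposed in this message. */
static XLogRecPtr
model_backup_stop_proposed(const StandbyModel *s)
{
    return s->last_replayed;
}

/* A backup needs its end point at or after its start point. */
static bool
model_range_is_valid(XLogRecPtr start, XLogRecPtr stop)
{
    return start <= stop;
}
```

In this model, starting from a state where all three positions are equal, replaying further and then running a restart point leaves `model_backup_stop_current` behind `model_backup_start`, whereas `model_backup_stop_proposed` stays at or ahead of it because a restart point's redo location never exceeds the replayed position.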
regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center