Re: [BUG] pg_basebackup from disconnected standby fails

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: [BUG] pg_basebackup from disconnected standby fails
Дата
Msg-id CAB7nPqTv5gmKQcNDoFGTGqoqXz2xLz4RRw247oqOJzZTVy6-7Q@mail.gmail.com
обсуждение исходный текст
Ответ на [BUG] pg_basebackup from disconnected standby fails  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Ответы Re: [BUG] pg_basebackup from disconnected standby fails  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Список pgsql-hackers
On Thu, Jun 9, 2016 at 9:55 PM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> Hello, I found that pg_basebackup from a replication standby
> fails after the following steps, on 9.3 and the master.
>
> - start a replication master
> - start a replication standby
> - stop the master in the mode other than immediate.
>
> pg_basebackup to the standby will fail with the following error.
>
>> pg_basebackup: could not get transaction log end position from server:
>> ERROR:  could not find any WAL files

Indeed, and you could just do the following to reproduce the failure
with the recovery test suite, so I would suggest adding this test in
the patch:
--- a/src/test/recovery/t/001_stream_rep.pl
+++ b/src/test/recovery/t/001_stream_rep.pl
@@ -24,6 +24,11 @@ $node_standby_1->start;
 # pg_basebackup works on a standby).
 $node_standby_1->backup($backup_name);

+# Take a second backup of the standby while the master is offline.
+$node_master->stop;
+$node_standby_1->backup('my_backup_2');
+$node_master->start;
+

> After looking more closely, I found that the minRecoveryPoint
> tends to be too small as the backup end point, and up to the
> record at the lastReplayedRecPtr can affect the pages on disk and
> they can go into the backup just taken.
>
> My conclusion here is that do_pg_stop_backup should return
> lastReplayedRecPtr, not minRecoveryPoint.

I have been thinking quite a bit about this patch, and this logic
sounds quite right to me. When stopping the backup we need to let the
user know up to which point it needs to replay WAL, and relation pages
are touched up to lastReplayedEndRecPtr. This LSN could be greater
than the minimum recovery point as there is no record to mark the end
of the backup, and pg_control has normally, well surely, being backup
up last but that's not a new problem it would as well have been backup
up before the minimum recovery point has been reached...

Still, perhaps we could improve the documentation regarding that? Say
it is recommended to enforce the minimum recovery point in pg_control
to the stop backup LSN to ensure that the backup recovers to a
consistent state when taking a backup from a standby.

I am attaching an updated patch with the test btw.
--
Michael

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Sridhar N Bamandlapally
Дата:
Сообщение: Online DW
Следующее
От: Masahiko Sawada
Дата:
Сообщение: Re: Reviewing freeze map code