Re: Duplicate history file?
From | Tatsuro Yamada |
---|---|
Subject | Re: Duplicate history file? |
Date | |
Msg-id | 9bd1cc76-5fb8-6954-dce2-ab8ca56642ef@nttcom.co.jp_1 |
In reply to | Duplicate history file? (Kyotaro Horiguchi <horikyota.ntt@gmail.com>) |
Responses |
Re: Duplicate history file?
|
List | pgsql-hackers |
Hi Horiguchi-san,

On 2021/05/31 16:58, Kyotaro Horiguchi wrote:
> So, I started a thread for this topic diverged from the following
> thread.
>
> https://www.postgresql.org/message-id/4698027d-5c0d-098f-9a8e-8cf09e36a555@nttcom.co.jp_1
>
>> So, what should we do for the user? I think we should put some notes
>> in postgresql.conf or in the documentation. For example, something
>> like this:
>
> I'm not sure about the exact configuration you have in mind, but that
> would happen on the cascaded standby in the case where the upstream
> promotes. In this case, the history file for the new timeline is
> archived twice. walreceiver triggers archiving of the new history
> file at the time of the promotion, then startup does the same when it
> restores the file from archive. Is it what you complained about?

Thank you for creating a new thread and explaining this.
We are not using cascading replication in our environment, but I think
the situation is similar. In short, when I promote, the
archive_command fails because of the history file.

I've created a reproduction script that includes setting up replication,
and I'll share it with you. (I used Robert's test.sh as a reference for
creating the reproduction script. Thanks)

The scenario (sr_test_historyfile.sh) is as follows.

 #1 Start pgprimary as a main
 #2 Create standby
 #3 Start pgstandby as a standby
 #4 Execute archive_command
 #5 Shutdown pgprimary
 #6 Start pgprimary as a standby
 #7 Promote pgprimary
 #8 Execute archive_command again, which fails because a duplicate
    history file exists (see pgstandby.log)

Note that this may not be appropriate if you consider it as a recovery
procedure for a replication configuration. However, I'm sharing it as it
is because this seems to be the procedure used in the customer's
environment (PG-REX).

> The same workaround using the alternative archive script works for the
> case.
>
> We could check pg_wal before fetching archive, however, archiving is
> not controlled so strictly that duplicate archiving never happens and
> I think we choose possible duplicate archiving than having holes in
> archive. (so we suggest the "test ! -f" script)
>
>> ====
>> Note: If you use archive_mode=always, the archive_command on the
>> standby side should not be used "test ! -f".
>> ====
>
> It could be one workaround. However, I would suggest not to overwrite
> existing files (with a file with different content) to protect archive
> from corruption.
>
> We might need to write that in the documentation...

I think you're right: replacing it with an alternative archive script
that includes the cmp command will resolve the error. I say this because
I checked with the diff command that the history files are identical.

=====
$ diff -s pgprimary/arc/00000002.history pgstandby/arc/00000002.history
Files pgprimary/arc/00000002.history and pgstandby/arc/00000002.history are identical
=====

Regarding "test ! -f", I am wondering how many people are using the test
command for archive_command. If I remember correctly, the guide provided
by NTT OSS Center that we are using does not recommend using the test
command.

Regards,
Tatsuro Yamada
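
For illustration, an alternative archive script of the kind discussed
above (one that uses cmp) could look roughly like the sketch below. It is
not the script from this thread: the function name archive_wal, the
".tmp" suffix, and the way it is wired into archive_command are all
assumptions made for the example. The idea is to report success when the
file is already archived with identical content, and to fail without
overwriting when the content differs.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the thread's attachments.
# Assumed wiring: archive_command = '/path/to/this_script.sh %p /archive/dir/%f'

archive_wal() {
    src=$1   # path of the file to archive (%p)
    dst=$2   # destination path inside the archive directory

    if [ -f "$dst" ]; then
        # Already archived with the same content: treat as success so
        # that a duplicate archive request does not keep failing.
        if cmp -s "$src" "$dst"; then
            return 0
        fi
        # Same name but different content: refuse to overwrite, to
        # protect the archive from corruption.
        echo "archive_wal: $dst exists with different content" >&2
        return 1
    fi

    # Copy under a temporary name, then rename, so that a partially
    # written file never appears under the final name.
    cp "$src" "$dst.tmp" && mv "$dst.tmp" "$dst"
}

# When run as a script with two arguments, archive that one file.
if [ "$#" -eq 2 ]; then
    archive_wal "$1" "$2"
fi
```

With a script like this, step #8 of the scenario would be expected to
succeed as long as the two 00000002.history files are identical, which
is what the diff output above shows.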