pg_upgrade (and recovery) pitfalls
From | PO |
---|---|
Subject | pg_upgrade (and recovery) pitfalls |
Date | |
Msg-id | 20180816154110.Horde.KE3OkcY8--6IOgbaQ37A8gt@nextcloud.pro-open.de |
Replies | Re: pg_upgrade (and recovery) pitfalls |
List | pgsql-general |
Hello -general!

(didn't want to pester -hackers with this, as I don't know if I've been doing something terribly wrong ;-)

Consider the following scenario/setup:
- 4 DB servers in 2 DCs
  - 1 primary (in DC1)
  - 1 sync secondary (in the other DC)
  - 2 async secondaries (distributed over the DCs)
- 1 server running BARMAN for backups/archiving
- Puppet takes care of the PG config, recovery.conf etc.
- ssh connections are only allowed and enabled to and from the BARMAN host/user; no ssh between the DB hosts
- as we switch between servers pretty often, BARMAN provides a single archive for all 4 servers; only the primary does WAL archiving

General requirements are:
- *always* have one sync secondary online (no exceptions)
- best possible availability ==> shortest possible downtime

The approach for major upgrades is to upgrade 2 servers and see if everything works as expected, then
- either follow up with the remaining secondaries, or
- promote one of them and carry on with the old version.

Since we always have 2 spare secondaries to fall back to, pg_upgrade in "--link" mode is a viable approach, as it's nice and fast.

The "naive" idea was to shut down all instances (starting with the primary to enable final syncs), run "pg_upgrade -k" on both the former primary and the former sync secondary, re-link the recovery.conf on the secondary, re-enable the "primary" IP and start both.

D'oh! The secondary is complaining about a different cluster identifier:

    FATAL: database system identifier differs between the primary and standby

(From looking at the code, I could not determine straight away when and how this identifier is generated, but I guess it is somewhere in initdb.c?)

So, as we can't rsync (no ssh...), which would probably finish in a few seconds, a pg_basebackup is due. Which can be a PITA when the DB size is scraping on a TB and you have a single 1GB/sec connection. Bye, bye, availability (remember the primary requirement?).
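For anyone else chasing that FATAL: the "database system identifier" is (as far as I can tell) computed at initdb time from the current timestamp, which is why two independent pg_upgrade runs can never produce matching clusters. It is stored as the very first field of global/pg_control. A minimal sketch of peeking at it directly, assuming a little-endian platform and that the uint64 really sits at offset 0 (pg_controldata is of course the supported tool for this):

```python
import struct

def read_system_identifier(pg_control_path):
    """Read the uint64 database system identifier from the start of
    global/pg_control. Assumes a little-endian platform (x86 etc.);
    'pg_controldata $PGDATA' is the supported way to display it."""
    with open(pg_control_path, "rb") as f:
        (sysid,) = struct.unpack("<Q", f.read(8))
    return sysid
```

Comparing the value on both nodes after separate pg_upgrade runs shows two different identifiers, which is exactly why streaming replication refuses to connect.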
==> So by now, we're only pg_upgrade'ing the primary and follow up with a pg_basebackup to the secondary, planning for much longer downtimes. <==

After finishing the pg_basebackup: re-link the recovery.conf, start.

The recovery finds history files from higher timelines in the archive, starts to recover those (?) and then complains that the timeline doesn't match (don't take the numbers here too seriously, this is from a low-traffic test system; the fork off TL 1 was at least a year ago):

    restored log file "00000002.history" from archive
    restored log file "00000003.history" from archive
    restored log file "00000004.history" from archive
    FATAL: requested timeline 3 is not a child of this server's history
    DETAIL: Latest checkpoint is at 9C/36044D28 on timeline 1, but in the history of the requested timeline, the server forked off from that timeline at 69/88000000.

This mess can probably be cleaned up manually (delete the 000000[234].history etc. on both the secondary and the BARMAN archive); however, to be 100% safe (or when you're inexperienced), you take another basebackup :/

And - after moving the *.history files out of the way in the archive - the secondary finally starts and starts receiving WALs.

Sidenote: this second problem happened to us as well after a colleague promoted a secondary for some R/W tests without first disabling the archive_command in the morning and then re-built it using pg_basebackup in the afternoon. I have to say that it took me quite a while to figure out what was going on there... sudden timeline jumps, presumably out of nowhere?!?

Now, questions/wishes:
- did I do something obvious(ly) wrong?
- why does a recovery, based on a recovery.conf that points to a reachable primary (which obviously communicates its own timeline), still look for higher timelines' history files in the archive and try to jump onto these timelines? This doesn't seem reasonable to me at all...
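To illustrate what that FATAL is checking: a timeline's .history file is plain text, one tab-separated line per fork ("parentTLI <TAB> switchpoint LSN <TAB> reason"). A server on timeline 1 may only follow timeline 3 if, per that history, TL 3 forked off TL 1 at an LSN *after* the server's latest checkpoint - here it forked at 69/88000000, long before checkpoint 9C/36044D28, hence "not a child". A rough sketch of that ancestry test (the file format and the exact comparison are my reading of things, so treat them as assumptions):

```python
def parse_lsn(s):
    """Convert an LSN like '9C/36044D28' into a comparable integer."""
    hi, lo = s.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)

def parse_history(text):
    """Parse a .history file: each non-comment line is
    '<parentTLI>\\t<switchpoint LSN>\\t<reason>'."""
    forks = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        tli, lsn, _reason = line.split("\t", 2)
        forks.append((int(tli), parse_lsn(lsn)))
    return forks

def can_follow(history, my_tli, my_checkpoint_lsn):
    """True if a server on my_tli at my_checkpoint_lsn lies in the
    ancestry the history file describes (roughly what the standby checks)."""
    for parent_tli, switch_lsn in history:
        if parent_tli == my_tli:
            return my_checkpoint_lsn < switch_lsn
    return False

# The situation from the log above: TL 3 forked off TL 1 at 69/88000000,
# but our latest checkpoint on TL 1 is at 9C/36044D28 -> not a child.
hist = parse_history("1\t69/88000000\tno recovery target specified\n")
print(can_follow(hist, 1, parse_lsn("9C/36044D28")))  # False
```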
- is there a way to have pg_upgrade/initdb use a particular "database system identifier" or some kind of "IV" for the new instance, allowing for identical upgrades on primary and secondary? Apart from that number (and the system tables' OIDs), the data directories should be identical, as far as I understand pg_upgrade's functionality? (- and where do I find that identifier?)
- is there a way to have pg_upgrade/initdb initiate the new cluster on the same (or a higher) timeline, to prevent f***ing up the archive? If not, I'd certainly appreciate such an option! (I'm well aware that I need a new basebackup after the upgrade anyway, but alas!)
- any other hints (apart from rsync or other ssh-based methods) on how I could get around the pg_basebackup would be highly appreciated. Something that only clones the system tables and not the bulk of the (identical) data files?
- is my situation (mandatory sync secondary) so unusual that nobody ever thought about the above? ;-)

Cheers & best regards,

Gunnar "Nick" Bluth
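For anyone hitting the same wall: the manual cleanup mentioned above (getting the stale *.history files out of the standby's reach) boils down to quarantining them in the archive rather than deleting them outright, so the move is easy to undo. A hedged sketch, demoed here in a scratch directory - in real life ARCHIVE would be the BARMAN WAL directory for the cluster (the path layout is an assumption), and you'd first confirm with pg_controldata which timeline the rebuilt standby is actually on:

```shell
# Demo in a scratch directory; in reality ARCHIVE would point at the
# BARMAN WAL dir for the cluster (path layout is an assumption).
ARCHIVE=$(mktemp -d)
touch "$ARCHIVE/00000002.history" "$ARCHIVE/00000003.history" "$ARCHIVE/00000004.history"

# Quarantine instead of deleting, so the step is trivially reversible.
mkdir -p "$ARCHIVE/quarantine"
for f in "$ARCHIVE"/0000000[2-9].history; do
    mv "$f" "$ARCHIVE/quarantine/"
done
ls "$ARCHIVE/quarantine"
```

The same files would also need to go on the secondary itself (pg_wal / pg_xlog) before restarting recovery.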