hot backups: am I doing it wrong, or do we have a problem with pg_clog?
From | Daniel Farina |
---|---|
Subject | hot backups: am I doing it wrong, or do we have a problem with pg_clog? |
Date | |
Msg-id | BANLkTi=j-k3QOFKpjxUG5m0FtihANz3tOw@mail.gmail.com |
Responses | Re: hot backups: am I doing it wrong, or do we have a problem with pg_clog? |
 | Re: hot backups: am I doing it wrong, or do we have a problem with pg_clog? |
 | Re: hot backups: am I doing it wrong, or do we have a problem with pg_clog? |
List | pgsql-hackers |
To start at the end of this story: "DETAIL: Could not read from file "pg_clog/007D" at offset 65536: Success."

This is a message we received on a standby that we were bringing online as part of a test. The clog file was present, but apparently too small for Postgres (or at least I think that is what the message meant), so one could stub in another clog file and then continue recovery successfully (modulo the voodoo of stubbing in clog files in general).

I am unsure whether this is due to an interesting race condition in Postgres or a result of my somewhat-interesting hot-backup protocol, which is slightly more involved than the norm. I will describe what it does here:

1) Call pg_start_backup.

2) Crawl the entire Postgres cluster directory structure, except pg_xlog, taking note of the size of every file present.

3) Begin writing tar files, but *only up to the size noted during the original crawl of the cluster directory,* so if a file grows between the original snapshot and the subsequent read() of the file, those extra bytes are not added to the tar.

3a) If a file has been partially truncated, I add "\0" bytes to pad the tarfile member up to the size sampled in step 2, because I am streaming the tar file and cannot go back in the stream to adjust the tarfile member's size.

4) Call pg_stop_backup.

The reason I go to this trouble is that I use many completely disjoint tar files to do parallel compression, decompression, uploading, and downloading of the base backup of the database, and I want to be able to control the size of these files up-front. The need to stub in \0 bytes comes from a limitation of the tar format when dealing with streaming archives, and the need to truncate files to the size snapshotted in step 2 is to allow splitting the files between volumes even in the presence of possible concurrent growth while I'm performing the hot backup (ex: a handful of nearly-empty heap files can rapidly grow due to a concurrent bulk load if I get unlucky, which I do not intend to allow myself to be).

Any ideas? Or does it sound like I'm making some bookkeeping errors and should review my code again? It does work most of the time. I have not gotten a sense of how often this reproduces just yet.

--
fdr
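For concreteness, here is a minimal sketch of the four-step protocol above. It assumes a server old enough to have pg_start_backup/pg_stop_backup (pre-15 naming), uses psycopg2 for the control connection, and reads each capped member into memory; the function names are made up for illustration, and the real tooling streams in chunks and splits the work across many disjoint tar files rather than writing a single archive.

```python
import io
import os
import tarfile

import psycopg2  # assumption: the backup driver talks to the cluster over libpq


def snapshot_sizes(cluster_dir):
    """Step 2: walk the cluster, noting every file's size (pg_xlog excluded)."""
    sizes = {}
    for root, dirs, files in os.walk(cluster_dir):
        dirs[:] = [d for d in dirs if d != 'pg_xlog']
        for name in files:
            path = os.path.join(root, name)
            sizes[path] = os.path.getsize(path)
    return sizes


def add_capped_member(tar, path, arcname, snapped_size):
    """Steps 3/3a: emit exactly snapped_size bytes for this member.

    Files that grew since the snapshot are truncated to the noted size;
    files that shrank are padded with NUL bytes, since a streaming tar
    cannot seek back to rewrite the member header.
    """
    with open(path, 'rb') as f:
        data = f.read(snapped_size)
    data += b'\0' * (snapped_size - len(data))
    info = tarfile.TarInfo(name=arcname)
    info.size = snapped_size
    tar.addfile(info, io.BytesIO(data))


def base_backup(cluster_dir, out_path, conn):
    cur = conn.cursor()
    cur.execute("SELECT pg_start_backup('sketch', true)")     # step 1
    try:
        sizes = snapshot_sizes(cluster_dir)                    # step 2
        with tarfile.open(out_path, mode='w|') as tar:         # streaming tar
            for path, size in sorted(sizes.items()):
                if not os.path.exists(path):
                    continue  # temp/unlinked files can vanish after step 2
                arcname = os.path.relpath(path, cluster_dir)
                add_capped_member(tar, path, arcname, size)    # steps 3 and 3a
    finally:
        cur.execute("SELECT pg_stop_backup()")                 # step 4
```

The point of interest is that info.size is fixed before any member data is written, which is exactly what forces the truncate-or-pad behaviour while relation files keep changing underneath the hot backup.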
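And, purely to illustrate the "stub in another clog file" voodoo mentioned at the top: a pg_clog (SLRU) segment is 32 pages of 8 kB, i.e. 256 kB, and zero bytes there read as "transaction in progress", so the workaround amounts to zero-extending the short segment. A hypothetical sketch, for a throwaway copy of the cluster only:

```python
BLCKSZ = 8192
SLRU_PAGES_PER_SEGMENT = 32
SEGMENT_SIZE = BLCKSZ * SLRU_PAGES_PER_SEGMENT  # 256 kB per pg_clog segment


def stub_clog_segment(path):
    """Zero-extend (or create) a pg_clog segment to its full 256 kB size.

    Existing bytes are left untouched; the added tail is zero-filled,
    which the server interprets as "transaction in progress".
    """
    with open(path, 'ab') as f:   # 'ab' creates the file but keeps its contents
        f.truncate(SEGMENT_SIZE)  # on POSIX, truncate() extends with zero bytes
```

For the case quoted above this would be something like stub_clog_segment('pg_clog/007D'), run from the standby's data directory before retrying recovery.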