Re: BUG #14999: pg_rewind corrupts control file global/pg_control
От | TipTop Labs |
---|---|
Тема | Re: BUG #14999: pg_rewind corrupts control file global/pg_control |
Дата | |
Msg-id | 0D104D1E-6103-49F4-8AB8-B5CD92949E18@tiptop-labs.com обсуждение исходный текст |
Ответ на | Re: BUG #14999: pg_rewind corrupts control file global/pg_control (Michael Paquier <michael.paquier@gmail.com>) |
Ответы |
Re: BUG #14999: pg_rewind corrupts control file global/pg_control
|
Список | pgsql-bugs |
> On Jan 5, 2018, at 2:26 AM, Michael Paquier <michael.paquier@gmail.com> wrote: > > On Fri, Jan 5, 2018 at 5:06 AM, PG Bug reporting form > <noreply@postgresql.org> wrote: >> I have encountered a bug in PostgreSQL 10.1: when the target directory for >> pg_rewind contains a read-only file (e.g. server.key), pg_rewind exits with >> "could not open target file" (legitimate) and corrupts the control file >> global/pg_control to size 0 (bug). From now on, pg_rewind always exits with >> "unexpected control file size 0, expected 8192" and a restore from >> pg_basebackup is needed. > > Likely that's reproducible down to 9.5 where pg_rewind has been > introduced. I agree that we should do better with failure handling > here. Corrupting the control file is not cool. I can already confirm that this also occurs with PostgreSQL 9.6. >> A patch for branch REL_10_STABLE of repository >> https://github.com/postgres/postgres, a README, and Dockerfiles for >> demonstrating both bug and patch are available from >> https://github.com/tiptop-labs/postgres-patches . > > You may not know when github.com is gone, which would cause the data > you are attaching here to go away. What looks interesting is > pg_rewind.patch, which is what you are proposing as a fix, right? I am > attaching it here for PG archives. Yes, it's pg_rewind.patch. There is a bit of rationale in README, feel free to also attach here if/as needed. > > - open_target_file(filename, false); > + /* Trunc target file for action FILE_ACTION_COPY. */ > + open_target_file(filename, chunkoff == 0); > > write_target_range(chunk, chunkoff, chunksize); > Hm. Let me think more about that as there are quite a few > distributions that link to SSL files that cannot be written, and > pg_rewind copies all configuration files in full. Could you please identify a specific distribution with such properties; I'd like to have a look there too. Thanks. > - "FROM fetchchunks\n"; > + "FROM fetchchunks\n" > + "ORDER BY path, begin\n"; > If this is aimed at improving the performance of pg_rewind by making > chunk writes more sequential, it should be a different patch. I would > see value in that. If this is aimed to change the order of the files > to avoid the write corruption, this is wrong. It may be wrong then; from README: Truncation of the file was moved to the second loop. Truncation occurs there if chunksare written into files at offset 0. This is the case for FILE_ACTION_COPY. An additional SQL "ORDER BY path, begin"ensures that these chunks are processed first. > I would imagine that you could see something similar with the offline > mode, haven't tested yet though. > -- > Michael > <pg_rewind.patch> -- Christian
В списке pgsql-bugs по дате отправления: