Re: Is pg_control file crashsafe?
| From | Thomas Munro |
|---|---|
| Subject | Re: Is pg_control file crashsafe? |
| Date | |
| Msg-id | CAEepm=0hh_Dvd2Q+fcjYpkVzSoNX2+f167cYu5nwu=qh5HZhJw@mail.gmail.com |
| In reply to | Re: Is pg_control file crashsafe? (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses | Re: Is pg_control file crashsafe? |
| List | pgsql-hackers |
On Thu, May 5, 2016 at 4:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Kapila <amit.kapila16@gmail.com> writes:
>> How about using 512 bytes as a write size and perform direct writes rather
>> than going via OS buffer cache for control file?
>
> Wouldn't that fail outright under a lot of implementations of direct write;
> ie the request needs to be page-aligned, for some not-very-determinate
> value of page size?
>
> To repeat, I'm pretty hesitant to change this logic. While this is not
> the first report we've ever heard of loss of pg_control, I believe I could
> count those reports without running out of fingers on one hand --- and
> that's counting since the last century. It will take quite a lot of
> evidence to convince me that some other implementation will be more
> reliable. If you just come and present a patch to use direct write, or
> rename, or anything else for that matter, I'm going to reject it out of
> hand unless you provide very strong evidence that it's going to be more
> reliable than the current code across all the systems we support.

I'm not sure how those ideas address the reported problem anyway: the
*length* was unexpectedly zero after a crash. UpdateControlFile doesn't
change the length of the control file, since it doesn't specify O_TRUNC
or O_APPEND and it always writes the same size. So it seems like a
pretty weird failure mode affecting filesystem metadata (which I
wouldn't expect to change anyway, but I would expect to be journaled if
it did), not a file-contents-atomicity problem. Whether or not the page
cache is involved in a write to a preallocated file doesn't seem
relevant to a case of unexpected truncation, and the atomic rename trick
doesn't seem relevant either unless someone with expert knowledge of
NTFS could explain how a crash could lead to truncation in the first
place, and how rename would help.

--
Thomas Munro
http://www.enterprisedb.com
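
The point about length can be illustrated with a small sketch. This is not PostgreSQL's UpdateControlFile, only an analogous in-place overwrite in Python; the names `rewrite_in_place` and `demo` and the 8192-byte size are assumptions made for the illustration. It overwrites a preallocated file from offset 0 without O_TRUNC or O_APPEND, writing the same number of bytes, and reports the file length before and after.

```python
import os
import tempfile

# Assumption: an illustrative fixed size, not taken from the PostgreSQL source.
CONTROL_FILE_SIZE = 8192


def rewrite_in_place(path, payload):
    """Overwrite an existing file from offset 0 and return its length afterwards.

    The file is opened without O_TRUNC and without O_APPEND, so the open call
    itself cannot shrink the file, and the write starts at offset 0.
    """
    flags = os.O_WRONLY | getattr(os, "O_BINARY", 0)  # O_BINARY exists only on Windows
    fd = os.open(path, flags)
    try:
        written = 0
        while written < len(payload):
            # os.write may write fewer bytes than requested, so loop until done.
            written += os.write(fd, payload[written:])
        os.fsync(fd)  # ask the OS to flush the data to stable storage
        return os.fstat(fd).st_size
    finally:
        os.close(fd)


def demo():
    """Preallocate a file, rewrite it in place, and return (before, after) lengths."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\x00" * CONTROL_FILE_SIZE)
        os.fsync(fd)
        os.close(fd)
        before = os.path.getsize(path)
        after = rewrite_in_place(path, b"\x01" * CONTROL_FILE_SIZE)
        return before, after
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

In normal operation the two lengths are equal, because a same-size overwrite gives the filesystem no reason to change the file's size. The sketch cannot reproduce a crash, so it says nothing about what NTFS or any other filesystem does to its metadata when power is lost mid-operation.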