Re: silent data loss with ext4 / all current versions

Поиск
Список
Период
Сортировка
От Michael Paquier
Тема Re: silent data loss with ext4 / all current versions
Дата
Msg-id CAB7nPqRFz_g3YxJX=oWzqiW-tky+yhVPkaCy9GL1pHVPb8ZLKg@mail.gmail.com
обсуждение исходный текст
Ответ на Re: silent data loss with ext4 / all current versions  (Andres Freund <andres@anarazel.de>)
Ответы Re: silent data loss with ext4 / all current versions  (Andres Freund <andres@anarazel.de>)
Список pgsql-hackers
On Sat, Mar 5, 2016 at 7:35 AM, Andres Freund <andres@anarazel.de> wrote:
> On 2016-03-04 14:51:50 +0900, Michael Paquier wrote:
>> On Fri, Mar 4, 2016 at 4:06 AM, Andres Freund <andres@anarazel.de> wrote:
>> > I don't think we want any stat()s here. I'd much, much rather check open
>> > for ENOENT.
>>
>> OK. So you mean more or less that, right?
>> int fd;
>> fd = OpenTransientFile(newfile, PG_BINARY | O_RDONLY, 0);
>> if (fd < 0)
>> {
>>     if (errno != ENOENT)
>>         return -1;
>> }
>> else
>> {
>>     pg_fsync(fd);
>>     CloseTransientFile(fd);
>> }
>
> Yes. Otherwise the check is racy: The file could be gone by the time you
> do the fsync; leading to a spurious ERROR (which often would get
> promoted to a PANIC).

Yeah, that makes sense.

>> >> +/*
>> >> + * link_safe -- make a file hard link, making it on-disk persistent
>> >> + *
>> >> + * This routine ensures that a hard link created on a file persists on system
>> >> + * in case of a crash by using fsync where on the link generated as well as on
>> >> + * its parent directory.
>> >> + */
>> >> +int
>> >> +link_safe(const char *oldfile, const char *newfile)
>> >> +{
>> >
>> > If we go for a new abstraction here, I'd rather make it
>> > 'replace_file_safe' or something, and move the link/rename code #ifdef
>> > into it.
>>
>> Hm. OK. I don't see any reason why switching to link() even in code
>> paths like KeepFileRestoredFromArchive() or pgarch_archiveDone() would
>> be a problem thinking about it. Should HAVE_WORKING_LINK be available
>> on a platform we can combine it with unlink. Is that in line with what
>> you think?
>
> I wasn't trying to suggest we should replace all rename codepaths with
> the link wrapper, just the ones that already have a HAVE_WORKING_LINK
> check. The name of the routine I suggested is bad though...

So we'd introduce a first routine rename_or_link_safe(), say replace_safe().

>> Do you suggest to correct this comment to remove the mention to the
>> old file's parent directory because we just care about having the new
>> file as being persistent?
>
> That's one approach, yes. Combined with the fact that you can't actually
> reliably rename across directories, the two could be on different
> filesystems after all, that'd be a suitable defense. It just needs to be
> properly documented in the function header, not at the bottom.

OK. Got it. Or the two could be on the same filesystem. Still, link()
and rename() do not support doing their stuff on different filesystems
(EXDEV).
-- 
Michael



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Andres Freund
Дата:
Сообщение: Re: silent data loss with ext4 / all current versions
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: silent data loss with ext4 / all current versions