Re: [HACKERS] O_DIRECT for WAL writes

From: ITAGAKI Takahiro
Subject: Re: [HACKERS] O_DIRECT for WAL writes
Msg-id: 20050727140214.460E.ITAGAKI.TAKAHIRO@lab.ntt.co.jp
In reply to: Re: [HACKERS] O_DIRECT for WAL writes  (Bruce Momjian <pgman@candle.pha.pa.us>)
Responses: Re: [HACKERS] O_DIRECT for WAL writes  (Bruce Momjian <pgman@candle.pha.pa.us>)
Re: [HACKERS] O_DIRECT for WAL writes  (Bruce Momjian <pgman@candle.pha.pa.us>)
List: pgsql-patches
Thanks for reviewing!
However, the patch does not work on HEAD because of the changes in BootStrapXLOG().
I am sending the patch again with a fix for that.


Bruce Momjian <pgman@candle.pha.pa.us> wrote:

> If you are doing fsync(), I don't see how O_DIRECT
> makes any sense because O_DIRECT is writing to disk on every write, and
> then what is the fsync() actually doing.

It depends on the OS. The Linux man page says,
  http://linux.com.hk/PenguinWeb/manpage.jsp?name=open&section=2
    File I/O is done directly to/from user space buffers. The I/O is
    synchronous, i.e., at the completion of the read(2) or write(2) system
    call, data is **guaranteed to have been transferred**.
But the FreeBSD man page says,
  http://www.manpages.info/freebsd/open.2.html
    O_DIRECT may be used to minimize or eliminate the cache effects of
    reading and writing.  The system will attempt to avoid caching the data
    you read or write.  If it cannot avoid caching the data,
    it will **minimize the impact the data has on the cache**.

As I understand it, completion of a write() with O_DIRECT does not always
guarantee that the data has actually been written to disk. So O_DIRECT+O_SYNC
and O_DIRECT+fsync() may behave differently, though I think that is rare.
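To make the O_DIRECT+fsync() case concrete, here is a minimal sketch in C. It is not taken from the patch: the function name write_block_then_fsync(), the 8192-byte block size, and the 4096-byte buffer alignment are all assumptions chosen for illustration. It writes one aligned block through an O_DIRECT descriptor and then still calls fsync(), since write() alone may not guarantee the data is on disk on every OS. If the filesystem rejects O_DIRECT at open() with EINVAL, the sketch falls back to a normal open.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc only exposes O_DIRECT with this */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0              /* platform without O_DIRECT: flag is a no-op */
#endif

#define BLOCK_SIZE 8192         /* assumed block size, for illustration only */
#define BUF_ALIGN  4096         /* assumed alignment required by O_DIRECT */

/*
 * Hypothetical helper: write one zero-padded block with O_DIRECT, then
 * fsync().  Returns 0 on success, -1 on any failure.
 */
int
write_block_then_fsync(const char *path, const char *data, size_t len)
{
    void   *buf;
    int     fd;
    int     rc = -1;

    if (len > BLOCK_SIZE)
        return -1;

    /* O_DIRECT usually needs an aligned buffer and an aligned length. */
    if (posix_memalign(&buf, BUF_ALIGN, BLOCK_SIZE) != 0)
        return -1;
    memset(buf, 0, BLOCK_SIZE);
    memcpy(buf, data, len);

    fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0600);
    if (fd < 0 && errno == EINVAL)
        fd = open(path, O_WRONLY | O_CREAT, 0600);  /* fs rejects O_DIRECT */

    if (fd >= 0)
    {
        /* The write bypasses (or minimizes) caching; fsync() forces it out. */
        if (write(fd, buf, BLOCK_SIZE) == BLOCK_SIZE && fsync(fd) == 0)
            rc = 0;
        close(fd);
    }

    free(buf);
    return rc;
}
```

Replacing the fsync() call with O_SYNC in the open() flags would give the O_DIRECT+O_SYNC variant being compared above.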


> What I did was to add O_DIRECT unconditionally for all uses of O_SYNC
> and O_DSYNC, so it is automatically used in those cases.  And of course,
> if your operating system doesn't support O_DIRECT, it isn't used.

I agree with your approach of using O_DIRECT automatically.
I bet the combination of O_DIRECT and O_SYNC is always better than
using O_SYNC alone.
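The "add O_DIRECT automatically" behavior can be sketched roughly as follows. This is only an illustration and not the code from the patch: wal_open_flags() and WAL_DIRECT_FLAG are hypothetical names. The idea is that O_DIRECT is OR-ed in whenever a sync flag (O_SYNC or O_DSYNC) is requested, and compiles away to 0 on systems that do not define it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* glibc only exposes O_DIRECT with this */
#endif
#include <fcntl.h>

/* Hypothetical macro: O_DIRECT where available, otherwise nothing. */
#ifdef O_DIRECT
#define WAL_DIRECT_FLAG O_DIRECT
#else
#define WAL_DIRECT_FLAG 0
#endif

/*
 * Hypothetical helper: build open() flags for a WAL file.
 * sync_flag is O_SYNC, O_DSYNC, or 0 (meaning fsync() is called separately).
 */
int
wal_open_flags(int sync_flag)
{
    int     flags = O_RDWR | sync_flag;

    /* Add O_DIRECT only when a synchronous-write flag is in use. */
    if (sync_flag != 0)
        flags |= WAL_DIRECT_FLAG;

    return flags;
}
```

For example, wal_open_flags(O_SYNC) yields O_RDWR | O_SYNC | O_DIRECT where O_DIRECT exists, and plain O_RDWR | O_SYNC elsewhere.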

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories

