Re: O_DIRECT in freebsd
From | Manfred Spraul |
---|---|
Subject | Re: O_DIRECT in freebsd |
Date | |
Msg-id | 3FA14855.5020103@colorfullife.com |
In response to | Re: O_DIRECT in freebsd (Greg Stark <gsstark@mit.edu>) |
List | pgsql-hackers |
Greg Stark wrote:

>Manfred Spraul <manfred@colorfullife.com> writes:
>
>>One problem for WAL is that O_DIRECT would disable the write cache -
>>each operation would block until the data arrived on disk, and that might block
>>other backends that try to access WALWriteLock.
>>Perhaps a dedicated backend that does the writeback could fix that.
>
>aio seems a better fit.
>
>>Has anyone tried to use posix_fadvise for the wal logs?
>>http://www.opengroup.org/onlinepubs/007904975/functions/posix_fadvise.html
>>
>>Linux supports posix_fadvise, it seems to be part of xopen2k.
>
>Odd, I don't see it anywhere in the kernel. I don't know what syscall it's
>using to do this tweaking.

At least in 2.6: linux/mm/fadvise.c, the syscall is fadvise64 or 64_64

>This is the only option that seems useful for postgres for both the WAL and
>vacuum (though in other threads it seems the problems with vacuum lie
>elsewhere):
>
>  POSIX_FADV_DONTNEED attempts to free cached pages associated with the
>  specified region. This is useful, for example, while streaming large
>  files. A program may periodically request the kernel to free cached
>  data that has already been used, so that more useful cached pages are
>  not discarded instead.
>
>  Pages that have not yet been written out will be unaffected, so if the
>  application wishes to guarantee that pages will be released, it should
>  call fsync or fdatasync first.

I agree. Either immediately after each flush syscall, or just before closing a log file and switching to the next.

>Perhaps POSIX_FADV_RANDOM and POSIX_FADV_SEQUENTIAL could be useful in a
>backend before starting a sequential scan or index scan, but I kind of doubt
>it.

IIRC the recommendation is ~20% total memory for the postgres user space buffers. That's quite a lot - it might be sufficient to protect that cache from vacuum or sequential scans.
AddBufferToFreeList already contains a comment that this is the right place to try buffer replacement strategies.

--
    Manfred
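
Editor's sketch (not part of the original message): the flush-then-drop idea discussed above could look roughly like the following on a system that provides posix_fadvise. The helper name flush_and_drop_cache is hypothetical and is not PostgreSQL code; the feature-test macro is an assumption about a glibc-style libc.

```c
/* Assumption: a glibc-style libc that exposes posix_fadvise and fdatasync
 * when this feature-test macro is set. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: flush a log file that has been fully written, then
 * hint to the kernel that its cached pages are no longer needed.
 * Returns 0 on success or an errno-style error number on failure.
 */
int
flush_and_drop_cache(int fd)
{
    /* Pages not yet written out are unaffected by DONTNEED, so flush first. */
    if (fdatasync(fd) != 0)
        return errno;

    /* offset 0 with len 0 covers the file from its start to its end.
     * posix_fadvise returns the error number directly; it does not set errno. */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
}
```

Such a helper could be called either right after each flush or once just before a log file is closed and the next one is opened. The advice is only a hint, so the kernel is free to keep some pages cached.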