Re: Why we are going to have to go DirectIO
From | Jim Nasby |
---|---|
Subject | Re: Why we are going to have to go DirectIO |
Date | |
Msg-id | 52A4E0F5.1090008@nasby.net |
In reply to | Re: Why we are going to have to go DirectIO (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On 12/5/13 9:59 AM, Tom Lane wrote:
> Greg Stark <stark@mit.edu> writes:
>> I think the way to use mmap would be to mmap very large chunks,
>> possibly whole tables. We would need some way to control page flushes
>> that doesn't involve splitting mappings and can be efficiently
>> controlled without having the kernel storing arbitrarily large tags on
>> page tables or searching through all the page tables to mark pages
>> flushable.
>
> I might be missing something, but AFAICS mmap's API is just fundamentally
> wrong for this. The kernel is allowed to write-back a modified mmap'd
> page to the underlying file at any time, and will do so if say it's under
> memory pressure. You can tell the kernel to sync now, but you can't tell
> it *not* to sync. I suppose you are thinking that some wart could be
> grafted onto that API to reverse that, but I wouldn't have a lot of
> confidence in it. Any VM bug that caused the kernel to sometimes write
> too soon would result in nigh unfindable data consistency hazards.

Something else to ponder... a Seagate researcher gave a talk on upcoming hard drive technology at RICON East this spring. The interesting bit is that 1 or 2 generations down the road HDs will start using "shingling": the write head has to be bigger than the read head, so they're going to set it up so you cannot modify a range of tracks after they've been written. They'll do this by keeping a journal inside the HD. This is somewhat similar to how SSDs work too (you can only erase large pages of data; you can't update individual bytes/sectors/filesystem blocks).

So long-term, random access updates to permanent storage will be less efficient than today. (Of course, non-volatile memory could turn all this on its head...)

--
Jim C. Nasby, Data Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net
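
For readers who want to see the asymmetry Tom describes in the quoted text, here is a minimal sketch in Python (not PostgreSQL code; the function name `demo_mmap_writeback` and the temp file are purely illustrative assumptions). It dirties a page through a shared mapping and then asks for write-back with `flush()`, which is the "sync now" half of the API. There is no counterpart call for "do not write this page yet", which is the half a WAL-before-data rule would need.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def demo_mmap_writeback() -> bytes:
    """Modify a file through a shared mapping, request write-back, read it back."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"\0" * PAGE)        # a file must be non-empty to be mapped
        with mmap.mmap(fd, PAGE) as mm:   # read/write mapping of the whole file
            mm[0:5] = b"dirty"            # the page is now modified in memory
            # From this point the kernel is free to write the page back to the
            # file whenever it chooses (for example under memory pressure).
            # flush() only lets us request write-back *now*; the mmap API
            # offers no call that means "hold this page until I say so".
            mm.flush()
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, 5)             # read the bytes back through the fd
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(demo_mmap_writeback())
```

The sketch only shows the shape of the API; it cannot demonstrate an early write-back, since when that happens is entirely up to the kernel.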