Re: Experimental patch for inter-page delay in VACUUM
From | Andrew Dunstan |
---|---|
Subject | Re: Experimental patch for inter-page delay in VACUUM |
Date | |
Msg-id | 3FA7CAE6.1040402@dunslane.net |
In reply to | Re: Experimental patch for inter-page delay in VACUUM (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Experimental patch for inter-page delay in VACUUM |
List | pgsql-hackers |
Tom Lane wrote:

>Jan Wieck <JanWieck@Yahoo.com> writes:
>
>>What still needs to be addressed is the IO storm caused by checkpoints. I
>>see it much relaxed when stretching out the BufferSync() over most of
>>the time until the next one should occur. But the kernel sync at its
>>end still pushes the system hard against the wall.
>
>I have never been happy with the fact that we use sync(2) at all. Quite
>aside from the "I/O storm" issue, sync() is really an unsafe way to do a
>checkpoint, because there is no way to be certain when it is done. And
>on top of that, it does too much, because it forces syncing of files
>unrelated to Postgres.
>
>I would like to see us go over to fsync, or some other technique that
>gives more certainty about when the write has occurred. There might be
>some scope that way to allow stretching out the I/O, too.
>
>The main problem with this is knowing which files need to be fsync'd.
>The only idea I have come up with is to move all buffer write operations
>into a background writer process, which could easily keep track of
>every file it's written into since the last checkpoint. This could cause
>problems though if a backend wants to acquire a free buffer and there's
>none to be had --- do we want it to wait for the background process to
>do something? We could possibly say that backends may write dirty
>buffers for themselves, but only if they fsync them immediately. As
>long as this path is seldom taken, the extra fsyncs shouldn't be a big
>performance problem.
>
>Actually, once you build it this way, you could make all writes
>synchronous (open the files O_SYNC) so that there is never any need for
>explicit fsync at checkpoint time. The background writer process would
>be the one incurring the wait in most cases, and that's just fine. In
>this way you could directly control the rate at which writes are issued,
>and there's no I/O storm at all. (fsync could still cause an I/O storm
>if there's lots of pending writes in a single file.)

Or maybe fdatasync() would be slightly more efficient - do we care about
flushing metadata that much?

cheers

andrew
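
For readers following the thread, here is a rough sketch of the idea under discussion: a single writer remembers which files it has written since the last checkpoint, then flushes only those, using either fsync() or fdatasync(). It is an illustration only and not PostgreSQL code. The names track_written_fd, checkpoint_flush and MAX_TRACKED are hypothetical, and fdatasync() is assumed to be available (it is POSIX, but not every platform provides it).

```c
/* Hypothetical sketch: these names are illustrative, not PostgreSQL's API. */
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define MAX_TRACKED 64              /* arbitrary cap chosen for the sketch */

static int tracked_fds[MAX_TRACKED];
static int n_tracked = 0;

/* Remember that fd has been written to since the last checkpoint.
 * Returns 0 on success, -1 if the table is full. */
static int track_written_fd(int fd)
{
    assert(fd >= 0);
    for (int i = 0; i < n_tracked; i++)
        if (tracked_fds[i] == fd)
            return 0;               /* already tracked, nothing to do */
    if (n_tracked == MAX_TRACKED)
        return -1;                  /* caller would need to flush first */
    tracked_fds[n_tracked++] = fd;
    return 0;
}

/* Flush every tracked file and reset the table.
 * data_only != 0 uses fdatasync(), which may skip metadata that is not
 * needed to read the data back; otherwise fsync() is used.
 * Returns the number of files whose flush failed. */
static int checkpoint_flush(int data_only)
{
    int failures = 0;
    for (int i = 0; i < n_tracked; i++) {
        int rc = data_only ? fdatasync(tracked_fds[i])
                           : fsync(tracked_fds[i]);
        if (rc != 0) {
            perror("checkpoint_flush");
            failures++;
        }
    }
    n_tracked = 0;
    return failures;
}
```

A caller would invoke track_written_fd(fd) after each write() and checkpoint_flush(1) at checkpoint time. Unlike sync(2), each call returns only after the flush of that one file has finished or failed, so the caller knows when the checkpoint is actually done and no unrelated files are touched.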