Re: Load distributed checkpoint
From | Tom Lane |
---|---|
Subject | Re: Load distributed checkpoint |
Date | |
Msg-id | 25553.1165555298@sss.pgh.pa.us |
In response to | Re: Load distributed checkpoint (Greg Smith <gsmith@gregsmith.com>) |
List | pgsql-hackers |
Greg Smith <gsmith@gregsmith.com> writes:
> On Fri, 8 Dec 2006, Takayuki Tsunakawa wrote:
>> Though I'm not sure, isn't it the key to use O_SYNC so that write()s
>> transfer data to disk?

> If disk writes near checkpoint time aren't happening fast enough now, I
> doubt forcing a sync after every write will make that better.

I think the idea would be to force the writes to actually occur, rather
than just being scheduled (and then forced en-masse by an fsync at
checkpoint time). Since the point of the bgwriter is to try to force
writes to occur *outside* checkpoint times, this seems to make sense.
I share your doubts about the value of slowing down checkpoints --- but
to the extent that bgwriter-issued writes are delayed by the kernel
until the next checkpoint, we are certainly not getting the desired
effect of leveling the write load.

>> To decrease the count of I/O, pages adjacent on disk that
>> are also adjacent on memory must be written with one write().

> Sorting out which pages are next to one another on disk is one of the jobs
> the file system cache does; bypassing it will then make all that
> complicated sorting logic the job of the database engine.

Indeed --- the knowledge that we don't know the physical layout has
always been the strongest argument against using O_SYNC in this way.
But I don't think anyone's made any serious tests. A properly tuned
bgwriter should be eating only a "background" level of I/O effort
between checkpoints, so maybe it doesn't matter too much if it's not
optimally scheduled.

regards, tom lane
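
For readers unfamiliar with the two write strategies being compared in this thread, the following is a rough sketch and is not taken from the original message or from PostgreSQL's source. It assumes a Unix-like system where `os.O_SYNC` is available. The function names, the `demo` helper, and the 8192-byte page size are illustrative assumptions. The first function mirrors "write now, fsync later", where the kernel may hold the data until the sync. The second opens the file with `O_SYNC`, so each `write()` is expected to return only once the data has been pushed to storage.

```python
import os
import tempfile

PAGE_SIZE = 8192  # assumed page size, for illustration only


def write_deferred(path, pages):
    """write() every page, then force them all out with one fsync()."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for page in pages:
            os.write(fd, page)  # may only reach the kernel's cache
        os.fsync(fd)            # everything still pending is forced here, en masse
    finally:
        os.close(fd)


def write_osync(path, pages):
    """Open with O_SYNC so that each write() is synchronous."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o600)
    try:
        for page in pages:
            os.write(fd, page)  # the I/O happens now, not at the next sync
    finally:
        os.close(fd)


def demo():
    """Write the same four pages both ways and return the resulting file sizes."""
    pages = [bytes([i]) * PAGE_SIZE for i in range(4)]
    sizes = []
    with tempfile.TemporaryDirectory() as tmp:
        for name, writer in (("deferred", write_deferred), ("osync", write_osync)):
            path = os.path.join(tmp, name)
            writer(path, pages)
            sizes.append(os.path.getsize(path))
    return sizes
```

Both functions leave identical files behind; what differs is when the physical I/O is issued. With `O_SYNC` the process issues one synchronous write per page in whatever order it chooses, which is the scheduling concern raised in the message, since the process does not know the physical layout on disk.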