Re: checkpoint writeback via sync_file_range
From: Andres Freund
Subject: Re: checkpoint writeback via sync_file_range
Date:
Msg-id: 201201111351.38738.andres@anarazel.de
In reply to: Re: checkpoint writeback via sync_file_range (Florian Weimer <fweimer@bfk.de>)
List: pgsql-hackers
On Wednesday, January 11, 2012 10:33:47 AM Florian Weimer wrote:
> * Greg Smith:
> > One idea I was thinking about here was building a little hash table
> > inside of the fsync absorb code, tracking how many absorb operations
> > have happened for whatever the most popular relation files are. The
> > idea is that we might say "use sync_file_range every time <N> calls
> > for a relation have come in", just to keep from ever accumulating too
> > many writes to any one file before trying to nudge some of it out of
> > there. The bat that keeps hitting me in the head here is that right
> > now, a single fsync might have a full 1GB of writes to flush out,
> > perhaps because it extended a table and then wrote more than that to
> > it. And in everything but a SSD or giant SAN cache situation, 1GB of
> > I/O is just too much to fsync at a time without the OS choking a
> > little on it.
>
> Isn't this pretty much like tuning vm.dirty_bytes? We generally set it
> to pretty low values, and that seems to help smooth out the checkpoints.

If done correctly (in a far more invasive way), you could issue
sync_file_range calls only for the areas of the file that the checkpoint
actually needs to write, and leave out e.g. hint-bit-only changes. That
could help reduce the cost of checkpoints.

Andres