Re: Spread checkpoint sync
From | Greg Smith |
---|---|
Subject | Re: Spread checkpoint sync |
Date | |
Msg-id | 4CFB29A3.1060002@2ndquadrant.com |
In reply to | Re: Spread checkpoint sync (Greg Stark <gsstark@mit.edu>) |
List | pgsql-hackers |
Greg Stark wrote:
> Using sync_file_range you can specify the set of blocks to sync and
> then block on them only after some time has passed. But there's no
> documentation on how this relates to the I/O scheduler so it's not
> clear it would have any effect on the problem.

I believe this is the exact spot where we're stalled on getting this improved on the Linux side, as I understand it at least. *The* answer for this class of problem on Linux is to use sync_file_range, and I don't think we'll ever get any sympathy from the kernel developers until we do. But that's a Linux-specific call, so using it means adding a fork in the write path, with platform-specific code, to the database. If I thought sync_file_range was a silver bullet guaranteed to make this better, maybe I'd go for that. I think there's some relatively low-hanging fruit on the database side that would do better before going to that extreme though, thus the patch.

> We might still have to delay the beginning of the sync to allow the dirty blocks to be synced
> naturally and then when we issue it still end up catching a lot of
> other i/o as well.

Whether it's "lots" or not is really workload dependent. I work from the assumption that the blocks being written out by the checkpoint are the most popular ones in the database, the ones that accumulate a high usage count and stay there. If that's true, my guess is that the writes being done while the checkpoint is executing are a bit less likely to be touching the same files. You raise a valid concern; I just haven't seen that actually happen in practice yet.

--
Greg Smith   2ndQuadrant US   greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support   www.2ndQuadrant.us
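
For readers who haven't used the call, here is a rough sketch of the two-step pattern Greg Stark describes: start writeback for a range, let some time pass, then block on that range. It is only an illustration, not the patch under discussion and not PostgreSQL code. The helper names (start_range_writeback, wait_range_writeback, spread_sync_demo), the temporary file, the 8192-byte block, and the one-second pause are all assumptions made up for the example. The final fsync is there because sync_file_range does not flush file metadata and is not a durability guarantee on its own.

```c
#define _GNU_SOURCE             /* exposes sync_file_range() in glibc (Linux only) */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: ask the kernel to start writing back the dirty
 * pages in [offset, offset + nbytes) without waiting for completion. */
static int start_range_writeback(int fd, off_t offset, off_t nbytes)
{
    assert(offset >= 0 && nbytes >= 0);
    return sync_file_range(fd, offset, nbytes, SYNC_FILE_RANGE_WRITE);
}

/* Hypothetical helper: block until writeback of the range has finished.
 * Data pages only; file metadata is not flushed by this call. */
static int wait_range_writeback(int fd, off_t offset, off_t nbytes)
{
    assert(offset >= 0 && nbytes >= 0);
    return sync_file_range(fd, offset, nbytes,
                           SYNC_FILE_RANGE_WAIT_BEFORE |
                           SYNC_FILE_RANGE_WRITE |
                           SYNC_FILE_RANGE_WAIT_AFTER);
}

/* Demo: dirty one block in a temp file, start writeback, pause, then
 * wait on the range and finish with fsync(). Returns 0 on success. */
static int spread_sync_demo(void)
{
    char path[] = "/tmp/spread_sync_XXXXXX";
    char block[8192];               /* assumed block size for the example */
    int rc = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);                   /* file disappears once fd is closed */
    memset(block, 'x', sizeof block);

    if (write(fd, block, sizeof block) == (ssize_t) sizeof block &&
        start_range_writeback(fd, 0, (off_t) sizeof block) == 0) {
        sleep(1);                   /* stand-in for "some time has passed" */
        if (wait_range_writeback(fd, 0, (off_t) sizeof block) == 0)
            rc = fsync(fd);         /* still needed for metadata/durability */
    }
    close(fd);
    return rc;
}
```

Whether the pause between the two calls actually smooths out the I/O depends on how the kernel schedules the writeback, which is exactly the undocumented part raised above. The sketch also shows the cost being weighed: the call exists only on Linux, so a real implementation would need a fallback path for every other platform.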