Re: O_DIRECT for relations and SLRUs (Prototype)
From | Thomas Munro |
---|---|
Subject | Re: O_DIRECT for relations and SLRUs (Prototype) |
Date | |
Msg-id | CAEepm=01B6YsdMBR9i3K8MyBAAVH1SacTaB4c+skJ3XUc7w+dA@mail.gmail.com |
In reply to | Re: O_DIRECT for relations and SLRUs (Prototype) (Andrey Borodin <x4mmm@yandex-team.ru>) |
Replies |
Re: O_DIRECT for relations and SLRUs (Prototype)
Re: O_DIRECT for relations and SLRUs (Prototype) |
List | pgsql-hackers |
On Sun, Jan 13, 2019 at 5:13 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote:
>
> Hi!
>
> > On Jan 12, 2019, at 9:46, Michael Paquier <michael@paquier.xyz> wrote:
> >
> > Attached is a toy patch that I have begun using for tests in this
> > area. That's nothing really serious at this stage, but you can use
> > that if you would like to see the impact of O_DIRECT. Of course,
> > things get significantly slower.
>
> Cool!
> I've just gathered a group of students to task them with experimenting with shared buffer eviction algorithms during their February internship at Yandex-Sirius edu project. Your patch seems very handy for benchmarks in this area.

+1, thanks for sharing the patch. Even though just turning on
O_DIRECT is the trivial part of this project, it's good to encourage
discussion. We may indeed become more sensitive to the quality of
buffer eviction algorithms, but it seems like the main work to regain
lost performance will be the background IO scheduling piece:

1. We need a new "bgreader" process to do read-ahead. I think you'd
want a way to tell it with explicit hints (for example, perhaps
sequential scans would advertise that they're reading sequentially so
that it starts to slurp future blocks into the buffer pool, and
streaming replicas might look ahead in the WAL and tell it what's
coming). In theory this might be better than the heuristics OSes use
to guess our access pattern and pre-fetch into the page cache, since
we have better information (and of course we're skipping a buffer
layer).

2. We need a new kind of bgwriter/syncer that aggressively creates
clean pages so that foreground processes rarely have to evict (since
that is now super slow), but also efficiently finds ranges of dirty
blocks that it can write in big sequential chunks.

3. We probably want SLRUs to use the main buffer pool, instead of
their own mini-pools, so they can benefit from the above.
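To make item 2 a bit more concrete, here is a minimal sketch of the "find ranges of dirty blocks" step. It is not taken from the patch or from PostgreSQL; `BlockRange`, `coalesce_dirty_blocks` and the `max_run` cap are hypothetical names chosen for illustration, and the input is assumed to be a plain array of dirty block numbers within one relation segment. It sorts the block numbers and merges consecutive ones into runs that a writer could then submit as single large sequential writes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical type: a run of consecutive dirty blocks. */
typedef struct BlockRange
{
    unsigned    start;      /* first block number in the run */
    unsigned    count;      /* number of consecutive blocks */
} BlockRange;

/* qsort comparator for unsigned block numbers. */
static int
cmp_block(const void *a, const void *b)
{
    unsigned    x = *(const unsigned *) a;
    unsigned    y = *(const unsigned *) b;

    return (x > y) - (x < y);
}

/*
 * Sort 'blocks' in place and merge consecutive block numbers into runs of
 * at most 'max_run' blocks.  Duplicate block numbers are ignored.  'out'
 * must have room for 'n' entries.  Returns the number of runs written.
 */
static size_t
coalesce_dirty_blocks(unsigned *blocks, size_t n,
                      BlockRange *out, unsigned max_run)
{
    size_t      nranges;

    assert(blocks != NULL && out != NULL && max_run > 0);

    if (n == 0)
        return 0;

    qsort(blocks, n, sizeof(unsigned), cmp_block);

    out[0].start = blocks[0];
    out[0].count = 1;
    nranges = 1;

    for (size_t i = 1; i < n; i++)
    {
        BlockRange *cur = &out[nranges - 1];

        if (blocks[i] == cur->start + cur->count - 1)
            continue;           /* duplicate of the last block in the run */

        if (blocks[i] == cur->start + cur->count && cur->count < max_run)
            cur->count++;       /* extends the current run */
        else
        {
            /* gap, or run already at max_run: start a new run */
            out[nranges].start = blocks[i];
            out[nranges].count = 1;
            nranges++;
        }
    }

    return nranges;
}
```

Each resulting run maps to one write of `count` blocks starting at block `start`. The `max_run` cap is only a stand-in for whatever limit on write size the real writer would want; how the dirty block list is gathered from the buffer pool is left out entirely.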
Whether we need multiple bgreader and bgwriter processes or perhaps a
general IO scheduler process may depend on whether we also want to
switch to async (multiplexing from a single process). Starting simple
with a traditional sync IO and N processes seems OK to me.

--
Thomas Munro
http://www.enterprisedb.com