Re: [HACKERS] Clock with Adaptive Replacement
От | Thomas Munro |
---|---|
Тема | Re: [HACKERS] Clock with Adaptive Replacement |
Дата | |
Msg-id | CAEepm=3p6TmW2oUFyp8zRnYVum1G72fwToDLRbv9kK_V1DU9xQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Clock with Adaptive Replacement (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>) |
Ответы |
Re: [HACKERS] Clock with Adaptive Replacement
(Andrey Borodin <x4mmm@yandex-team.ru>)
|
Список | pgsql-hackers |
On Fri, Feb 12, 2016 at 10:02 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > Hi hackers, > > What do you think about improving cache replacement clock-sweep algorithm in > PostgreSQL with adaptive version proposed in this article: > > http://www-cs.stanford.edu/~sbansal/pubs/fast04.pdf > > Are there some well known drawbacks of this approach or it will be > interesting to adopt this algorithm to PostgreSQL and measure it impact om > performance under different workloads? I'm not currently planning to work in this area and have done no real investigation, so please consider the following to be "water cooler talk". I also came across that paper while reading about buffering as general background. Yeah, CAR[T] looks pretty interesting at first glance. Automatic scan resistance seems like something we'd want, its approach to frequency sounds smarter than GCLOCK's usage counter with arbitrary parameter 5. Here's a nice slide deck from the authors: http://www.cse.iitd.ernet.in/~sbansal/pubs/fast04ppt.pdf There doesn't seem to be as much written about GCLOCK beyond the 1992 TPC-A paper[1], possibly because as the CAR paper says "[a] fundamental disadvantage of GCLOCK is that it requires counter increment on every page hit which makes it infeasible for virtual memory", making it less interesting to the OS guys. That is, we actually have slightly more information, because we have an opportunity to bump the usage counter when pinning and the VM guys don't, probably explaining why the paper compares with CLOCK and not GCLOCK. One question is whether SPC-1 looks anything like a typical database workload. Say someone wrote a prototype CAR[T] patch for PostgreSQL, how would you validate it? I'm also curious about how the algorithms at different levels interact when using buffered IO. While other databases typically do direct IO, you can still find a trail of papers about this topic since disk arrays and network-attached storage systems also have caches and page replacement problems[2][3], and of course multi-level caching problems apply to non-database applications too. I wonder to what extent Linux's use of "a mash of a number of different algorithms with a number of modifications for catching corner cases and various optimisations"[4] affects the performance of different clock-based algorithms operating higher up. If we had a page replacement algorithm with CART's magical claimed properties and it really does perform better than GCLOCK + our scan resistance special cases, I wonder if it would be tempting to stick SLRU contents into shared buffers. slru.c could gain a smgr-compatible interface so that bufmgr.c could treat those buffers like any other, using smgr_which to select the appropriate storage manager. (I'm going to be proposing something similar for undo log storage, more on that later.) [1] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.452.9699&rep=rep1&type=pdf [2] https://www.usenix.org/legacy/event/usenix01/full_papers/zhou/zhou.pdf [3] https://www.usenix.org/legacy/event/usenix02/full_papers/wong/wong_html/ [4] http://www.spinics.net/lists/linux-mm/msg13385.html -- Thomas Munro http://www.enterprisedb.com
В списке pgsql-hackers по дате отправления:
Предыдущее
От: David RowleyДата:
Сообщение: Re: bms_prev_member won't work correctly if bitmapword is 64-bits