Re: Page replacement algorithm in buffer cache
От | Merlin Moncure |
---|---|
Тема | Re: Page replacement algorithm in buffer cache |
Дата | |
Msg-id | CAHyXU0ycV+S33vL7wM+78vWDb=x3X=B-JKpYb66eHzQr5n2tQw@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Page replacement algorithm in buffer cache (Bruce Momjian <bruce@momjian.us>) |
Список | pgsql-hackers |
On Mon, Apr 1, 2013 at 3:32 PM, Bruce Momjian <bruce@momjian.us> wrote: > On Mon, Apr 1, 2013 at 11:55:07AM -0500, Merlin Moncure wrote: >> > In fact, BufFreelistLock is really misnamed, because for the most >> > part, the "free list" as we implement is going to be empty. What the >> > BufFreelistLock is really doing is serializing the process of scanning >> > for a free buffer. I think THAT is the problem. If we could arrange >> > things so as to hold BufFreelistLock only for the amount of time >> > needed to remove a buffer from a freelist ... we'd probably buy >> > ourselves quite a bit. >> >> right. I'm imaging a buffer scan loop that looks something like >> (uncompiled, untested) this. "TryLockBufHdr" does a simple TAS >> without spin, returning the lock state (well, true if it acquired the >> lock). usage_count is specifically and deliberately adjusted without >> having a lock on the buffer header (this would require some careful >> testing and possible changes elsewhere): > > TAS does a CPU 'lock' instruction which affects the cpu cache. Why not > just read the value with no lock? check again, that's exactly what it does. Note the old implementation did a LockBufHdr() before examining refcount. The key logic is here: if (buf->refcount == 0) { if (buf->usage_count > 0) { buf->usage_count--; trycounter = NBuffers; } else { if (TryLockBufHdr(buf) { So we do an unlocked read of refcount and immediately bail if the buffer is "locked" according to the cpu cache. Then we check (still unlocked) usage_count and decrement it: Our adjustment may be lost, but so what? Finally, we attempt one (and only one) cache line lock (via TAS_SPIN) of the buffer and again bail if we see any problems there. Thus, it's impossible to get stuck in a potentially yielding spin while holding the free list lock. I dub this: "The Frightened Turtle" strategy of buffer allocation. It's an idea based on my research trying to solve Vlad's issue of having server stalls during read-only loads (see here: http://postgresql.1045698.n5.nabble.com/High-SYS-CPU-need-advise-td5732045.html) for a general backgrounder. The idea may not actually fix his issue, or there may be other aggravating aspects such as the always-capricious linux scheduler, but I'm suspicious. merlin
В списке pgsql-hackers по дате отправления: