Re: Warm-cache prefetching
From | Qingqing Zhou |
---|---|
Subject | Re: Warm-cache prefetching |
Date | |
Msg-id | dnfsv4$1r7m$1@news.hub.org |
In reply to | Warm-cache prefetching (Qingqing Zhou <zhouqq@cs.toronto.edu>) |
List | pgsql-hackers |
"Simon Riggs" <simon@2ndquadrant.com> wrote > > You may be trying to use the memory too early. Prefetched memory takes > time to arrive in cache, so you may need to issue prefetch calls for N > +2, N+3 etc rather than simply N+1. > > p.6-11 covers this. > I actually tried it and no improvements have been observed. Also, this may conflict with "try to mix prefetch with computation" suggestion from the manual that you pointed out. But anyway, this looks like fixable compared to the following "prefetch distance" problem. As I read from the manual, this is one key factor of the efficiency, which also matches our intuition. However, when we process each tuple on a page, CPU clocks that are needed might be quite different: ---for (each tuple on a page){ if (ItemIdIsUsed(lpp)) /* some stopped here */ { ... /* some involves deeper functioncalls here */ valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer); if (valid) scan->rs_vistuples[ntup++]= lineoff; }} --- So it is pretty hard to predicate the prefetch distance. The prefetch improvements to memcpy/memmove does not have this problem, the prefecth distance can be fixed, and it does not change due to the different speed CPUs of the same processor serials. Maybe L2 cache is big enough so no need to worry about fetch too ahead? Seems not true, since this idea is vulnerable to a busy system. No data in L2 will be saved for you for a long time. As Luke suggested, the code above scan operators like sort might be a better place to look at. I will take a look there. Regards, Qingqing