Re: Spinlocks, yet again: analysis and proposed patches
From | Gregory Maxwell |
---|---|
Subject | Re: Spinlocks, yet again: analysis and proposed patches |
Date | |
Msg-id | e692861c05091520486eb9307c@mail.gmail.com |
In reply to | Re: Spinlocks, yet again: analysis and proposed patches (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: Spinlocks, yet again: analysis and proposed patches |
List | pgsql-hackers |
On 9/15/05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Yesterday's CVS tip:
>         1 32s   2 46s   4 88s   8 168s
> plus no-cmpb and spindelay2:
>         1 32s   2 48s   4 100s  8 177s
> plus just-committed code to pad LWLock to 32:
>         1 33s   2 50s   4 98s   8 179s
> alter to pad to 64:
>         1 33s   2 38s   4 108s  8 180s
>
> I don't know what to make of the 2-process time going down while
> 4-process goes up; that seems just weird. But both numbers are
> repeatable.

It is odd.

In the two-process case there is, assuming random behavior, a 1/2
chance that you've already got the right line, but in the four-process
case only a 1/4 chance (since we're on a 4-way box). This would
explain why we don't see as much cost in the intentionally misaligned
case. You'd expect a similar pattern of improvement with the 64-byte
alignment (some in the two-process case, but more in the four-process
case), but here we see more improvement in the two-process case.

If I had to guess, I might say that the 64-byte alignment is removing
much of the unneeded line bouncing in the two-process case but is at
the same time creating more risk of bouncing caused by aliasing. Since
two processes have a 1/2 chance, the aliasing isn't a problem, so the
change is a win. In the four-process case it's no longer a win,
because with aliasing there is still a lot of fighting over the cache
lines even if you pack well, and the decrease in packing makes odd
aliasing somewhat more likely. This might also explain why the
misaligned case performed so poorly in the four-process case: the
misalignment didn't just increase the cost 2x, it also increased the
likelihood of a bogus bounce due to aliasing.

If this is the case, then it may be possible, through very careful
memory alignment, to make sure that no two high-contention locks that
are likely to be contended at once share the same line (either through
aliasing or through being directly within the same line).

Then again, I could be completely wrong: my understanding of
multiprocessor cache coherency is very limited, and I have no clue how
cache aliasing fits into it. So the above is just uninformed
conjecture.
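
The two ways locks can "share a line" mentioned above (sitting in the same cache line, or aliasing to the same cache set) can be illustrated with a rough C sketch. Everything named here is an assumption for illustration: `padded_lock` is a hypothetical type, not PostgreSQL's LWLock; the 64-byte line size and 512 sets are made-up geometry; and a simple modulo-indexed cache is assumed. Real caches are often physically indexed, so arithmetic on virtual addresses is only an approximation.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Assumed cache geometry; real values depend on the CPU. */
#define CACHE_LINE_SIZE 64u   /* bytes per cache line (assumption) */
#define CACHE_NUM_SETS  512u  /* number of sets in the cache (assumption) */

/* Hypothetical lock type: aligning the lock word to the line size makes
 * the compiler pad the struct out to one full line, so adjacent array
 * elements never share a line. */
typedef struct padded_lock
{
    alignas(CACHE_LINE_SIZE) volatile int lock;
} padded_lock;

static_assert(sizeof(padded_lock) == CACHE_LINE_SIZE,
              "padded_lock should occupy exactly one cache line");

/* Index of the cache line that contains this address. */
static uintptr_t line_of(const void *p)
{
    return (uintptr_t) p / CACHE_LINE_SIZE;
}

/* Cache set that line maps to, under the modulo-indexing assumption. */
static uintptr_t set_of(const void *p)
{
    return line_of(p) % CACHE_NUM_SETS;
}

/* True if both addresses fall within the same cache line. */
static int same_line(const void *a, const void *b)
{
    return line_of(a) == line_of(b);
}

/* True if the addresses are in different lines that map to the same set. */
static int may_alias(const void *a, const void *b)
{
    return !same_line(a, b) && set_of(a) == set_of(b);
}
```

Under these assumptions, two adjacent `padded_lock` array elements are never in the same line, while two addresses exactly `CACHE_LINE_SIZE * CACHE_NUM_SETS` bytes apart land in the same set and so could evict each other. Checks like `same_line()` and `may_alias()` could be run over the addresses of the hottest locks to see whether a given layout avoids both cases.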