Re: [PATCH] Improve spinlock inline assembly for x86.
From | Andres Freund |
---|---|
Subject | Re: [PATCH] Improve spinlock inline assembly for x86. |
Date | |
Msg-id | 20160118225047.GZ10941@awork2.anarazel.de |
In response to | Re: [PATCH] Improve spinlock inline assembly for x86. (Kevin Grittner <kgrittn@gmail.com>) |
Responses | Re: [PATCH] Improve spinlock inline assembly for x86. |
List | pgsql-hackers |
On 2016-01-18 16:14:05 -0600, Kevin Grittner wrote:
> Unconvinced that we should do performance testing on a proposed
> performance patch before accepting it

I'm unconvinced that it makes sense to view this as a performance patch. And unconvinced that you can sanely measure it.

The lock prefix is a one-byte instruction prefix, and lock xchg and xchg are exactly the same, leaving the instruction width aside. It's just a little bit less work for the instruction decoder.

The point about alignment and such is that changing some code somewhere is likely to have a bigger performance impact than the actual effect of the removal of those few bytes. So when you benchmark, you'd just benchmark a slightly changed code layout.

objdump -d build/postgres/dev-assert/vpath/src/backend/postgres |grep 'lock xchg'|head -n1
  4b732f:       f0 86 01                lock xchg %al,(%rcx)

The f0 is the lock prefix. In total there are 22 of them in the postgres codebase, when compiled with my flags/compiler.

I think it's unrealistic to benchmark slight code movements on a regular basis, particularly using a large machine. There's just not enough time and hardware around for that.

Now, I'm equally unconvinced that it's worthwhile to do anything here. I just don't think benchmarking plays a role either way.

>, that the changes in NUMA
> scheduling in the Linux 3.8 kernel have a major effect on how well
> our code performs at high concurrency on NUMA machines with a lot
> of memory nodes

That I believe immediately.
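For readers who want to see what is being discussed: on x86, xchg with a memory operand is locked implicitly, which is why dropping the explicit lock prefix changes only the encoding (one f0 byte less) and not the behavior. The following is a minimal sketch of a test-and-set primitive written without the prefix, not the actual patch or the PostgreSQL source. The names `tas`, `spin_acquire` and `spin_release` are chosen here for illustration, and the non-x86 fallback to the GCC/Clang `__atomic` builtins is an assumption added so the sketch compiles elsewhere.

```c
#include <assert.h>

typedef unsigned char slock_t;

/*
 * Test-and-set: store 1 into *lock and return the previous value.
 * Returns 0 if the caller acquired the lock, nonzero if it was already held.
 */
static inline int
tas(volatile slock_t *lock)
{
    slock_t res = 1;

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /*
     * No "lock" prefix: xchg with a memory operand is implicitly locked,
     * so this is semantically the same as "lock xchgb", one byte shorter.
     */
    __asm__ __volatile__(
        "xchgb %0,%1"
        : "+q"(res), "+m"(*lock)
        : /* no inputs */
        : "memory", "cc");
#else
    /* Assumed fallback for other targets: compiler atomic builtin. */
    res = __atomic_exchange_n(lock, 1, __ATOMIC_ACQUIRE);
#endif
    return (int) res;
}

/* Busy-wait until the lock is obtained (no backoff; illustration only). */
static inline void
spin_acquire(volatile slock_t *lock)
{
    while (tas(lock))
        ;
}

/* Release the lock; the assert catches releasing a lock that is not held. */
static inline void
spin_release(volatile slock_t *lock)
{
    assert(*lock != 0);
    __atomic_store_n(lock, 0, __ATOMIC_RELEASE);
}
```

Compiling this and running the objdump command quoted above against the result should show plain xchg instructions instead of lock xchg. As the message argues, any timing difference would most likely be dominated by the changed code layout rather than by the missing prefix byte.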