Re: B-Tree support function number 3 (strxfrm() optimization)
From | Peter Geoghegan |
---|---|
Subject | Re: B-Tree support function number 3 (strxfrm() optimization) |
Date | |
Msg-id | CAM3SWZRihtyehnA2P1Wwk=a1zRWrFy2BjW7Q+MWNCMfPHiTGkg@mail.gmail.com |
In response to | Re: B-Tree support function number 3 (strxfrm() optimization) (Thom Brown <thom@linux.com>) |
List | pgsql-hackers |
On Thu, Apr 3, 2014 at 12:19 PM, Thom Brown <thom@linux.com> wrote:
> Looking good:
>
> -T 100 -n -f sort.sql
>
> Master: 21.670467 / 21.718653 (avg: 21.69456)
> Patch: 66.888756 / 66.888756 (avg: 66.888756)

These were almost exactly the same figures as I saw on my machine.
However, when compiling with certain additional flags -- with
CFLAGS="-O3 -march=native" -- I was able to squeeze more out of this.
My machine has a recent Intel CPU, "Intel(R) Core(TM) i7-3520M". With
these build settings the benchmark then averages about 75.5 tps across
multiple runs, which I'd call a fair additional improvement.

I tried this because I was specifically interested in the results of a
memcmp implementation that uses SIMD. I believe that these flags make
gcc/glibc use a memcmp implementation that takes advantage of SSE
where supported (and various subsequent extensions). Although I didn't
go to the trouble of verifying all this by going through the
disassembly, or instrumenting the code in any way, that is my best
guess as to what actually helped. I don't know how any of that might
be applied to improve matters in the real world, which is why I
haven't dived into this further, but it's worth being aware of.

--
Peter Geoghegan
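
For context on why memcmp speed matters here, the following is a minimal sketch of the general strxfrm()-then-memcmp() idea: each string is transformed once into a binary sort key, and keys are then compared bytewise. It is not code from the patch under discussion; the helper names make_sort_key and compare_sort_keys are hypothetical, and the sketch assumes whatever locale is currently in effect (the default "C" locale unless setlocale() has been called).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: build a strxfrm() sort key for s.
 * Returns a malloc'd buffer (caller frees) or NULL on allocation failure.
 * *len_out receives the key length, excluding the terminating NUL.
 */
static char *make_sort_key(const char *s, size_t *len_out)
{
    assert(s != NULL && len_out != NULL);

    /* With a size of 0, strxfrm() only reports the length it needs. */
    size_t need = strxfrm(NULL, s, 0);
    char *key = malloc(need + 1);

    if (key == NULL)
        return NULL;

    strxfrm(key, s, need + 1);
    *len_out = need;
    return key;
}

/*
 * Hypothetical helper: compare two sort keys bytewise.
 * When one key is a prefix of the other, the shorter key sorts first,
 * which matches what strcmp() would report for the same keys.
 */
static int compare_sort_keys(const char *a, size_t alen,
                             const char *b, size_t blen)
{
    size_t n = alen < blen ? alen : blen;
    int r = memcmp(a, b, n);

    if (r != 0)
        return r;
    return (alen > blen) - (alen < blen);
}
```

Because the comparison step is a plain memcmp() over the transformed keys, any speedup in the C library's memcmp (for example from a SIMD implementation) would apply directly to each comparison in a sketch like this one.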