Re: [RFC,PATCH] Avoid manual shift-and-test logic in AllocSetFreeIndex
From | Atsushi Ogawa |
---|---|
Subject | Re: [RFC,PATCH] Avoid manual shift-and-test logic in AllocSetFreeIndex |
Date | |
Msg-id | 4A2895C4.9050108@hi-ho.ne.jp |
In reply to | Re: [RFC,PATCH] Avoid manual shift-and-test logic in AllocSetFreeIndex (Jeremy Kerr <jk@ozlabs.org>) |
Responses | Re: [RFC,PATCH] Avoid manual shift-and-test logic in AllocSetFreeIndex |
List | pgsql-hackers |
Hi,

> Also, are you still seeing the same improvement with the __builtin_clz
> as your inline asm implementation?

In my benchmark program, the fls implementation and the inline asm implementation perform slightly differently. In pgbench, however, both give almost the same improvement.

Here is the result of my benchmark.

Xeon (Core architecture)

bytes | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512 | 1024 | mix |
---|---|---|---|---|---|---|---|---|---|---|
original | 0.780 | 0.790 | 0.820 | 0.870 | 0.930 | 0.980 | 1.040 | 1.080 | 1.140 | 0.910 |
inline asm | 0.320 | 0.180 | 0.190 | 0.180 | 0.190 | 0.180 | 0.190 | 0.180 | 0.190 | 0.170 |
fls | 0.270 | 0.260 | 0.290 | 0.290 | 0.290 | 0.290 | 0.290 | 0.300 | 0.290 | 0.380 |

Xeon (P4 architecture)

bytes | 4 | 8 | 16 | 32 | 64 | 128 | 256 | 512 | 1024 | mix |
---|---|---|---|---|---|---|---|---|---|---|
original | 0.520 | 0.520 | 0.670 | 0.780 | 0.950 | 1.000 | 1.060 | 1.190 | 1.250 | 0.940 |
inline asm | 0.610 | 0.530 | 0.530 | 0.520 | 0.520 | 0.540 | 0.540 | 0.580 | 0.540 | 0.600 |
fls | 0.390 | 0.370 | 0.780 | 0.780 | 0.780 | 0.790 | 0.780 | 0.780 | 0.780 | 0.520 |

pgbench result (measured by oprofile)

CPU: Xeon (P4 architecture)
test program: pgbench -c 1 -t 50000 (fsync=off)

original

samples | % | symbol name |
---|---|---|
66854 | 6.6725 | AllocSetAlloc |
11817 | 1.1794 | AllocSetFree |

inline asm

samples | % | symbol name |
---|---|---|
47610 | 4.9333 | AllocSetAlloc |
6248 | 0.6474 | AllocSetFree |

fls

samples | % | symbol name |
---|---|---|
48779 | 4.9954 | AllocSetAlloc |
7648 | 0.7832 | AllocSetFree |

Best regards,

---
Atsushi Ogawa
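For readers without the patch at hand, the kind of change being benchmarked in this thread can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the function names `free_index_loop` and `free_index_clz`, and the constants `MIN_BITS` and `FREELIST_COUNT`, are assumptions made up for the example. `__builtin_clzll` is the GCC/Clang builtin and is used here only as a stand-in for a find-last-set style implementation.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumed values for illustration only (hypothetical names). */
#define MIN_BITS        3   /* smallest chunk is 1 << 3 = 8 bytes */
#define FREELIST_COUNT 11   /* number of power-of-two size classes */

/* Manual shift-and-test: shift right until nothing is left, counting steps. */
static int
free_index_loop(size_t size)
{
    int idx = 0;

    if (size > 0)
    {
        size = (size - 1) >> MIN_BITS;
        while (size != 0)
        {
            idx++;
            size >>= 1;
        }
        assert(idx < FREELIST_COUNT);
    }
    return idx;
}

/* Same result from the position of the highest set bit (GCC/Clang builtin). */
static int
free_index_clz(size_t size)
{
    unsigned long long v;
    int idx;

    if (size == 0)
        return 0;

    v = (unsigned long long) ((size - 1) >> MIN_BITS);
    if (v == 0)
        return 0;           /* __builtin_clzll(0) is undefined, so guard it */

    /* bit width minus leading zeros = index of highest set bit, plus one */
    idx = (int) (sizeof(v) * CHAR_BIT) - __builtin_clzll(v);
    assert(idx < FREELIST_COUNT);
    return idx;
}
```

Under these assumptions both functions map a request size to the same size class, for example 8 bytes to index 0 and 9 bytes to index 1. The loop version does more iterations as the size grows, while the builtin version does a fixed amount of work; whether that matches the timing pattern in the tables above depends on the actual patch and hardware.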