Re: Using POPCNT and other advanced bit manipulation instructions
From | Jose Luis Tallon |
---|---|
Subject | Re: Using POPCNT and other advanced bit manipulation instructions |
Date | |
Msg-id | df95d1e8-8e63-256a-2d3e-ed2f8cff2f37@adv-solutions.net |
In response to | Using POPCNT and other advanced bit manipulation instructions (David Rowley <david.rowley@2ndquadrant.com>) |
Responses | Re: Using POPCNT and other advanced bit manipulation instructions |
List | pgsql-hackers |
On 20/12/18 6:53, David Rowley wrote:
> Back in 2016 [1] there was some discussion about using the POPCNT
> instruction to improve the performance of counting the number of bits
> set in a word. Improving this helps various cases, such as
> bms_num_members and also things like counting the allvisible and
> frozen pages in the visibility map.
>
> [snip]
>
> I've put together a very rough patch to implement using POPCNT and the
> leading and trailing 0-bit instructions to improve the performance of
> bms_next_member() and bms_prev_member(). The correct function should
> be determined on the first call to each function by way of setting a
> function pointer variable to the most suitable supported
> implementation. I've not yet gone through and removed all the
> number_of_ones[] arrays to replace with a pg_popcount*() call.

IMVHO: Please do not disregard potential optimization by the compiler
around those calls.. o_0  That might explain the reduced performance
improvement observed.

Not that I can see any obvious alternative to your implementation right
away ...

> That
> seems to have mostly been done in Thomas' patch [3], part of which
> I've used for the visibilitymap.c code changes. If this patch proves
> to be possible, then I'll look at including the other changes Thomas
> made in his patch too.
>
> What I'm really looking for by posting now are reasons why we can't do
> this. I'm also interested in getting some testing done on older
> machines, particularly machines with processors that are from before
> 2007, both AMD and Intel.

I can offer a 2005-vintage Opteron 2216 rev3 (bought late 2007) to test
on. Feel free to toss me some test code.

cpuinfo flags: fpu de tsc msr pae mce cx8 apic mca cmov pat clflush mmx
fxsr sse sse2 ht syscall nx mmxext fxsr_opt lm 3dnowext 3dnow rep_good
nopl extd_apicid eagerfpu pni cx16 hypervisor lahf_lm cmp_legacy
3dnowprefetch vmmcall

> 2007-2008 seems to be around the time both
> AMD and Intel added support for POPCNT and LZCNT, going by [4].

Thanks
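
For readers following the thread, here is a minimal sketch of the "choose on first call via a function pointer" approach described in the quoted text. It is not the code from the patch: every `hyp_*` name is hypothetical, and it assumes GCC or Clang on x86 for the `__builtin_cpu_supports()` check and the `target("popcnt")` attribute, falling back to a portable loop elsewhere. One plausible reading of the compiler-optimization concern above is that a call through a pointer like this generally cannot be inlined, so the per-call overhead may eat into the gain from the instruction itself.

```c
#include <assert.h>
#include <stdint.h>

/* All hyp_* names are hypothetical; none are taken from the patch. */

/* Portable fallback: clear the lowest set bit until none remain. */
static int
hyp_popcount64_slow(uint64_t word)
{
    int count = 0;

    while (word != 0)
    {
        word &= word - 1;
        count++;
    }
    return count;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HYP_HAVE_X86_POPCNT 1
/* Assumption: GCC/Clang; POPCNT is enabled for this one function only. */
__attribute__((target("popcnt")))
static int
hyp_popcount64_fast(uint64_t word)
{
    return __builtin_popcountll(word);
}
#endif

static int hyp_popcount64_choose(uint64_t word);

/* Starts at the chooser; after the first call it points at the chosen one. */
static int (*hyp_popcount64)(uint64_t word) = hyp_popcount64_choose;

/* First-call path: probe the CPU once, repoint the pointer, then delegate. */
static int
hyp_popcount64_choose(uint64_t word)
{
#ifdef HYP_HAVE_X86_POPCNT
    __builtin_cpu_init();
    if (__builtin_cpu_supports("popcnt"))
        hyp_popcount64 = hyp_popcount64_fast;
    else
#endif
        hyp_popcount64 = hyp_popcount64_slow;

    return hyp_popcount64(word);
}

/* Small sanity check that both paths agree on a few known values. */
void
hyp_popcount64_selfcheck(void)
{
    assert(hyp_popcount64(0) == 0);
    assert(hyp_popcount64(0xFFULL) == hyp_popcount64_slow(0xFFULL));
    assert(hyp_popcount64(UINT64_MAX) == 64);
}
```

On a machine whose cpuinfo flags lack `popcnt`, as in the list above, the chooser in this sketch would settle on the slow path, which is the case worth testing on pre-2007 hardware.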