RE: Popcount optimization using AVX512
От | Amonson, Paul D |
---|---|
Тема | RE: Popcount optimization using AVX512 |
Дата | |
Msg-id | BL1PR11MB530478ACABE588A836338E74DC322@BL1PR11MB5304.namprd11.prod.outlook.com обсуждение исходный текст |
Ответ на | Re: Popcount optimization using AVX512 (David Rowley <dgrowleyml@gmail.com>) |
Ответы |
RE: Popcount optimization using AVX512
|
Список | pgsql-hackers |
> -----Original Message----- > From: David Rowley <dgrowleyml@gmail.com> > Sent: Wednesday, March 20, 2024 5:28 PM > To: Amonson, Paul D <paul.d.amonson@intel.com> > Cc: Nathan Bossart <nathandbossart@gmail.com>; Andres Freund > > I'm not sure about this "extern negates inline" comment. It seems to me the > compiler is perfectly free to inline a static function into an external function > and it's free to inline the static function elsewhere within the same .c file. > > The final sentence of the following comment that the 0001 patch removes > explains this: > > /* > * When the POPCNT instruction is not available, there's no point in using > * function pointers to vary the implementation between the fast and slow > * method. We instead just make these actual external functions when > * TRY_POPCNT_FAST is not defined. The compiler should be able to inline > * the slow versions here. > */ > > Also, have a look at [1]. You'll see f_slow() wasn't even compiled and the code > was just inlined into f(). I just added the > __attribute__((noinline)) so that usage() wouldn't just perform constant > folding and just return 6. > > I think, unless you have evidence that some common compiler isn't inlining the > static into the extern then we shouldn't add the macros. > It adds quite a bit of churn to the patch and will break out of core code as you > no longer have functions named pg_popcount32(), > pg_popcount64() and pg_popcount(). This may be a simple misunderstanding extern != static. If I use the "extern" keyword then a symbol *will* be generated andinline will be ignored. This is NOT true of "static inline", where the compiler will try to inline the method. :) In this patch set: * I removed the macro implementation. * Made everything that could possibly be inlined marked with the "static inline" keyword. * Conditionally made the *_slow() functions "static inline" when TRY_POPCONT_FAST is not set. * Found and fixed some whitespace errors in the AVX code implementation. Thanks, Paul
Вложения
В списке pgsql-hackers по дате отправления: