RE: Popcount optimization using AVX512
From | Amonson, Paul D |
---|---|
Subject | RE: Popcount optimization using AVX512 |
Date | |
Msg-id | BL1PR11MB530473FB4E9CBD68C28896F7DC282@BL1PR11MB5304.namprd11.prod.outlook.com |
In response to | Re: Popcount optimization using AVX512 (Nathan Bossart <nathandbossart@gmail.com>) |
Responses | RE: Popcount optimization using AVX512 |
List | pgsql-hackers |
> -----Original Message-----
> From: Nathan Bossart <nathandbossart@gmail.com>
> Sent: Friday, March 15, 2024 8:06 AM
> To: Amonson, Paul D <paul.d.amonson@intel.com>
> Cc: Andres Freund <andres@anarazel.de>; Alvaro Herrera <alvherre@alvh.no-ip.org>;
> Shankaran, Akash <akash.shankaran@intel.com>; Noah Misch <noah@leadboat.com>;
> Tom Lane <tgl@sss.pgh.pa.us>; Matthias van de Meent <boekewurm+postgres@gmail.com>;
> pgsql-hackers@lists.postgresql.org
> Subject: Re: Popcount optimization using AVX512
>
> Which test suite did you run? Those numbers seem potentially
> indistinguishable from noise, which probably isn't great for such a large patch
> set.

I ran...

    psql -c "select bitcount(column) from table;"

...in a loop with "column" widths of 84, 4096, 8192, and 16384 containing random data. The DB has 1 million rows. In the loop, before calling the select, I have code to clear all system caches. If I omit the code to clear system caches, the margin of error remains the same but the improvement percent changes from 1.2% to 14.6% (much less I/O when cached data is available).

> I ran John Naylor's test_popcount module [0] with the following command on
> an i7-1195G7:
>
> time psql postgres -c 'select drive_popcount(10000000, 1024)'
>
> Without your patches, this seems to take somewhere around 8.8 seconds.
> With your patches, it takes 0.6 seconds. (I re-compiled and re-ran the tests a
> couple of times because I had a difficult time believing the amount of
> improvement.)

When I tested the code outside postgres in a micro benchmark I got 200-300% improvements. Your results are interesting, as they imply more than 300% improvement. Let me do some research on the benchmark you referenced. However, in all cases it seems that there is no regression, so should we move forward on merging while I run some more local tests?

Thanks,
Paul
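
For reference, the arithmetic behind the "more than 300%" remark can be sketched as below. This is a minimal illustration, not part of the patch or of either benchmark: the function names `speedup` and `improvement_percent` are hypothetical, and it assumes "improvement percent" means (old time / new time - 1) * 100, which the thread does not define explicitly. The only inputs are the two timings quoted above (about 8.8 seconds without the patches, 0.6 seconds with them).

```python
def speedup(before_s: float, after_s: float) -> float:
    """Return how many times faster the 'after' run is than the 'before' run."""
    if before_s <= 0 or after_s <= 0:
        raise ValueError("timings must be positive")
    return before_s / after_s


def improvement_percent(before_s: float, after_s: float) -> float:
    """Improvement as a percentage, ASSUMING the definition (before/after - 1) * 100.

    The thread does not say which definition of "improvement" is meant;
    this is one common reading.
    """
    return (speedup(before_s, after_s) - 1.0) * 100.0


if __name__ == "__main__":
    # Timings quoted in the message for drive_popcount(10000000, 1024).
    before, after = 8.8, 0.6
    print(f"speedup: {speedup(before, after):.1f}x")
    print(f"improvement: {improvement_percent(before, after):.0f}%")
```

Under that assumed definition, the quoted timings correspond to a speedup of roughly 14.7x, well above the 200-300% range seen in the standalone micro benchmark.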