RE: Improve CRC32C performance on SSE4.2
From | Devulapalli, Raghuveer |
---|---|
Subject | RE: Improve CRC32C performance on SSE4.2 |
Date | |
Msg-id | PH8PR11MB82866B07AA6758D12F699C00FB70A@PH8PR11MB8286.namprd11.prod.outlook.com |
In reply to | Re: Improve CRC32C performance on SSE4.2 (John Naylor <johncnaylorls@gmail.com>) |
List | pgsql-hackers |

Great catch! From the intrinsic manual:

"Cast vector of type __m128i to type __m512i; the upper 384 bits of the result are undefined."

Replacing that with _mm512_zextsi128_si512 fixes the problem.

> -----Original Message-----
> From: Nathan Bossart <nathandbossart@gmail.com>
> Sent: Monday, June 16, 2025 3:14 PM
> To: Devulapalli, Raghuveer <raghuveer.devulapalli@intel.com>
> Cc: John Naylor <johncnaylorls@gmail.com>; Andy Fan
> <zhihuifan1213@163.com>; Jesper Pedersen <jesperpedersen.db@gmail.com>;
> Tomas Vondra <tomas@vondra.me>; pgsql-hackers@lists.postgresql.org;
> Shankaran, Akash <akash.shankaran@intel.com>
> Subject: Re: Improve CRC32C performance on SSE4.2
>
> On Mon, Jun 16, 2025 at 06:31:11PM +0000, Devulapalli, Raghuveer wrote:
> > Attached is a simple reproducer. It passes with clang v16 -O0, but
> > fails with 17 and 18 only when built with -O0..
>
> I've just started looking into this, but the difference in code generated for
> _mm512_castsi128_si512() between gcc, clang 16, and clang 17 looks interesting.
>
> --
> nathan
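
For readers following along, here is a minimal sketch of the fix described above. It is not the reproducer attached upthread or code from the patch: the helper names upper_384_bits_are_zero and check_zext are hypothetical, and it assumes an x86-64 build with gcc or clang. It widens a 128-bit vector with _mm512_zextsi128_si512 and checks that the upper 384 bits come out as zero, which _mm512_castsi128_si512 does not promise.

```c
#include <immintrin.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): widen a 128-bit value to 512 bits
 * and report whether the low 128 bits are preserved and the upper 384 bits
 * (64-bit lanes 2..7) are all zero.
 */
__attribute__((target("avx512f")))
static int
upper_384_bits_are_zero(uint64_t lo, uint64_t hi)
{
    __m128i     x = _mm_set_epi64x((long long) hi, (long long) lo);

    /*
     * _mm512_castsi128_si512(x) would leave the upper 384 bits undefined, so
     * whatever happened to be in the register could leak into later
     * arithmetic. The zext variant guarantees zeros there.
     */
    __m512i     z = _mm512_zextsi128_si512(x);
    uint64_t    lanes[8];

    _mm512_storeu_si512(lanes, z);

    if (lanes[0] != lo || lanes[1] != hi)
        return 0;
    for (int i = 2; i < 8; i++)
    {
        if (lanes[i] != 0)
            return 0;
    }
    return 1;
}

/*
 * Returns 1 if the check passes, 0 if it fails, and -1 if the CPU does not
 * report AVX-512F support (in which case nothing is executed).
 */
static int
check_zext(void)
{
    if (!__builtin_cpu_supports("avx512f"))
        return -1;
    return upper_384_bits_are_zero(UINT64_C(0x0123456789abcdef),
                                   UINT64_C(0xfedcba9876543210));
}
```

Swapping the zext call for _mm512_castsi128_si512 in this sketch may still appear to pass, since an undefined result can happen to be zero; that would be consistent with the reproducer failing only on some compiler versions and optimization levels.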