On Thu, Nov 23, 2023 at 1:49 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> On Wed, Nov 22, 2023 at 02:54:13PM +0200, Ants Aasma wrote:
> > For reference, executing the page checksum 10M times on an AMD 3900X CPU:
> >
> > clang-14 -O2 4.292s (17.8 GiB/s)
> > clang-14 -O2 -msse4.1 2.859s (26.7 GiB/s)
> > clang-14 -O2 -msse4.1 -mavx2 1.378s (55.4 GiB/s)
>
> Nice. I've noticed similar improvements with AVX2 intrinsics in simd.h.
If you're thinking of supporting AVX2 anywhere, I'd start with the page
checksum. There's much less code to review, and less risk.
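
For anyone following along, here is a rough sketch of why compiler flags
alone can move the numbers quoted above. It is only loosely modeled on a
lane-parallel, FNV-1a-style page checksum and is not the actual PostgreSQL
code. Every sketch_* name, the 8192-byte page size, the 32 lanes, and the
seed values are assumptions made for illustration. The inner loop has no
dependency between lanes, so a compiler is free to vectorize it when built
with something like -O2 -mavx2. The small helper at the end reproduces the
GiB/s arithmetic, again assuming 8192-byte pages.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 8192        /* assumed page size in bytes */
#define SKETCH_N_LANES   32          /* assumed number of independent sums */
#define SKETCH_FNV_PRIME 16777619u   /* 32-bit FNV prime */

/* One mixing step for a single lane: xor, multiply, then shift-xor. */
static inline uint32_t
sketch_mix(uint32_t sum, uint32_t value)
{
    uint32_t tmp = sum ^ value;

    return (tmp * SKETCH_FNV_PRIME) ^ (tmp >> 17);
}

/* Checksum one page by running SKETCH_N_LANES independent sums over it. */
uint32_t
sketch_page_checksum(const unsigned char *page)
{
    uint32_t sums[SKETCH_N_LANES];
    uint32_t words[SKETCH_N_LANES];
    uint32_t result = 0;
    size_t   row;
    size_t   lane;

    /* Arbitrary per-lane seeds (hypothetical values). */
    for (lane = 0; lane < SKETCH_N_LANES; lane++)
        sums[lane] = (uint32_t) (lane + 1);

    for (row = 0; row < SKETCH_PAGE_SIZE / sizeof(words); row++)
    {
        /* memcpy avoids alignment and aliasing problems. */
        memcpy(words, page + row * sizeof(words), sizeof(words));

        /* Lanes do not depend on each other, so this loop can vectorize. */
        for (lane = 0; lane < SKETCH_N_LANES; lane++)
            sums[lane] = sketch_mix(sums[lane], words[lane]);
    }

    /* Fold the lanes into a single 32-bit value. */
    for (lane = 0; lane < SKETCH_N_LANES; lane++)
        result ^= sums[lane];

    return result;
}

/* Throughput in GiB/s for a number of pages checksummed in `seconds`. */
double
sketch_throughput_gib(double pages, double seconds)
{
    return pages * SKETCH_PAGE_SIZE / seconds / (1024.0 * 1024.0 * 1024.0);
}
```

With 10M pages, sketch_throughput_gib(10e6, 4.292) comes out at roughly
17.8, which agrees with the first figure in the quoted table if the
benchmark used 8 kB pages.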