Re: add AVX2 support to simd.h

From: Nathan Bossart
Subject: Re: add AVX2 support to simd.h
Msg-id 20240321170944.GA1767527@nathanxps13
In reply to: Re: add AVX2 support to simd.h  (John Naylor <johncnaylorls@gmail.com>)
Responses: Re: add AVX2 support to simd.h  (Nathan Bossart <nathandbossart@gmail.com>)
Re: add AVX2 support to simd.h  (Nathan Bossart <nathandbossart@gmail.com>)
Re: add AVX2 support to simd.h  (John Naylor <johncnaylorls@gmail.com>)
List: pgsql-hackers
On Thu, Mar 21, 2024 at 11:30:30AM +0700, John Naylor wrote:
> I'm much happier about v5-0001. With a small tweak it would match what
> I had in mind:
> 
> +	if (nelem < nelem_per_iteration)
> +		goto one_by_one;
> 
> If this were "<=" then for long arrays we could assume there is
> always more than one block, and wouldn't need to check if any elements
> remain -- first block, then a single loop and it's done.
> 
> The loop could also then be a "do while" since it doesn't have to
> check the exit condition up front.

Good idea.  That causes us to re-check all of the tail elements when the
number of elements is evenly divisible by nelem_per_iteration, but that
might be worth the trade-off.
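
To make the trade-off concrete, here is a minimal sketch of the control flow
as I read the suggestion.  It is not the patch itself: lfind32_sketch,
block_contains, and NELEM_PER_ITERATION are hypothetical names, the block
size of 16 assumes four SSE2 registers of four uint32 lanes each, and a
scalar loop stands in for the SIMD comparison.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed block size: 4 registers x 4 uint32 lanes (SSE2-like). */
#define NELEM_PER_ITERATION 16u

/* Hypothetical stand-in for the SIMD helper: scans one full block. */
static bool
block_contains(uint32_t key, const uint32_t *block)
{
    bool        found = false;

    for (uint32_t j = 0; j < NELEM_PER_ITERATION; j++)
        found |= (block[j] == key);
    return found;
}

static bool
lfind32_sketch(uint32_t key, const uint32_t *base, uint32_t nelem)
{
    uint32_t    i = 0;

    /* nelem rounded down to a multiple of the block size */
    const uint32_t tail_idx = nelem & ~(NELEM_PER_ITERATION - 1);

    /* "<=" rather than "<": short arrays take the scalar path */
    if (nelem <= NELEM_PER_ITERATION)
        goto one_by_one;

    /* Past the check above, at least one full block precedes the tail. */
    assert(tail_idx >= NELEM_PER_ITERATION);

    /* No up-front exit check is needed, so this can be a do-while. */
    do
    {
        if (block_contains(key, &base[i]))
            return true;
        i += NELEM_PER_ITERATION;
    } while (i < tail_idx);

    /*
     * Finish with one block that ends exactly at the end of the array.  It
     * overlaps elements already checked; when nelem is a multiple of the
     * block size, it re-checks the entire last block.
     */
    return block_contains(key, &base[nelem - NELEM_PER_ITERATION]);

one_by_one:
    for (; i < nelem; i++)
    {
        if (base[i] == key)
            return true;
    }
    return false;
}
```

With nelem = 32 in this sketch, the loop covers elements 0..31 and the final
block covers 16..31 again, which is the redundant work mentioned above.  With
nelem = 40, the final block covers 24..39, so only 24..31 are checked twice.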

> Yes, that spike is weird, because it seems super-linear. However, the
> more interesting question for me is: AVX2 isn't really buying much for
> the numbers covered in this test. Between 32 and 48 elements, and
> between 64 and 80, it's indistinguishable from SSE2. The jumps to the
> next shelf are postponed, but the jumps are just as high. From earlier
> system benchmarks, I recall it eventually wins out with hundreds of
> elements, right? Is that still true?

It does still eventually win, although not nearly to the same extent as
before.  I extended the benchmark a bit to show this.  I wouldn't be
devastated if we only got 0001 committed for v17, given these results.

> Further, now that the algorithm is more SIMD-appropriate, I wonder
> what doing 4 registers at a time is actually buying us for either SSE2
> or AVX2. It might just be a matter of scale, but that would be good to
> understand.

I'll follow up with these numbers shortly.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
