Re: Popcount optimization using AVX512

From: Alvaro Herrera
Subject: Re: Popcount optimization using AVX512
Msg-id: 202404011106.y4fci35kzdqt@alvherre.pgsql
In reply to: Re: Popcount optimization using AVX512  (Nathan Bossart <nathandbossart@gmail.com>)
Responses: Re: Popcount optimization using AVX512  (Nathan Bossart <nathandbossart@gmail.com>)
List: pgsql-hackers

On 2024-Mar-31, Nathan Bossart wrote:

> +uint64
> +pg_popcount_avx512(const char *buf, int bytes)
> +{
> +    uint64        popcnt;
> +    __m512i        accum = _mm512_setzero_si512();
> +
> +    for (; bytes >= sizeof(__m512i); bytes -= sizeof(__m512i))
> +    {
> +        const        __m512i val = _mm512_loadu_si512((const __m512i *) buf);
> +        const        __m512i cnt = _mm512_popcnt_epi64(val);
> +
> +        accum = _mm512_add_epi64(accum, cnt);
> +        buf += sizeof(__m512i);
> +    }
> +
> +    popcnt = _mm512_reduce_add_epi64(accum);
> +    return popcnt + pg_popcount_fast(buf, bytes);
> +}

Hmm, doesn't this arrangement incur an extra function call to
pg_popcount_fast on every invocation, even when the loop has already
consumed the whole buffer?  Given the level of micro-optimization in
this code, I would have thought that you'd have tried to avoid that.
(At least, maybe avoid the call if bytes is 0, no?)
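
To make the suggestion concrete, here is a minimal sketch of guarding the
tail call.  It is portable C, not the patch itself: the 64-byte block loop
is a scalar stand-in for the AVX-512 loop, and popcount_tail(),
popcount_blocks(), popcount64(), BLOCK_SIZE and the tail_calls counter are
all hypothetical names used only for illustration (popcount_tail() plays
the role of pg_popcount_fast).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: sizeof(__m512i) in the quoted patch is 64 bytes. */
#define BLOCK_SIZE 64

/* Hypothetical instrumentation: how often the tail routine was called. */
static int tail_calls = 0;

/* Portable popcount of one 64-bit word (clears the lowest set bit per step). */
static uint64_t
popcount64(uint64_t word)
{
    uint64_t count = 0;

    while (word != 0)
    {
        word &= word - 1;
        count++;
    }
    return count;
}

/* Stand-in for pg_popcount_fast(): handles the leftover bytes one by one. */
static uint64_t
popcount_tail(const char *buf, int bytes)
{
    uint64_t popcnt = 0;

    tail_calls++;
    while (bytes-- > 0)
        popcnt += popcount64((unsigned char) *buf++);
    return popcnt;
}

/* Scalar stand-in for the AVX-512 loop, with the tail call guarded. */
static uint64_t
popcount_blocks(const char *buf, int bytes)
{
    uint64_t popcnt = 0;

    assert(bytes >= 0);

    for (; bytes >= BLOCK_SIZE; bytes -= BLOCK_SIZE)
    {
        uint64_t words[BLOCK_SIZE / sizeof(uint64_t)];

        /* memcpy avoids unaligned access, like the unaligned load in the patch */
        memcpy(words, buf, BLOCK_SIZE);
        for (size_t i = 0; i < BLOCK_SIZE / sizeof(uint64_t); i++)
            popcnt += popcount64(words[i]);
        buf += BLOCK_SIZE;
    }

    /* Only pay for the function call when bytes actually remain. */
    if (bytes > 0)
        popcnt += popcount_tail(buf, bytes);

    return popcnt;
}
```

With a buffer whose length is a multiple of BLOCK_SIZE, popcount_tail() is
never entered; whether the saved call is measurable in practice would need
benchmarking.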

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"El Maquinismo fue proscrito so pena de cosquilleo hasta la muerte"
(Ijon Tichy en Viajes, Stanislaw Lem)


