Re: add AVX2 support to simd.h

From: John Naylor
Subject: Re: add AVX2 support to simd.h
Date:
Msg-id: CANWCAZafKPUBYdNdtqZLVxVJhSn-ONeo_tp1FsODcn7udjKwRQ@mail.gmail.com
In response to: Re: add AVX2 support to simd.h  (Nathan Bossart <nathandbossart@gmail.com>)
Responses: Re: add AVX2 support to simd.h  (Nathan Bossart <nathandbossart@gmail.com>)
List: pgsql-hackers

On Tue, Mar 19, 2024 at 10:16 AM Nathan Bossart
<nathandbossart@gmail.com> wrote:
>
> On Tue, Mar 19, 2024 at 10:03:36AM +0700, John Naylor wrote:
> > I took a brief look, and 0001 isn't quite what I had in mind. I can't
> > quite tell what it's doing with the additional branches and "goto
> > retry", but I meant something pretty simple:
>
> Do you mean 0002?  0001 just adds a 2-register loop for remaining elements
> once we've exhausted what can be processed with the 4-register loop.

Sorry, I was looking at v2 at the time.

> > - if short, do one element at a time and return
>
> 0002 does this.

That part looks fine.

> > - if long, do one block unconditionally, then round the start pointer
> > up so that "end - start" is an exact multiple of blocks, and loop over
> > them
>
> 0002 does the opposite of this.  That is, after we've completed as many
> blocks as possible, we move the iterator variable back to "end -
> block_size" and do one final iteration to cover all the remaining elements.

Sounds similar in principle, but it looks really complicated. I don't
think the additional loops and branches are a good way to go, either
for readability or for branch prediction. My sketch has one branch for
which loop to do, and then performs only one loop. Let's do the
simplest thing that could work. (I think we might need a helper
function to do the block, but the rest should be easy.)
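
To make the sketch above concrete, here is one possible reading of it in
plain C. This is an illustration only, not the patch: the names
lfind32_sketch, block_contains and BLOCK_ELEMS are hypothetical, and the
helper uses a scalar loop where real code would use vector registers. The
point is the control flow: one branch to choose the short or long path,
one unconditional block, then a single loop over whole blocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: number of elements handled per block. */
#define BLOCK_ELEMS 16

/*
 * Hypothetical helper: test exactly one block of BLOCK_ELEMS elements.
 * Scalar stand-in for the vector comparisons.
 */
static inline bool
block_contains(uint32_t key, const uint32_t *block)
{
	bool		found = false;

	for (size_t i = 0; i < BLOCK_ELEMS; i++)
		found |= (block[i] == key);
	return found;
}

static bool
lfind32_sketch(uint32_t key, const uint32_t *base, size_t nelem)
{
	const uint32_t *cur = base;
	const uint32_t *end = base + nelem;

	assert(base != NULL || nelem == 0);

	/* Short input: one element at a time, then return. */
	if (nelem < BLOCK_ELEMS)
	{
		for (; cur < end; cur++)
		{
			if (*cur == key)
				return true;
		}
		return false;
	}

	/* Long input: do the first block unconditionally. */
	if (block_contains(key, cur))
		return true;

	/*
	 * Round the start pointer up so that (end - cur) is an exact multiple
	 * of BLOCK_ELEMS.  The new start is 1..BLOCK_ELEMS elements past base,
	 * so it may overlap the first block but never leaves a gap.
	 */
	cur = end - ((nelem - 1) / BLOCK_ELEMS) * BLOCK_ELEMS;

	/* The only loop on this path: whole blocks, no tail handling. */
	for (; cur < end; cur += BLOCK_ELEMS)
	{
		if (block_contains(key, cur))
			return true;
	}
	return false;
}
```

Some elements near the start may be examined twice on the long path. For
a membership test that is harmless, and it is what lets the tail loop and
its extra branches go away.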
