Re: add AVX2 support to simd.h
From | John Naylor |
---|---|
Subject | Re: add AVX2 support to simd.h |
Date | |
Msg-id | CANWCAZbphuJTDjusRBGWk1R-z8Z-kvjMjsC5X4A6rjTN54MOFw@mail.gmail.com |
In response to | Re: add AVX2 support to simd.h (Nathan Bossart <nathandbossart@gmail.com>) |
Responses | Re: add AVX2 support to simd.h |
List | pgsql-hackers |
On Tue, Mar 19, 2024 at 11:30 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> > Sounds similar in principle, but it looks really complicated. I don't
> > think the additional loops and branches are a good way to go, either
> > for readability or for branch prediction. My sketch has one branch for
> > which loop to do, and then performs only one loop. Let's do the
> > simplest thing that could work. (I think we might need a helper
> > function to do the block, but the rest should be easy)
>
> I tried to trim some of the branches, and came up with the attached patch.
> I don't think this is exactly what you were suggesting, but I think it's
> relatively close. My testing showed decent benefits from using 2 vectors
> when there aren't enough elements for 4, so I've tried to keep that part
> intact.

I would caution against that if the benchmark is repeatedly running against a static number of elements, because the branch predictor will be right all the time (except maybe when it exits a loop, not sure). We probably don't need to go to the trouble to construct a benchmark with some added randomness, but we have to be careful not to overfit to what the test is actually measuring.
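
For illustration only, here is a minimal sketch of what "added randomness" could look like in a micro-benchmark: the element count changes on every call, so the branch that picks a code path by length is no longer trivially predictable. Everything in it is an assumption made for the example. `lfind32_standin`, `xorshift32` and `run_searches` are hypothetical names, the search is a plain scalar loop standing in for whatever function is really being measured, and none of this is taken from the patch under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the function under test: a scalar linear
 * search. A real benchmark would call the SIMD implementation here.
 */
static bool
lfind32_standin(uint32_t key, const uint32_t *base, size_t nelem)
{
	for (size_t i = 0; i < nelem; i++)
	{
		if (base[i] == key)
			return true;
	}
	return false;
}

/* xorshift32: tiny deterministic PRNG, so runs are repeatable. */
static uint32_t
xorshift32(uint32_t *state)
{
	uint32_t	x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/*
 * Perform `iterations` searches for `key` in a prefix of `base`.
 *
 * If `vary` is true, the prefix length is drawn pseudo-randomly from
 * [1, max_len] on every call; otherwise it is always max_len, which is
 * the "static number of elements" case.
 *
 * Returns the number of successful searches, which also keeps the
 * compiler from discarding the calls.
 */
static uint64_t
run_searches(const uint32_t *base, size_t max_len, int iterations,
			 bool vary, uint32_t key, uint32_t seed)
{
	uint32_t	state = seed;
	uint64_t	hits = 0;

	assert(max_len > 0);
	assert(seed != 0);			/* xorshift32 would stay at zero forever */

	for (int i = 0; i < iterations; i++)
	{
		size_t		nelem = max_len;

		if (vary)
			nelem = 1 + (size_t) (xorshift32(&state) % max_len);

		if (lfind32_standin(key, base, nelem))
			hits++;
	}
	return hits;
}
```

One way to use it would be to time `run_searches` once with `vary` set to false and once with it set to true, over the same array and with a key that is not present, and compare the two. The varying run does less work per call on average, so the comparison only makes sense per element searched, and the cost of the PRNG call should be measured separately or kept out of the timed region.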