Re: use SSE2 for is_valid_ascii
From | John Naylor |
---|---|
Subject | Re: use SSE2 for is_valid_ascii |
Date | |
Msg-id | CAFBsxsFXym2h5LZiHUCP=WQzvDMeSgp1+A3UoBb0jtDdHTsWtQ@mail.gmail.com |
In reply to | Re: use SSE2 for is_valid_ascii (Nathan Bossart <nathandbossart@gmail.com>) |
Responses | Re: use SSE2 for is_valid_ascii |
List | pgsql-hackers |
On Thu, Aug 11, 2022 at 5:31 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
>
> This is a neat patch. I don't know that we need an entirely separate code
> block for the USE_SSE2 path, but I do think that a little bit of extra
> commentary would improve the readability. IMO the existing comment for the
> zero accumulator has the right amount of detail.
>
> + /*
> + * Set all bits in each lane of the error accumulator where input
> + * bytes are zero.
> + */
> + error_cum = _mm_or_si128(error_cum,
> + _mm_cmpeq_epi8(chunk, _mm_setzero_si128()));

Okay, I will think about the comments, thanks for looking.

> I wonder if reusing a zero vector (instead of creating a new one every
> time) has any noticeable effect on performance.

Creating a zeroed register is just FOO PXOR FOO, which should get
hoisted out of the (unrolled in this case) loop, and which a recent
CPU will just map to a hard-coded zero in the register file, in which
case the execution latency is 0 cycles. :-)

--
John Naylor
EDB: http://www.enterprisedb.com
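
For readers following the exchange, here is a minimal sketch of the accumulator pattern under discussion. It is not the patch's code: the function name ascii_chunks_ok, the highbit_cum variable, and the restriction that the length be a multiple of 16 are assumptions made for illustration. Only error_cum, chunk, and the intrinsics in the quoted hunk come from the thread. The zero vector is built once before the loop, which is roughly what a compiler would be expected to do when hoisting the PXOR.

```c
#include <emmintrin.h>			/* SSE2 intrinsics */
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch): returns true if the first
 * len bytes of s are all nonzero 7-bit ASCII.  For simplicity this
 * sketch assumes len is a multiple of 16.
 */
static bool
ascii_chunks_ok(const unsigned char *s, size_t len)
{
	/* Built once; compilers typically emit a PXOR of a register with itself. */
	const __m128i zero = _mm_setzero_si128();
	__m128i		error_cum = zero;	/* lanes become all-ones where a byte was zero */
	__m128i		highbit_cum = zero; /* bitwise OR of every input chunk */

	for (size_t i = 0; i < len; i += 16)
	{
		__m128i		chunk = _mm_loadu_si128((const __m128i *) (s + i));

		/* Set all bits in each lane where the input byte is zero. */
		error_cum = _mm_or_si128(error_cum, _mm_cmpeq_epi8(chunk, zero));

		/* Remember any high bit seen in any byte. */
		highbit_cum = _mm_or_si128(highbit_cum, chunk);
	}

	/* Combine both error sources, then collect the top bit of each lane. */
	error_cum = _mm_or_si128(error_cum, highbit_cum);
	return _mm_movemask_epi8(error_cum) == 0;
}
```

Called on a 16-byte buffer of plain letters this returns true; an embedded zero byte or any byte with the high bit set makes it return false. Whether the zero vector is written inside or outside the loop in the source should not matter much if the compiler hoists it as described above, so this sketch only makes that hoisting explicit.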