Re: [PATCH] Hex-coding optimizations using SVE on ARM.
От | John Naylor |
---|---|
Тема | Re: [PATCH] Hex-coding optimizations using SVE on ARM. |
Дата | |
Msg-id | CANWCAZZMaxfO3CLr1_ViUZMknp_M9iJ0LJbw63E4xMRanGiSqg@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [PATCH] Hex-coding optimizations using SVE on ARM. (Nathan Bossart <nathandbossart@gmail.com>) |
Ответы |
Re: [PATCH] Hex-coding optimizations using SVE on ARM.
|
Список | pgsql-hackers |
On Fri, Sep 26, 2025 at 1:50 AM Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Thu, Sep 25, 2025 at 09:16:35PM +0700, John Naylor wrote: > > (Separately, now I'm wondering if we can do the same for > > vector8_has_le since _mm_min_epu8 and vminvq_u8 both exist, and that > > would allow getting rid of ) > > I think so. I doubt there's any performance advantage, but it could be > nice for code cleanup. (I'm assuming you meant to say vector8_ssub > (renamed to vector8_ussub() in the patch) after "getting rid of.") Yes right, sorry. And it seems good to do such cleanup first, since it doesn't make sense to rename something that is about to be deleted. > I think I disagree on this one. We're not broadcasting arbitrary bytes for > every vector element, we're broadcasting a patten of bytes that happens to > be wider than the element size. I would expect this to be a relatively > common use-case. That's probably true. I'm still worried that the hack for working around compiler pickiness (while nice enough in it's current form) might break at some point and require awareness of compiler versions. Hmm, for this case, we can sidestep the maintainability questions entirely by instead using the new interleave functions to build the masks: vector8_interleave_low(vector8_zero(), vector8_broadcast(0x0f)) vector8_interleave_low(vector8_broadcast(0x0f), vector8_zero()) This generates identical code as v12 on Arm and is not bad on x86. What do you think of the attached? While looking around again, it looks like the "msk" variable isn't a mask like the implies to me. Not sure of a better name because I'm not sure what it represents aside from a temp variable. +#elif defined(USE_NEON) + switch (i) + { + case 4: + return (Vector8) vshrq_n_u32((Vector32) v1, 4); + case 8: + return (Vector8) vshrq_n_u32((Vector32) v1, 8); + default: + pg_unreachable(); + return vector8_broadcast(0); + } This is just a compiler hint, so if the input is not handled I think it will return the wrong answer rather than alerting the developer, so we probabaly want "Assert(false)" here. Other than that, the pack/unpack functions could use some documentation about which parameter is low/high. -- John Naylor Amazon Web Services
Вложения
В списке pgsql-hackers по дате отправления: