Re: [PATCH] Hex-coding optimizations using SVE on ARM.
От | Nathan Bossart |
---|---|
Тема | Re: [PATCH] Hex-coding optimizations using SVE on ARM. |
Дата | |
Msg-id | aN63ZkWUEOsn6c01@nathan обсуждение исходный текст |
Ответ на | Re: [PATCH] Hex-coding optimizations using SVE on ARM. (John Naylor <johncnaylorls@gmail.com>) |
Список | pgsql-hackers |
On Mon, Sep 29, 2025 at 03:45:27PM +0700, John Naylor wrote: > On Fri, Sep 26, 2025 at 1:50 AM Nathan Bossart <nathandbossart@gmail.com> wrote: >> On Thu, Sep 25, 2025 at 09:16:35PM +0700, John Naylor wrote: >> > (Separately, now I'm wondering if we can do the same for >> > vector8_has_le since _mm_min_epu8 and vminvq_u8 both exist, and that >> > would allow getting rid of ) >> >> I think so. I doubt there's any performance advantage, but it could be >> nice for code cleanup. (I'm assuming you meant to say vector8_ssub >> (renamed to vector8_ussub() in the patch) after "getting rid of.") > > Yes right, sorry. And it seems good to do such cleanup first, since it > doesn't make sense to rename something that is about to be deleted. Will do. I'll plan on committing the other patch [0] soon. > Hmm, for this case, we can sidestep the maintainability questions > entirely by instead using the new interleave functions to build the > masks: > > vector8_interleave_low(vector8_zero(), vector8_broadcast(0x0f)) > vector8_interleave_low(vector8_broadcast(0x0f), vector8_zero()) > > This generates identical code as v12 on Arm and is not bad on x86. > What do you think of the attached? WFM > While looking around again, it looks like the "msk" variable isn't a > mask like the implies to me. Not sure of a better name because I'm not > sure what it represents aside from a temp variable. Renamed to "tmp". > +#elif defined(USE_NEON) > + switch (i) > + { > + case 4: > + return (Vector8) vshrq_n_u32((Vector32) v1, 4); > + case 8: > + return (Vector8) vshrq_n_u32((Vector32) v1, 8); > + default: > + pg_unreachable(); > + return vector8_broadcast(0); > + } > > This is just a compiler hint, so if the input is not handled I think > it will return the wrong answer rather than alerting the developer, so > we probabaly want "Assert(false)" here. Fixed. > Other than that, the pack/unpack functions could use some > documentation about which parameter is low/high. Added. [0] https://postgr.es/m/attachment/182185/v3-0001-Optimize-vector8_has_le-on-AArch64.patch -- nathan
Вложения
В списке pgsql-hackers по дате отправления: