Re: [PATCH] Hex-coding optimizations using SVE on ARM.
От | Nathan Bossart |
---|---|
Тема | Re: [PATCH] Hex-coding optimizations using SVE on ARM. |
Дата | |
Msg-id | aNRlYQoxTe8lOIEc@nathan обсуждение исходный текст |
Ответ на | Re: [PATCH] Hex-coding optimizations using SVE on ARM. (John Naylor <johncnaylorls@gmail.com>) |
Ответы |
Re: [PATCH] Hex-coding optimizations using SVE on ARM.
|
Список | pgsql-hackers |
On Wed, Sep 24, 2025 at 10:59:38AM +0700, John Naylor wrote: > + if (unlikely(!hex_decode_simd_helper(srcv, &dstv1))) > + break; > > But if you really want to do something here, sprinkling "(un)likely"'s > here seems like solving the wrong problem (even if they make any > difference), since the early return is optimizing for exceptional > conditions. In other places (cf. the UTF8 string verifier), we > accumulate errors, and only if we have them at the end do we restart > from the beginning with the slow error-checking path that can show the > user the offending input. I switched to an accumulator approach in v11. > +vector8_sssub(const Vector8 v1, const Vector8 v2) > > It's hard to parse "sss", so maybe we can borrow an Intel-ism and use > "iss" for the signed case? Done. > +/* vector manipulation */ > +#ifndef USE_NO_SIMD > +static inline Vector8 vector8_interleave_low(const Vector8 v1, const > Vector8 v2); > +static inline Vector8 vector8_interleave_high(const Vector8 v1, const > Vector8 v2); > +static inline Vector8 vector8_pack_16(const Vector8 v1, const Vector8 v2); > +static inline Vector32 vector32_shift_left_nibble(const Vector32 v1); > +static inline Vector32 vector32_shift_right_nibble(const Vector32 v1); > +static inline Vector32 vector32_shift_right_byte(const Vector32 v1); > > Do we need declarations for these? I recall that the existing > declarations are there for functions that are also used internally. Removed. > The nibble/byte things are rather specific. Wouldn't it be more > logical to expose the already-generic shift operations and let the > caller say by how much? Or does the compiler refuse because the > intrinsic doesn't get an immediate value? Some are like that, but I'm > not sure about these. If so, that's annoying and I wonder if there's a > workaround. Yeah, the compiler refuses unless the value is an integer literal. I thought of using a switch statement to cover all the values used in-tree, but I didn't like that, either. > +vector8_has_ge(const Vector8 v, const uint8 c) > +{ > +#ifdef USE_SSE2 > + Vector8 umax = _mm_max_epu8(v, vector8_broadcast(c)); > + Vector8 cmpe = _mm_cmpeq_epi8(umax, v); > + > + return vector8_is_highbit_set(cmpe); > > We take pains to avoid signed comparison on unsigned input for the > "le" case, and I don't see why it's okay here. _mm_max_epu8() does unsigned comparisons, I think... > Do the regression tests have long enough cases that test exceptional > paths, like invalid bytes and embedded whitespace? If not, we need > some. Added. I've also fixed builds on gcc/arm64, as discussed elsewhere [0]. Here are the current numbers on my laptop: arm buf | HEAD | patch | % diff -------+-------+-------+-------- 2 | 4 | 4 | 0 4 | 6 | 6 | 0 8 | 8 | 8 | 0 16 | 11 | 12 | -9 32 | 18 | 5 | 72 64 | 38 | 6 | 84 256 | 134 | 17 | 87 1024 | 513 | 63 | 88 4096 | 2081 | 262 | 87 16384 | 8524 | 1058 | 88 65536 | 34731 | 4224 | 88 [0] https://postgr.es/m/aNQtN89VW8z-yo3B%40nathan -- nathan
Вложения
В списке pgsql-hackers по дате отправления: