Re: Proposal for enabling auto-vectorization for checksum calculations
От | John Naylor |
---|---|
Тема | Re: Proposal for enabling auto-vectorization for checksum calculations |
Дата | |
Msg-id | CANWCAZa1b2rcvoK657SmcKwh2P2cgASQ1D-0JPj5d3LbfaAVgA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Proposal for enabling auto-vectorization for checksum calculations (Andrew Kim <tenistarkim@gmail.com>) |
Список | pgsql-hackers |
On Thu, Sep 25, 2025 at 4:50 AM Andrew Kim <tenistarkim@gmail.com> wrote: > > Thanks, John. I see the issue now — I’ll attach the entire patch > series in a single email so it shows up properly in the commitfest and > gets CI coverage. It's still picking up v4, and the archive link doesn't show any further replies. Something must have happened with the email threading, since you weren't on the thread at first. Please create an account and edit the entry to point to a more recent message ID: https://commitfest.postgresql.org/patch/5726/ > Please find attached v6 of the patchset, updated per your feedback. Thanks. (BTW, we discourage top-posting and prefer to cut to size and use inline responses) This is not a complete review, but some architectural thoughts and some things I've noticed. The top of the checksum_impl.h has this: * This file exists for the benefit of external programs that may wish to * check Postgres page checksums. They can #include this to get the code * referenced by storage/checksum.h. (Note: you may need to redefine * Assert() as empty to compile this successfully externally.) It's going to be a bit tricky to preserve this ability while allowing the core server and client programs to dispatch to a specialized implementation, but we should at least try. That means keeping pg_checksum_block() and pg_checksum_page() where they live now. I think a good first refactoring patch would be to move src/backend/storage/checksum.c (which your patch doesn't even touch) to src/port (and src/include/storage/checksum.h to src/include/port) and have all callers use that. With that, I imagine only that checksum.c file would include checksum_impl.h. If that poses a problem, let us know -- we may have to further juggle things. If that works without issue, we can proceed with the specialization. On that, just a few things to note here, although the next patch doesn't need to worry about any of this yet: + #if defined(__has_attribute) && __has_attribute (target) + __attribute__((target("avx2"))) + #endif + static int avx2_test(void) + { + const char buf@<:@sizeof(__m256i)@:>@; + __m256i accum = _mm256_loadu_si256((const __m256i *) buf); + accum = _mm256_add_epi32(accum, accum); + int result = _mm256_extract_epi32(accum, 0); + return (int) result; + }], If we're just testing if the target works, we can just use an empty function, right? +#define PG_DECLARE_CHECKSUM_ISA(ISANAME) \ +uint32 \ +pg_checksum_block_##ISANAME(const PGChecksummablePage *page); + +#define PG_DEFINE_CHECKSUM_ISA(ISANAME) \ +pg_attribute_target(#ISANAME) \ +uint32 pg_checksum_block_##ISANAME(const PGChecksummablePage *page) \ I find this hard to read compared to just using the actual name. +avx2_available(void) +{ +#if defined (USE_AVX2_WITH_RUNTIME_CHECK) && defined(__x86_64__) Why guard on __x86_64__? +PG_DEFINE_CHECKSUM_ISA(default) +{ + uint32 sums[N_SUMS], result = 0; + uint32 i, j; [...] +#ifdef USE_AVX2_WITH_RUNTIME_CHECK +PG_DEFINE_CHECKSUM_ISA(avx2) +{ + uint32 sums[N_SUMS], result = 0; + uint32 i, j; [...] With the single src/port file idea above, these would just do "return pg_checksum_block()" (or pg_checksum_page, whichever makes more sense). + if (avx2_available()) + { + /* optional: patch pointer so next call goes directly */ + pg_checksum_block = pg_checksum_block_avx2; + return pg_checksum_block_avx2(page); + } Not sure what your referring to here by "patching" the pointer, but it sounds dangerous. Besides, the cost of indirection is basically zero for multi-kilobyte inputs, so there is not even any motivation to consider doing differently. -- John Naylor Amazon Web Services
В списке pgsql-hackers по дате отправления: