Обсуждение: Proposal for enabling auto-vectorization for checksum calculations

Поиск
Список
Период
Сортировка

Proposal for enabling auto-vectorization for checksum calculations

От
Matthew Sterrett
Дата:
Hello,
This patch enables more compiler autovectorization for the checksum
calculations.
This code is particularly well suited for autovectorization, so just
adding pg_attribute_target and some simple dynamic dispatch logic we
can get improved vectorization.
This gives about a 2x speedup in a synthetic benchmark for
pg_checksum, which is also included as a seperate patch file.

Additionally, another 2x performance increase in the synthetic
benchmark with AVX2 can be obtained if N_SUMS was changed to 64.
However, this would change the results of the checksum. This isn't
included in this patch, but I think it is worth considering for the
future

One additional factor, without explicitly passing some optimization
flag like -O2 the makefile build won't autovectorize any of the code.
However, the meson based build does this automatically.

Вложения

Re: Proposal for enabling auto-vectorization for checksum calculations

От
Matthew Sterrett
Дата:
Hello! I'm still trying to figure out those CI failures, I just wanted
to update things.

From my testing, with this patch repeatedly disabling/enabling
checksums is about 12.4% on an approximately 15 GB database.

By the way, I'd love it if anyone could help me figure out how to
replicate a CI failure in the Cirrus CI.
I haven't been able to figure out how to test CI runs locally, does
anyone know a good method to do that?



Re: Proposal for enabling auto-vectorization for checksum calculations

От
Stepan Neretin
Дата:


On Thu, May 8, 2025 at 6:57 AM Matthew Sterrett <matthewsterrett2@gmail.com> wrote:
Hello! I'm still trying to figure out those CI failures, I just wanted
to update things.

From my testing, with this patch repeatedly disabling/enabling
checksums is about 12.4% on an approximately 15 GB database.

By the way, I'd love it if anyone could help me figure out how to
replicate a CI failure in the Cirrus CI.
I haven't been able to figure out how to test CI runs locally, does
anyone know a good method to do that?



Hi Matthew,

Thanks for the patch!

I ran some timing tests:

(without avx2)

Time: 4034.351 ms
SELECT drive_pg_checksum(512);

(with avx2)

Time: 3559.076 ms
SELECT drive_pg_checksum(512);

Also attached two patches that should fix the CI issues.

Best,

Stepan Neretin


 

Re: Proposal for enabling auto-vectorization for checksum calculations

От
Matthew Sterrett
Дата:
Hello! Thanks for helping me with this.
I'm still trying to figure out what is going on with the Bookworm test
failures. I'm pretty sure this patchset should resolve all the issues
with the macOS build, but I don't think it will help the linux
failures unfortunately.

On Sat, May 10, 2025 at 4:02 AM Stepan Neretin <slpmcf@gmail.com> wrote:
>
>
>
> On Sat, May 10, 2025 at 6:01 PM Stepan Neretin <slpmcf@gmail.com> wrote:
>>
>>
>>
>> On Thu, May 8, 2025 at 6:57 AM Matthew Sterrett <matthewsterrett2@gmail.com> wrote:
>>>
>>> Hello! I'm still trying to figure out those CI failures, I just wanted
>>> to update things.
>>>
>>> From my testing, with this patch repeatedly disabling/enabling
>>> checksums is about 12.4% on an approximately 15 GB database.
>>>
>>> By the way, I'd love it if anyone could help me figure out how to
>>> replicate a CI failure in the Cirrus CI.
>>> I haven't been able to figure out how to test CI runs locally, does
>>> anyone know a good method to do that?
>>>
>>>
>>
>> Hi Matthew,
>>
>> Thanks for the patch!
>>
>> I ran some timing tests:
>>
>> (without avx2)
>>
>> Time: 4034.351 ms
>> SELECT drive_pg_checksum(512);
>>
>> (with avx2)
>>
>> Time: 3559.076 ms
>> SELECT drive_pg_checksum(512);
>>
>> Also attached two patches that should fix the CI issues.
>>
>> Best,
>>
>> Stepan Neretin
>>
>>
>>
>
> Oops, forgot to attach patches :)
>
> Best,
>
> Stepan Neretin
>
>

Вложения

Re: Proposal for enabling auto-vectorization for checksum calculations

От
Nazir Bilal Yavuz
Дата:
Hi,

On Tue, 20 May 2025 at 02:54, Matthew Sterrett
<matthewsterrett2@gmail.com> wrote:
>
> Hello! Thanks for helping me with this.
> I'm still trying to figure out what is going on with the Bookworm test
> failures. I'm pretty sure this patchset should resolve all the issues
> with the macOS build, but I don't think it will help the linux
> failures unfortunately.

You can see the failure at the artifacts ->
'log/tmp_install/log/install.log' file on the CI web page [1].

If you want to replicate that on your local:

$ ./configure --with-llvm CLANG="ccache clang-16"
$ make -s -j8 world-bin
$ make -j8 check-world

should be enough. I was able to replicate it with these commands. I
hope these help.

[1] https://cirrus-ci.com/task/4834162550505472

--
Regards,
Nazir Bilal Yavuz
Microsoft



Re: Proposal for enabling auto-vectorization for checksum calculations

От
Matthew Sterrett
Дата:
> You can see the failure at the artifacts ->
> 'log/tmp_install/log/install.log' file on the CI web page [1].
>
> If you want to replicate that on your local:
>
> $ ./configure --with-llvm CLANG="ccache clang-16"
> $ make -s -j8 world-bin
> $ make -j8 check-world
>
> should be enough. I was able to replicate it with these commands. I
> hope these help.
Thanks so much for helping me figure this out!

Okay, I've determined that versions of LLVM/Clang before 19 crash when
compiling this patch for some reason; it seems that both make
check-world and make install will crash with the affected LLVM
versions.
Unfortunately, what matters seems to be the version of the linker/LTO
optimizer, which I don't think we can check at compile time.
I added a check for Clang>=19 which works at preventing the crash on my system.
I think it's possible some unusual combination of clang/LLVM might
still crash during the build, but I think this is a reasonable
solution

Вложения

Re: Proposal for enabling auto-vectorization for checksum calculations

От
John Naylor
Дата:
On Fri, May 23, 2025 at 4:54 AM Matthew Sterrett
<matthewsterrett2@gmail.com> wrote:
> Okay, I've determined that versions of LLVM/Clang before 19 crash when
> compiling this patch for some reason; it seems that both make
> check-world and make install will crash with the affected LLVM
> versions.
> Unfortunately, what matters seems to be the version of the linker/LTO
> optimizer, which I don't think we can check at compile time.
> I added a check for Clang>=19 which works at preventing the crash on my system.
> I think it's possible some unusual combination of clang/LLVM might
> still crash during the build, but I think this is a reasonable
> solution

I don't know if this is related to the crashes, but it doesn't seem
like a good idea to #include the function pointer stuff everywhere,
that should probably go into src/port like the others.

--
John Naylor
Amazon Web Services



Re: Proposal for enabling auto-vectorization for checksum calculations

От
root
Дата:
Hi John,

Thanks for the feedback. This is v5 of the patchset, updated following your comments:

- Moved the function pointer definitions out of common headers and
  into src/port, consistent with existing practice.

Thanks again for the guidance.

Best regards,
Kim Andrew



[PATCH 1/2] Enable autovectorizing pg_checksum_block

От
root
Дата:
From: Andrew Kim <andrew.kim@intel.com>

---
 config/c-compiler.m4                |  31 +++++
 configure                           |  52 +++++++++
 configure.ac                        |   9 ++
 meson.build                         |  28 +++++
 src/include/pg_config.h.in          |   3 +
 src/include/storage/checksum_impl.h |  90 +++-----------
 src/port/Makefile                   |   1 +
 src/port/meson.build                |   1 +
 src/port/pg_checksum_dispatch.c     | 174 ++++++++++++++++++++++++++++
 9 files changed, 318 insertions(+), 71 deletions(-)
 create mode 100644 src/port/pg_checksum_dispatch.c

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index da40bd6a647..5eb3218deb5 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -711,6 +711,37 @@ fi
 undefine([Ac_cachevar])dnl
 ])# PGAC_XSAVE_INTRINSICS
 
+# PGAC_AVX2_SUPPORT
+# -----------------------------
+# Check if the compiler supports AVX2 in attribute((target))
+# and using AVX2 intrinsics in those functions
+#
+# If the intrinsics are supported, sets pgac_avx2_support.
+AC_DEFUN([PGAC_AVX2_SUPPORT],
+[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx2_support])])dnl
+AC_CACHE_CHECK([for AVX2 support], [Ac_cachevar],
+[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <immintrin.h>
+    #include <stdint.h>
+    #if defined(__has_attribute) && __has_attribute (target)
+    __attribute__((target("avx2")))
+    #endif
+    static int avx2_test(void)
+    {
+      const char buf@<:@sizeof(__m256i)@:>@;
+      __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+      accum = _mm256_add_epi32(accum, accum);
+      int result = _mm256_extract_epi32(accum, 0);
+      return (int) result;
+    }],
+  [return avx2_test();])],
+  [Ac_cachevar=yes],
+  [Ac_cachevar=no])])
+if test x"$Ac_cachevar" = x"yes"; then
+  pgac_avx2_support=yes
+fi
+undefine([Ac_cachevar])dnl
+])# PGAC_AVX2_SUPPORT
+
 # PGAC_AVX512_POPCNT_INTRINSICS
 # -----------------------------
 # Check if the compiler supports the AVX-512 popcount instructions using the
diff --git a/configure b/configure
index 39c68161cec..54da05ac0db 100755
--- a/configure
+++ b/configure
@@ -17608,6 +17608,58 @@ $as_echo "#define HAVE_XSAVE_INTRINSICS 1" >>confdefs.h
 
 fi
 
+# Check for AVX2 target and intrinsic support
+#
+if test x"$host_cpu" = x"x86_64"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for AVX2 support" >&5
+$as_echo_n "checking for AVX2 support... " >&6; }
+if ${pgac_cv_avx2_support+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <immintrin.h>
+    #include <stdint.h>
+    #if defined(__has_attribute) && __has_attribute (target)
+    __attribute__((target("avx2")))
+    #endif
+    static int avx2_test(void)
+    {
+      const char buf[sizeof(__m256i)];
+      __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+      accum = _mm256_add_epi32(accum, accum);
+      int result = _mm256_extract_epi32(accum, 0);
+      return (int) result;
+    }
+int
+main ()
+{
+return avx2_test();
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pgac_cv_avx2_support=yes
+else
+  pgac_cv_avx2_support=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_avx2_support" >&5
+$as_echo "$pgac_cv_avx2_support" >&6; }
+if test x"$pgac_cv_avx2_support" = x"yes"; then
+  pgac_avx2_support=yes
+fi
+
+  if test x"$pgac_avx2_support" = x"yes"; then
+
+$as_echo "#define USE_AVX2_WITH_RUNTIME_CHECK 1" >>confdefs.h
+
+  fi
+fi
+
 # Check for AVX-512 popcount intrinsics
 #
 if test x"$host_cpu" = x"x86_64"; then
diff --git a/configure.ac b/configure.ac
index 066e3976c0a..2c484a12671 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2118,6 +2118,15 @@ if test x"$pgac_xsave_intrinsics" = x"yes"; then
   AC_DEFINE(HAVE_XSAVE_INTRINSICS, 1, [Define to 1 if you have XSAVE intrinsics.])
 fi
 
+# Check for AVX2 target and intrinsic support
+#
+if test x"$host_cpu" = x"x86_64"; then
+  PGAC_AVX2_SUPPORT()
+  if test x"$pgac_avx2_support" = x"yes"; then
+    AC_DEFINE(USE_AVX2_WITH_RUNTIME_CHECK, 1, [Define to 1 to use AVX2 instructions with a runtime check.])
+  fi
+fi
+
 # Check for AVX-512 popcount intrinsics
 #
 if test x"$host_cpu" = x"x86_64"; then
diff --git a/meson.build b/meson.build
index ab8101d67b2..ff42c41ca7e 100644
--- a/meson.build
+++ b/meson.build
@@ -2289,6 +2289,34 @@ int main(void)
 
 endif
 
+###############################################################
+# Check for the availability of AVX2 support
+###############################################################
+
+if host_cpu == 'x86_64'
+
+  prog = '''
+#include <immintrin.h>
+#include <stdint.h>
+#if defined(__has_attribute) && __has_attribute (target)
+__attribute__((target("avx2")))
+#endif
+int main(void)
+{
+  const char buf[sizeof(__m256i)];
+  __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+  accum = _mm256_add_epi32(accum, accum);
+  int result = _mm256_extract_epi32(accum, 0);
+  return (int) result;
+}
+'''
+
+  if cc.links(prog, name: 'AVX2 support', args: test_c_args)
+    cdata.set('USE_AVX2_WITH_RUNTIME_CHECK', 1)
+  endif
+
+endif
+
 
 ###############################################################
 # Check for the availability of AVX-512 popcount intrinsics.
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index c4dc5d72bdb..987f9b5c77c 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -675,6 +675,9 @@
 /* Define to 1 to use AVX-512 CRC algorithms with a runtime check. */
 #undef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
 
+/* Define to 1 to use AVX2 instructions with a runtime check. */
+#undef USE_AVX2_WITH_RUNTIME_CHECK
+
 /* Define to 1 to use AVX-512 popcount instructions with a runtime check. */
 #undef USE_AVX512_POPCNT_WITH_RUNTIME_CHECK
 
diff --git a/src/include/storage/checksum_impl.h b/src/include/storage/checksum_impl.h
index da87d61ba52..82e525529f4 100644
--- a/src/include/storage/checksum_impl.h
+++ b/src/include/storage/checksum_impl.h
@@ -101,12 +101,14 @@
  */
 
 #include "storage/bufpage.h"
+#include "pg_config.h"
 
 /* number of checksums to calculate in parallel */
 #define N_SUMS 32
 /* prime multiplier of FNV-1a hash */
 #define FNV_PRIME 16777619
 
+
 /* Use a union so that this code is valid under strict aliasing */
 typedef union
 {
@@ -142,74 +144,20 @@ do { \
  * Block checksum algorithm.  The page must be adequately aligned
  * (at least on 4-byte boundary).
  */
-static uint32
-pg_checksum_block(const PGChecksummablePage *page)
-{
-    uint32        sums[N_SUMS];
-    uint32        result = 0;
-    uint32        i,
-                j;
-
-    /* ensure that the size is compatible with the algorithm */
-    Assert(sizeof(PGChecksummablePage) == BLCKSZ);
-
-    /* initialize partial checksums to their corresponding offsets */
-    memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
-
-    /* main checksum calculation */
-    for (i = 0; i < (uint32) (BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
-        for (j = 0; j < N_SUMS; j++)
-            CHECKSUM_COMP(sums[j], page->data[i][j]);
-
-    /* finally add in two rounds of zeroes for additional mixing */
-    for (i = 0; i < 2; i++)
-        for (j = 0; j < N_SUMS; j++)
-            CHECKSUM_COMP(sums[j], 0);
-
-    /* xor fold partial checksums together */
-    for (i = 0; i < N_SUMS; i++)
-        result ^= sums[i];
-
-    return result;
-}
-
-/*
- * Compute the checksum for a Postgres page.
- *
- * The page must be adequately aligned (at least on a 4-byte boundary).
- * Beware also that the checksum field of the page is transiently zeroed.
- *
- * The checksum includes the block number (to detect the case where a page is
- * somehow moved to a different location), the page header (excluding the
- * checksum itself), and the page data.
- */
-uint16
-pg_checksum_page(char *page, BlockNumber blkno)
-{
-    PGChecksummablePage *cpage = (PGChecksummablePage *) page;
-    uint16        save_checksum;
-    uint32        checksum;
-
-    /* We only calculate the checksum for properly-initialized pages */
-    Assert(!PageIsNew((Page) page));
-
-    /*
-     * Save pd_checksum and temporarily set it to zero, so that the checksum
-     * calculation isn't affected by the old checksum stored on the page.
-     * Restore it after, because actually updating the checksum is NOT part of
-     * the API of this function.
-     */
-    save_checksum = cpage->phdr.pd_checksum;
-    cpage->phdr.pd_checksum = 0;
-    checksum = pg_checksum_block(cpage);
-    cpage->phdr.pd_checksum = save_checksum;
-
-    /* Mix in the block number to detect transposed pages */
-    checksum ^= blkno;
-
-    /*
-     * Reduce to a uint16 (to fit in the pd_checksum field) with an offset of
-     * one. That avoids checksums of zero, which seems like a good idea.
-     */
-    return (uint16) ((checksum % 65535) + 1);
-}
+#define PG_DECLARE_CHECKSUM_ISA(ISANAME) \
+uint32 \
+pg_checksum_block_##ISANAME(const PGChecksummablePage *page);
+
+#define PG_DEFINE_CHECKSUM_ISA(ISANAME) \
+pg_attribute_target(#ISANAME) \
+uint32 pg_checksum_block_##ISANAME(const PGChecksummablePage *page) \
+
+/* Declare ISA implementations (declarations only in header) */
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+PG_DECLARE_CHECKSUM_ISA(avx2);
+#endif
+PG_DECLARE_CHECKSUM_ISA(default);
+
+uint32 pg_checksum_block_dispatch(const PGChecksummablePage *page);
+extern uint32 (*pg_checksum_block)(const PGChecksummablePage *page);
+extern uint16 pg_checksum_page(char *page, BlockNumber blkno);
diff --git a/src/port/Makefile b/src/port/Makefile
index 4274949dfa4..27423f1058b 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -48,6 +48,7 @@ OBJS = \
     pg_numa.o \
     pg_popcount_aarch64.o \
     pg_popcount_avx512.o \
+    pg_checksum_dispatch.o \
     pg_strong_random.o \
     pgcheckdir.o \
     pgmkdirp.o \
diff --git a/src/port/meson.build b/src/port/meson.build
index fc7b059fee5..c4bbe9f2ece 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -11,6 +11,7 @@ pgport_sources = [
   'pg_numa.c',
   'pg_popcount_aarch64.c',
   'pg_popcount_avx512.c',
+  'pg_checksum_dispatch.c',
   'pg_strong_random.c',
   'pgcheckdir.c',
   'pgmkdirp.c',
diff --git a/src/port/pg_checksum_dispatch.c b/src/port/pg_checksum_dispatch.c
new file mode 100644
index 00000000000..15f7b8af34f
--- /dev/null
+++ b/src/port/pg_checksum_dispatch.c
@@ -0,0 +1,174 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_checksum_dispatch.c
+ *        Holds the AVX2 pg_popcount() implementation.
+ *
+ * Copyright (c) 2024-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *        src/port/pg_checksum_dispatch.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "c.h"
+#include "storage/checksum_impl.h"
+
+#if defined(HAVE__GET_CPUID) || defined(HAVE__GET_CPUID_COUNT)
+#include <cpuid.h>
+#endif
+
+#ifdef HAVE_XSAVE_INTRINSICS
+#include <immintrin.h>
+#endif
+
+#if defined(HAVE__CPUID) || defined(HAVE__CPUIDEX)
+#include <intrin.h>
+#endif
+
+#include "port/pg_bitutils.h"
+
+/*
+ * Does CPUID say there's support for XSAVE instructions?
+ */
+static inline bool
+xsave_available(void)
+{
+    unsigned int exx[4] = {0, 0, 0, 0};
+
+#if defined(HAVE__GET_CPUID)
+    __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUID)
+    __cpuid(exx, 1);
+#elif defined(__x86_64__)
+#error cpuid instruction not available
+#endif
+    return (exx[2] & (1 << 27)) != 0;       /* osxsave */
+}
+
+/*
+ * Does XGETBV say the YMM registers are enabled?
+ *
+ * NB: Caller is responsible for verifying that xsave_available() returns true
+ * before calling this.
+ */
+#ifdef HAVE_XSAVE_INTRINSICS
+pg_attribute_target("xsave")
+#endif
+static inline bool
+ymm_regs_available(void)
+{
+#ifdef HAVE_XSAVE_INTRINSICS
+    return (_xgetbv(0) & 0x06) == 0x06;
+#else
+    return false;
+#endif
+}
+
+/*
+ * Does CPUID say there's support for AVX-2
+ */
+static inline bool
+avx2_available(void)
+{
+#if defined (USE_AVX2_WITH_RUNTIME_CHECK) && defined(__x86_64__)
+    unsigned int exx[4] = {0, 0, 0, 0};
+    if (!xsave_available() || !ymm_regs_available()) return false;
+
+#if defined(HAVE__GET_CPUID_COUNT)
+    __get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUIDEX)
+    __cpuidex(exx, 7, 0);
+#else
+#error cpuid instruction not available
+#endif
+    return (exx[1] & (1 << 5)) != 0; /* avx2 */
+#else
+    return false;
+#endif
+}
+
+
+/* default checksum implementation */
+PG_DEFINE_CHECKSUM_ISA(default)
+{
+    uint32 sums[N_SUMS], result = 0;
+    uint32 i, j;
+
+    Assert(sizeof(PGChecksummablePage) == BLCKSZ);
+    memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
+
+    for (i = 0; i < (uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
+        for (j = 0; j < N_SUMS; j++)
+            CHECKSUM_COMP(sums[j], page->data[i][j]);
+
+    for (i = 0; i < 2; i++)
+        for (j = 0; j < N_SUMS; j++)
+            CHECKSUM_COMP(sums[j], 0);
+
+    for (i = 0; i < N_SUMS; i++)
+        result ^= sums[i];
+
+    return result;
+}
+
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+PG_DEFINE_CHECKSUM_ISA(avx2)
+{
+    uint32 sums[N_SUMS], result = 0;
+    uint32 i, j;
+
+    Assert(sizeof(PGChecksummablePage) == BLCKSZ);
+    memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
+
+    for (i = 0; i < (uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
+        for (j = 0; j < N_SUMS; j++)
+            CHECKSUM_COMP(sums[j], page->data[i][j]);
+
+    for (i = 0; i < 2; i++)
+        for (j = 0; j < N_SUMS; j++)
+            CHECKSUM_COMP(sums[j], 0);
+
+    for (i = 0; i < N_SUMS; i++)
+        result ^= sums[i];
+
+    return result;
+}
+#endif
+
+/* Function pointer - external linkage (declared extern in header) */
+uint32 (*pg_checksum_block)(const PGChecksummablePage *page) = pg_checksum_block_dispatch;
+
+/* Dispatch function: simple, safe */
+uint32 pg_checksum_block_dispatch(const PGChecksummablePage *page)
+{
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+    if (avx2_available())
+    {
+        /* optional: patch pointer so next call goes directly */
+        pg_checksum_block = pg_checksum_block_avx2;
+        return pg_checksum_block_avx2(page);
+    }
+#endif
+    /* fallback */
+    pg_checksum_block = pg_checksum_block_default;
+    return pg_checksum_block_default(page);
+}
+
+
+/* Compute checksum for a Postgres page */
+uint16 pg_checksum_page(char *page, BlockNumber blkno)
+{
+    PGChecksummablePage *cpage = (PGChecksummablePage *) page;
+    uint16 save_checksum;
+    uint32 checksum;
+
+    Assert(!PageIsNew((Page) page));
+
+    save_checksum = cpage->phdr.pd_checksum;
+    cpage->phdr.pd_checksum = 0;
+    checksum = pg_checksum_block(cpage);
+    cpage->phdr.pd_checksum = save_checksum;
+
+    checksum ^= blkno;
+    return (uint16)((checksum % 65535) + 1);
+}
-- 
2.43.0




[PATCH 2/2] Benchmark code for postgres checksums

От
root
Дата:
From: Andrew kim <andrew.kim@intel.com>

---
 contrib/meson.build                           |  1 +
 contrib/pg_checksum_bench/meson.build         | 23 +++++++++++++
 .../pg_checksum_bench--1.0.sql                |  8 +++++
 contrib/pg_checksum_bench/pg_checksum_bench.c | 34 +++++++++++++++++++
 .../pg_checksum_bench.control                 |  4 +++
 .../sql/pg_checksum_bench.sql                 | 17 ++++++++++
 6 files changed, 87 insertions(+)
 create mode 100644 contrib/pg_checksum_bench/meson.build
 create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
 create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench.c
 create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench.control
 create mode 100644 contrib/pg_checksum_bench/sql/pg_checksum_bench.sql

diff --git a/contrib/meson.build b/contrib/meson.build
index ed30ee7d639..fe5149aadff 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
   'install_dir': contrib_doc_dir,
 }
 
+subdir('pg_checksum_bench')
 subdir('amcheck')
 subdir('auth_delay')
 subdir('auto_explain')
diff --git a/contrib/pg_checksum_bench/meson.build b/contrib/pg_checksum_bench/meson.build
new file mode 100644
index 00000000000..32ccd9efa0f
--- /dev/null
+++ b/contrib/pg_checksum_bench/meson.build
@@ -0,0 +1,23 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+pg_checksum_bench_sources = files(
+  'pg_checksum_bench.c',
+)
+
+if host_system == 'windows'
+  pg_checksum_bench_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'pg_checksum_bench',
+    '--FILEDESC', 'pg_checksum_bench',])
+endif
+
+pg_checksum_bench = shared_module('pg_checksum_bench',
+  pg_checksum_bench_sources,
+  kwargs: contrib_mod_args,
+)
+contrib_targets += pg_checksum_bench
+
+install_data(
+  'pg_checksum_bench--1.0.sql',
+  'pg_checksum_bench.control',
+  kwargs: contrib_data_args,
+)
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
b/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
new file mode 100644
index 00000000000..5f13cbe3c5e
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
@@ -0,0 +1,8 @@
+/* contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+-- \echo Use "CREATE EXTENSION pg_checksum_bench" to load this file. \quit
+
+CREATE FUNCTION drive_pg_checksum(page_count int)
+    RETURNS pg_catalog.void
+    AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench.c b/contrib/pg_checksum_bench/pg_checksum_bench.c
new file mode 100644
index 00000000000..f40f335ff59
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench.c
@@ -0,0 +1,34 @@
+#include "postgres.h"
+#include "fmgr.h"
+#include "storage/checksum_impl.h"
+
+#include <stdio.h>
+#include <assert.h>
+
+PG_MODULE_MAGIC;
+
+#define REPEATS 1000000
+
+PG_FUNCTION_INFO_V1(drive_pg_checksum);
+Datum
+drive_pg_checksum(PG_FUNCTION_ARGS)
+{
+    int page_count = PG_GETARG_INT32(0);
+
+    PGChecksummablePage * pages = palloc(page_count * sizeof(PGChecksummablePage));
+    srand(0);
+    for (size_t i = 0; i < page_count * sizeof(PGChecksummablePage); i++){
+        char * byte_ptr = (char *) pages;
+        byte_ptr[i] = rand() % 256;
+    }
+
+    for (int i = 0; i < REPEATS; i++){
+        const PGChecksummablePage * test_page = pages + (i % page_count);
+        volatile uint32 result = pg_checksum_block(test_page);
+        (void) result;
+    }
+
+    pfree((void *) pages);
+
+    PG_RETURN_VOID();
+}
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench.control b/contrib/pg_checksum_bench/pg_checksum_bench.control
new file mode 100644
index 00000000000..4a4e2c9363c
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench.control
@@ -0,0 +1,4 @@
+comment = 'pg_checksum benchmark'
+default_version = '1.0'
+module_pathname = '$libdir/pg_checksum_bench'
+relocatable = true
diff --git a/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql b/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql
new file mode 100644
index 00000000000..4b347699953
--- /dev/null
+++ b/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql
@@ -0,0 +1,17 @@
+CREATE EXTENSION pg_checksum_bench;
+
+SELECT drive_pg_checksum(-1);
+
+\timing on
+
+SELECT drive_pg_checksum(1);
+SELECT drive_pg_checksum(2);
+SELECT drive_pg_checksum(4);
+SELECT drive_pg_checksum(8);
+SELECT drive_pg_checksum(16);
+SELECT drive_pg_checksum(32);
+SELECT drive_pg_checksum(64);
+SELECT drive_pg_checksum(128);
+SELECT drive_pg_checksum(256);
+SELECT drive_pg_checksum(512);
+SELECT drive_pg_checksum(1024);
-- 
2.43.0




Re: Proposal for enabling auto-vectorization for checksum calculations

От
tenistarkim@gmail.com
Дата:
Hi John,

Thanks for the feedback. This is v5 of the patchset, updated following your comments:

- Moved the function pointer definitions out of common headers and
  into src/port, consistent with existing practice.

Thanks again for the guidance.

Best regards,
Kim Andrew



[PATCH 1/2] Enable autovectorizing pg_checksum_block

От
tenistarkim@gmail.com
Дата:
From: Andrew Kim <andrew.kim@intel.com>

---
 config/c-compiler.m4                |  31 +++++
 configure                           |  52 +++++++++
 configure.ac                        |   9 ++
 meson.build                         |  28 +++++
 src/include/pg_config.h.in          |   3 +
 src/include/storage/checksum_impl.h |  90 +++-----------
 src/port/Makefile                   |   1 +
 src/port/meson.build                |   1 +
 src/port/pg_checksum_dispatch.c     | 174 ++++++++++++++++++++++++++++
 9 files changed, 318 insertions(+), 71 deletions(-)
 create mode 100644 src/port/pg_checksum_dispatch.c

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index da40bd6a647..5eb3218deb5 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -711,6 +711,37 @@ fi
 undefine([Ac_cachevar])dnl
 ])# PGAC_XSAVE_INTRINSICS
 
+# PGAC_AVX2_SUPPORT
+# -----------------------------
+# Check if the compiler supports AVX2 in attribute((target))
+# and using AVX2 intrinsics in those functions
+#
+# If the intrinsics are supported, sets pgac_avx2_support.
+AC_DEFUN([PGAC_AVX2_SUPPORT],
+[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx2_support])])dnl
+AC_CACHE_CHECK([for AVX2 support], [Ac_cachevar],
+[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <immintrin.h>
+    #include <stdint.h>
+    #if defined(__has_attribute) && __has_attribute (target)
+    __attribute__((target("avx2")))
+    #endif
+    static int avx2_test(void)
+    {
+      const char buf@<:@sizeof(__m256i)@:>@;
+      __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+      accum = _mm256_add_epi32(accum, accum);
+      int result = _mm256_extract_epi32(accum, 0);
+      return (int) result;
+    }],
+  [return avx2_test();])],
+  [Ac_cachevar=yes],
+  [Ac_cachevar=no])])
+if test x"$Ac_cachevar" = x"yes"; then
+  pgac_avx2_support=yes
+fi
+undefine([Ac_cachevar])dnl
+])# PGAC_AVX2_SUPPORT
+
 # PGAC_AVX512_POPCNT_INTRINSICS
 # -----------------------------
 # Check if the compiler supports the AVX-512 popcount instructions using the
diff --git a/configure b/configure
index 39c68161cec..54da05ac0db 100755
--- a/configure
+++ b/configure
@@ -17608,6 +17608,58 @@ $as_echo "#define HAVE_XSAVE_INTRINSICS 1" >>confdefs.h
 
 fi
 
+# Check for AVX2 target and intrinsic support
+#
+if test x"$host_cpu" = x"x86_64"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for AVX2 support" >&5
+$as_echo_n "checking for AVX2 support... " >&6; }
+if ${pgac_cv_avx2_support+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <immintrin.h>
+    #include <stdint.h>
+    #if defined(__has_attribute) && __has_attribute (target)
+    __attribute__((target("avx2")))
+    #endif
+    static int avx2_test(void)
+    {
+      const char buf[sizeof(__m256i)];
+      __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+      accum = _mm256_add_epi32(accum, accum);
+      int result = _mm256_extract_epi32(accum, 0);
+      return (int) result;
+    }
+int
+main ()
+{
+return avx2_test();
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pgac_cv_avx2_support=yes
+else
+  pgac_cv_avx2_support=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_avx2_support" >&5
+$as_echo "$pgac_cv_avx2_support" >&6; }
+if test x"$pgac_cv_avx2_support" = x"yes"; then
+  pgac_avx2_support=yes
+fi
+
+  if test x"$pgac_avx2_support" = x"yes"; then
+
+$as_echo "#define USE_AVX2_WITH_RUNTIME_CHECK 1" >>confdefs.h
+
+  fi
+fi
+
 # Check for AVX-512 popcount intrinsics
 #
 if test x"$host_cpu" = x"x86_64"; then
diff --git a/configure.ac b/configure.ac
index 066e3976c0a..2c484a12671 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2118,6 +2118,15 @@ if test x"$pgac_xsave_intrinsics" = x"yes"; then
   AC_DEFINE(HAVE_XSAVE_INTRINSICS, 1, [Define to 1 if you have XSAVE intrinsics.])
 fi
 
+# Check for AVX2 target and intrinsic support
+#
+if test x"$host_cpu" = x"x86_64"; then
+  PGAC_AVX2_SUPPORT()
+  if test x"$pgac_avx2_support" = x"yes"; then
+    AC_DEFINE(USE_AVX2_WITH_RUNTIME_CHECK, 1, [Define to 1 to use AVX2 instructions with a runtime check.])
+  fi
+fi
+
 # Check for AVX-512 popcount intrinsics
 #
 if test x"$host_cpu" = x"x86_64"; then
diff --git a/meson.build b/meson.build
index ab8101d67b2..ff42c41ca7e 100644
--- a/meson.build
+++ b/meson.build
@@ -2289,6 +2289,34 @@ int main(void)
 
 endif
 
+###############################################################
+# Check for the availability of AVX2 support
+###############################################################
+
+if host_cpu == 'x86_64'
+
+  prog = '''
+#include <immintrin.h>
+#include <stdint.h>
+#if defined(__has_attribute) && __has_attribute (target)
+__attribute__((target("avx2")))
+#endif
+int main(void)
+{
+  const char buf[sizeof(__m256i)];
+  __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+  accum = _mm256_add_epi32(accum, accum);
+  int result = _mm256_extract_epi32(accum, 0);
+  return (int) result;
+}
+'''
+
+  if cc.links(prog, name: 'AVX2 support', args: test_c_args)
+    cdata.set('USE_AVX2_WITH_RUNTIME_CHECK', 1)
+  endif
+
+endif
+
 
 ###############################################################
 # Check for the availability of AVX-512 popcount intrinsics.
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index c4dc5d72bdb..987f9b5c77c 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -675,6 +675,9 @@
 /* Define to 1 to use AVX-512 CRC algorithms with a runtime check. */
 #undef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
 
+/* Define to 1 to use AVX2 instructions with a runtime check. */
+#undef USE_AVX2_WITH_RUNTIME_CHECK
+
 /* Define to 1 to use AVX-512 popcount instructions with a runtime check. */
 #undef USE_AVX512_POPCNT_WITH_RUNTIME_CHECK
 
diff --git a/src/include/storage/checksum_impl.h b/src/include/storage/checksum_impl.h
index da87d61ba52..82e525529f4 100644
--- a/src/include/storage/checksum_impl.h
+++ b/src/include/storage/checksum_impl.h
@@ -101,12 +101,14 @@
  */
 
 #include "storage/bufpage.h"
+#include "pg_config.h"
 
 /* number of checksums to calculate in parallel */
 #define N_SUMS 32
 /* prime multiplier of FNV-1a hash */
 #define FNV_PRIME 16777619
 
+
 /* Use a union so that this code is valid under strict aliasing */
 typedef union
 {
@@ -142,74 +144,20 @@ do { \
  * Block checksum algorithm.  The page must be adequately aligned
  * (at least on 4-byte boundary).
  */
-static uint32
-pg_checksum_block(const PGChecksummablePage *page)
-{
-    uint32        sums[N_SUMS];
-    uint32        result = 0;
-    uint32        i,
-                j;
-
-    /* ensure that the size is compatible with the algorithm */
-    Assert(sizeof(PGChecksummablePage) == BLCKSZ);
-
-    /* initialize partial checksums to their corresponding offsets */
-    memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
-
-    /* main checksum calculation */
-    for (i = 0; i < (uint32) (BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
-        for (j = 0; j < N_SUMS; j++)
-            CHECKSUM_COMP(sums[j], page->data[i][j]);
-
-    /* finally add in two rounds of zeroes for additional mixing */
-    for (i = 0; i < 2; i++)
-        for (j = 0; j < N_SUMS; j++)
-            CHECKSUM_COMP(sums[j], 0);
-
-    /* xor fold partial checksums together */
-    for (i = 0; i < N_SUMS; i++)
-        result ^= sums[i];
-
-    return result;
-}
-
-/*
- * Compute the checksum for a Postgres page.
- *
- * The page must be adequately aligned (at least on a 4-byte boundary).
- * Beware also that the checksum field of the page is transiently zeroed.
- *
- * The checksum includes the block number (to detect the case where a page is
- * somehow moved to a different location), the page header (excluding the
- * checksum itself), and the page data.
- */
-uint16
-pg_checksum_page(char *page, BlockNumber blkno)
-{
-    PGChecksummablePage *cpage = (PGChecksummablePage *) page;
-    uint16        save_checksum;
-    uint32        checksum;
-
-    /* We only calculate the checksum for properly-initialized pages */
-    Assert(!PageIsNew((Page) page));
-
-    /*
-     * Save pd_checksum and temporarily set it to zero, so that the checksum
-     * calculation isn't affected by the old checksum stored on the page.
-     * Restore it after, because actually updating the checksum is NOT part of
-     * the API of this function.
-     */
-    save_checksum = cpage->phdr.pd_checksum;
-    cpage->phdr.pd_checksum = 0;
-    checksum = pg_checksum_block(cpage);
-    cpage->phdr.pd_checksum = save_checksum;
-
-    /* Mix in the block number to detect transposed pages */
-    checksum ^= blkno;
-
-    /*
-     * Reduce to a uint16 (to fit in the pd_checksum field) with an offset of
-     * one. That avoids checksums of zero, which seems like a good idea.
-     */
-    return (uint16) ((checksum % 65535) + 1);
-}
+#define PG_DECLARE_CHECKSUM_ISA(ISANAME) \
+uint32 \
+pg_checksum_block_##ISANAME(const PGChecksummablePage *page);
+
+#define PG_DEFINE_CHECKSUM_ISA(ISANAME) \
+pg_attribute_target(#ISANAME) \
+uint32 pg_checksum_block_##ISANAME(const PGChecksummablePage *page) \
+
+/* Declare ISA implementations (declarations only in header) */
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+PG_DECLARE_CHECKSUM_ISA(avx2);
+#endif
+PG_DECLARE_CHECKSUM_ISA(default);
+
+uint32 pg_checksum_block_dispatch(const PGChecksummablePage *page);
+extern uint32 (*pg_checksum_block)(const PGChecksummablePage *page);
+extern uint16 pg_checksum_page(char *page, BlockNumber blkno);
diff --git a/src/port/Makefile b/src/port/Makefile
index 4274949dfa4..27423f1058b 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -48,6 +48,7 @@ OBJS = \
     pg_numa.o \
     pg_popcount_aarch64.o \
     pg_popcount_avx512.o \
+    pg_checksum_dispatch.o \
     pg_strong_random.o \
     pgcheckdir.o \
     pgmkdirp.o \
diff --git a/src/port/meson.build b/src/port/meson.build
index fc7b059fee5..c4bbe9f2ece 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -11,6 +11,7 @@ pgport_sources = [
   'pg_numa.c',
   'pg_popcount_aarch64.c',
   'pg_popcount_avx512.c',
+  'pg_checksum_dispatch.c',
   'pg_strong_random.c',
   'pgcheckdir.c',
   'pgmkdirp.c',
diff --git a/src/port/pg_checksum_dispatch.c b/src/port/pg_checksum_dispatch.c
new file mode 100644
index 00000000000..15f7b8af34f
--- /dev/null
+++ b/src/port/pg_checksum_dispatch.c
@@ -0,0 +1,174 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_checksum_dispatch.c
+ *        Holds the AVX2 pg_popcount() implementation.
+ *
+ * Copyright (c) 2024-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *        src/port/pg_checksum_dispatch.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "c.h"
+#include "storage/checksum_impl.h"
+
+#if defined(HAVE__GET_CPUID) || defined(HAVE__GET_CPUID_COUNT)
+#include <cpuid.h>
+#endif
+
+#ifdef HAVE_XSAVE_INTRINSICS
+#include <immintrin.h>
+#endif
+
+#if defined(HAVE__CPUID) || defined(HAVE__CPUIDEX)
+#include <intrin.h>
+#endif
+
+#include "port/pg_bitutils.h"
+
+/*
+ * Does CPUID say there's support for XSAVE instructions?
+ */
+static inline bool
+xsave_available(void)
+{
+    unsigned int exx[4] = {0, 0, 0, 0};
+
+#if defined(HAVE__GET_CPUID)
+    __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUID)
+    __cpuid(exx, 1);
+#elif defined(__x86_64__)
+#error cpuid instruction not available
+#endif
+    return (exx[2] & (1 << 27)) != 0;       /* osxsave */
+}
+
+/*
+ * Does XGETBV say the YMM registers are enabled?
+ *
+ * NB: Caller is responsible for verifying that xsave_available() returns true
+ * before calling this.
+ */
+#ifdef HAVE_XSAVE_INTRINSICS
+pg_attribute_target("xsave")
+#endif
+static inline bool
+ymm_regs_available(void)
+{
+#ifdef HAVE_XSAVE_INTRINSICS
+    return (_xgetbv(0) & 0x06) == 0x06;
+#else
+    return false;
+#endif
+}
+
+/*
+ * Does CPUID say there's support for AVX-2
+ */
+static inline bool
+avx2_available(void)
+{
+#if defined (USE_AVX2_WITH_RUNTIME_CHECK) && defined(__x86_64__)
+    unsigned int exx[4] = {0, 0, 0, 0};
+    if (!xsave_available() || !ymm_regs_available()) return false;
+
+#if defined(HAVE__GET_CPUID_COUNT)
+    __get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUIDEX)
+    __cpuidex(exx, 7, 0);
+#else
+#error cpuid instruction not available
+#endif
+    return (exx[1] & (1 << 5)) != 0; /* avx2 */
+#else
+    return false;
+#endif
+}
+
+
+/* default checksum implementation */
+PG_DEFINE_CHECKSUM_ISA(default)
+{
+    uint32 sums[N_SUMS], result = 0;
+    uint32 i, j;
+
+    Assert(sizeof(PGChecksummablePage) == BLCKSZ);
+    memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
+
+    for (i = 0; i < (uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
+        for (j = 0; j < N_SUMS; j++)
+            CHECKSUM_COMP(sums[j], page->data[i][j]);
+
+    for (i = 0; i < 2; i++)
+        for (j = 0; j < N_SUMS; j++)
+            CHECKSUM_COMP(sums[j], 0);
+
+    for (i = 0; i < N_SUMS; i++)
+        result ^= sums[i];
+
+    return result;
+}
+
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+PG_DEFINE_CHECKSUM_ISA(avx2)
+{
+    uint32 sums[N_SUMS], result = 0;
+    uint32 i, j;
+
+    Assert(sizeof(PGChecksummablePage) == BLCKSZ);
+    memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
+
+    for (i = 0; i < (uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
+        for (j = 0; j < N_SUMS; j++)
+            CHECKSUM_COMP(sums[j], page->data[i][j]);
+
+    for (i = 0; i < 2; i++)
+        for (j = 0; j < N_SUMS; j++)
+            CHECKSUM_COMP(sums[j], 0);
+
+    for (i = 0; i < N_SUMS; i++)
+        result ^= sums[i];
+
+    return result;
+}
+#endif
+
+/* Function pointer - external linkage (declared extern in header) */
+uint32 (*pg_checksum_block)(const PGChecksummablePage *page) = pg_checksum_block_dispatch;
+
+/* Dispatch function: simple, safe */
+uint32 pg_checksum_block_dispatch(const PGChecksummablePage *page)
+{
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+    if (avx2_available())
+    {
+        /* optional: patch pointer so next call goes directly */
+        pg_checksum_block = pg_checksum_block_avx2;
+        return pg_checksum_block_avx2(page);
+    }
+#endif
+    /* fallback */
+    pg_checksum_block = pg_checksum_block_default;
+    return pg_checksum_block_default(page);
+}
+
+
+/* Compute checksum for a Postgres page */
+uint16 pg_checksum_page(char *page, BlockNumber blkno)
+{
+    PGChecksummablePage *cpage = (PGChecksummablePage *) page;
+    uint16 save_checksum;
+    uint32 checksum;
+
+    Assert(!PageIsNew((Page) page));
+
+    save_checksum = cpage->phdr.pd_checksum;
+    cpage->phdr.pd_checksum = 0;
+    checksum = pg_checksum_block(cpage);
+    cpage->phdr.pd_checksum = save_checksum;
+
+    checksum ^= blkno;
+    return (uint16)((checksum % 65535) + 1);
+}
-- 
2.43.0




[PATCH 2/2] Benchmark code for postgres checksums

От
tenistarkim@gmail.com
Дата:
From: Andrew kim <andrew.kim@intel.com>

---
 contrib/meson.build                           |  1 +
 contrib/pg_checksum_bench/meson.build         | 23 +++++++++++++
 .../pg_checksum_bench--1.0.sql                |  8 +++++
 contrib/pg_checksum_bench/pg_checksum_bench.c | 34 +++++++++++++++++++
 .../pg_checksum_bench.control                 |  4 +++
 .../sql/pg_checksum_bench.sql                 | 17 ++++++++++
 6 files changed, 87 insertions(+)
 create mode 100644 contrib/pg_checksum_bench/meson.build
 create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
 create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench.c
 create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench.control
 create mode 100644 contrib/pg_checksum_bench/sql/pg_checksum_bench.sql

diff --git a/contrib/meson.build b/contrib/meson.build
index ed30ee7d639..fe5149aadff 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
   'install_dir': contrib_doc_dir,
 }
 
+subdir('pg_checksum_bench')
 subdir('amcheck')
 subdir('auth_delay')
 subdir('auto_explain')
diff --git a/contrib/pg_checksum_bench/meson.build b/contrib/pg_checksum_bench/meson.build
new file mode 100644
index 00000000000..32ccd9efa0f
--- /dev/null
+++ b/contrib/pg_checksum_bench/meson.build
@@ -0,0 +1,23 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+pg_checksum_bench_sources = files(
+  'pg_checksum_bench.c',
+)
+
+if host_system == 'windows'
+  pg_checksum_bench_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'pg_checksum_bench',
+    '--FILEDESC', 'pg_checksum_bench',])
+endif
+
+pg_checksum_bench = shared_module('pg_checksum_bench',
+  pg_checksum_bench_sources,
+  kwargs: contrib_mod_args,
+)
+contrib_targets += pg_checksum_bench
+
+install_data(
+  'pg_checksum_bench--1.0.sql',
+  'pg_checksum_bench.control',
+  kwargs: contrib_data_args,
+)
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
b/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
new file mode 100644
index 00000000000..5f13cbe3c5e
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
@@ -0,0 +1,8 @@
+/* contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+-- \echo Use "CREATE EXTENSION pg_checksum_bench" to load this file. \quit
+
+CREATE FUNCTION drive_pg_checksum(page_count int)
+    RETURNS pg_catalog.void
+    AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench.c b/contrib/pg_checksum_bench/pg_checksum_bench.c
new file mode 100644
index 00000000000..f40f335ff59
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench.c
@@ -0,0 +1,34 @@
+#include "postgres.h"
+#include "fmgr.h"
+#include "storage/checksum_impl.h"
+
+#include <stdio.h>
+#include <assert.h>
+
+PG_MODULE_MAGIC;
+
+#define REPEATS 1000000
+
+PG_FUNCTION_INFO_V1(drive_pg_checksum);
+Datum
+drive_pg_checksum(PG_FUNCTION_ARGS)
+{
+    int page_count = PG_GETARG_INT32(0);
+
+    PGChecksummablePage * pages = palloc(page_count * sizeof(PGChecksummablePage));
+    srand(0);
+    for (size_t i = 0; i < page_count * sizeof(PGChecksummablePage); i++){
+        char * byte_ptr = (char *) pages;
+        byte_ptr[i] = rand() % 256;
+    }
+
+    for (int i = 0; i < REPEATS; i++){
+        const PGChecksummablePage * test_page = pages + (i % page_count);
+        volatile uint32 result = pg_checksum_block(test_page);
+        (void) result;
+    }
+
+    pfree((void *) pages);
+
+    PG_RETURN_VOID();
+}
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench.control b/contrib/pg_checksum_bench/pg_checksum_bench.control
new file mode 100644
index 00000000000..4a4e2c9363c
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench.control
@@ -0,0 +1,4 @@
+comment = 'pg_checksum benchmark'
+default_version = '1.0'
+module_pathname = '$libdir/pg_checksum_bench'
+relocatable = true
diff --git a/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql b/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql
new file mode 100644
index 00000000000..4b347699953
--- /dev/null
+++ b/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql
@@ -0,0 +1,17 @@
+CREATE EXTENSION pg_checksum_bench;
+
+SELECT drive_pg_checksum(-1);
+
+\timing on
+
+SELECT drive_pg_checksum(1);
+SELECT drive_pg_checksum(2);
+SELECT drive_pg_checksum(4);
+SELECT drive_pg_checksum(8);
+SELECT drive_pg_checksum(16);
+SELECT drive_pg_checksum(32);
+SELECT drive_pg_checksum(64);
+SELECT drive_pg_checksum(128);
+SELECT drive_pg_checksum(256);
+SELECT drive_pg_checksum(512);
+SELECT drive_pg_checksum(1024);
-- 
2.43.0




Re: Proposal for enabling auto-vectorization for checksum calculations

От
tenistarkim@gmail.com
Дата:
Hi John,

Thanks for the feedback. This is v5 of the patchset, updated following your comments:

- Moved the function pointer definitions out of common headers and
  into src/port, consistent with existing practice.

Thanks again for the guidance.

Best regards,
Kim Andrew



[PATCH 1/2] Enable autovectorizing pg_checksum_block

От
tenistarkim@gmail.com
Дата:
From: Andrew Kim <andrew.kim@intel.com>

---
 config/c-compiler.m4                |  31 +++++
 configure                           |  52 +++++++++
 configure.ac                        |   9 ++
 meson.build                         |  28 +++++
 src/include/pg_config.h.in          |   3 +
 src/include/storage/checksum_impl.h |  90 +++-----------
 src/port/Makefile                   |   1 +
 src/port/meson.build                |   1 +
 src/port/pg_checksum_dispatch.c     | 174 ++++++++++++++++++++++++++++
 9 files changed, 318 insertions(+), 71 deletions(-)
 create mode 100644 src/port/pg_checksum_dispatch.c

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index da40bd6a647..5eb3218deb5 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -711,6 +711,37 @@ fi
 undefine([Ac_cachevar])dnl
 ])# PGAC_XSAVE_INTRINSICS
 
+# PGAC_AVX2_SUPPORT
+# -----------------------------
+# Check if the compiler supports AVX2 in attribute((target))
+# and using AVX2 intrinsics in those functions
+#
+# If the intrinsics are supported, sets pgac_avx2_support.
+AC_DEFUN([PGAC_AVX2_SUPPORT],
+[define([Ac_cachevar], [AS_TR_SH([pgac_cv_avx2_support])])dnl
+AC_CACHE_CHECK([for AVX2 support], [Ac_cachevar],
+[AC_LINK_IFELSE([AC_LANG_PROGRAM([#include <immintrin.h>
+    #include <stdint.h>
+    #if defined(__has_attribute) && __has_attribute (target)
+    __attribute__((target("avx2")))
+    #endif
+    static int avx2_test(void)
+    {
+      const char buf@<:@sizeof(__m256i)@:>@;
+      __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+      accum = _mm256_add_epi32(accum, accum);
+      int result = _mm256_extract_epi32(accum, 0);
+      return (int) result;
+    }],
+  [return avx2_test();])],
+  [Ac_cachevar=yes],
+  [Ac_cachevar=no])])
+if test x"$Ac_cachevar" = x"yes"; then
+  pgac_avx2_support=yes
+fi
+undefine([Ac_cachevar])dnl
+])# PGAC_AVX2_SUPPORT
+
 # PGAC_AVX512_POPCNT_INTRINSICS
 # -----------------------------
 # Check if the compiler supports the AVX-512 popcount instructions using the
diff --git a/configure b/configure
index 39c68161cec..54da05ac0db 100755
--- a/configure
+++ b/configure
@@ -17608,6 +17608,58 @@ $as_echo "#define HAVE_XSAVE_INTRINSICS 1" >>confdefs.h
 
 fi
 
+# Check for AVX2 target and intrinsic support
+#
+if test x"$host_cpu" = x"x86_64"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: checking for AVX2 support" >&5
+$as_echo_n "checking for AVX2 support... " >&6; }
+if ${pgac_cv_avx2_support+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <immintrin.h>
+    #include <stdint.h>
+    #if defined(__has_attribute) && __has_attribute (target)
+    __attribute__((target("avx2")))
+    #endif
+    static int avx2_test(void)
+    {
+      const char buf[sizeof(__m256i)];
+      __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+      accum = _mm256_add_epi32(accum, accum);
+      int result = _mm256_extract_epi32(accum, 0);
+      return (int) result;
+    }
+int
+main ()
+{
+return avx2_test();
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_link "$LINENO"; then :
+  pgac_cv_avx2_support=yes
+else
+  pgac_cv_avx2_support=no
+fi
+rm -f core conftest.err conftest.$ac_objext \
+    conftest$ac_exeext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_avx2_support" >&5
+$as_echo "$pgac_cv_avx2_support" >&6; }
+if test x"$pgac_cv_avx2_support" = x"yes"; then
+  pgac_avx2_support=yes
+fi
+
+  if test x"$pgac_avx2_support" = x"yes"; then
+
+$as_echo "#define USE_AVX2_WITH_RUNTIME_CHECK 1" >>confdefs.h
+
+  fi
+fi
+
 # Check for AVX-512 popcount intrinsics
 #
 if test x"$host_cpu" = x"x86_64"; then
diff --git a/configure.ac b/configure.ac
index 066e3976c0a..2c484a12671 100644
--- a/configure.ac
+++ b/configure.ac
@@ -2118,6 +2118,15 @@ if test x"$pgac_xsave_intrinsics" = x"yes"; then
   AC_DEFINE(HAVE_XSAVE_INTRINSICS, 1, [Define to 1 if you have XSAVE intrinsics.])
 fi
 
+# Check for AVX2 target and intrinsic support
+#
+if test x"$host_cpu" = x"x86_64"; then
+  PGAC_AVX2_SUPPORT()
+  if test x"$pgac_avx2_support" = x"yes"; then
+    AC_DEFINE(USE_AVX2_WITH_RUNTIME_CHECK, 1, [Define to 1 to use AVX2 instructions with a runtime check.])
+  fi
+fi
+
 # Check for AVX-512 popcount intrinsics
 #
 if test x"$host_cpu" = x"x86_64"; then
diff --git a/meson.build b/meson.build
index ab8101d67b2..ff42c41ca7e 100644
--- a/meson.build
+++ b/meson.build
@@ -2289,6 +2289,34 @@ int main(void)
 
 endif
 
+###############################################################
+# Check for the availability of AVX2 support
+###############################################################
+
+if host_cpu == 'x86_64'
+
+  prog = '''
+#include <immintrin.h>
+#include <stdint.h>
+#if defined(__has_attribute) && __has_attribute (target)
+__attribute__((target("avx2")))
+#endif
+int main(void)
+{
+  const char buf[sizeof(__m256i)];
+  __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+  accum = _mm256_add_epi32(accum, accum);
+  int result = _mm256_extract_epi32(accum, 0);
+  return (int) result;
+}
+'''
+
+  if cc.links(prog, name: 'AVX2 support', args: test_c_args)
+    cdata.set('USE_AVX2_WITH_RUNTIME_CHECK', 1)
+  endif
+
+endif
+
 
 ###############################################################
 # Check for the availability of AVX-512 popcount intrinsics.
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index c4dc5d72bdb..987f9b5c77c 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -675,6 +675,9 @@
 /* Define to 1 to use AVX-512 CRC algorithms with a runtime check. */
 #undef USE_AVX512_CRC32C_WITH_RUNTIME_CHECK
 
+/* Define to 1 to use AVX2 instructions with a runtime check. */
+#undef USE_AVX2_WITH_RUNTIME_CHECK
+
 /* Define to 1 to use AVX-512 popcount instructions with a runtime check. */
 #undef USE_AVX512_POPCNT_WITH_RUNTIME_CHECK
 
diff --git a/src/include/storage/checksum_impl.h b/src/include/storage/checksum_impl.h
index da87d61ba52..82e525529f4 100644
--- a/src/include/storage/checksum_impl.h
+++ b/src/include/storage/checksum_impl.h
@@ -101,12 +101,14 @@
  */
 
 #include "storage/bufpage.h"
+#include "pg_config.h"
 
 /* number of checksums to calculate in parallel */
 #define N_SUMS 32
 /* prime multiplier of FNV-1a hash */
 #define FNV_PRIME 16777619
 
+
 /* Use a union so that this code is valid under strict aliasing */
 typedef union
 {
@@ -142,74 +144,20 @@ do { \
  * Block checksum algorithm.  The page must be adequately aligned
  * (at least on 4-byte boundary).
  */
-static uint32
-pg_checksum_block(const PGChecksummablePage *page)
-{
-    uint32        sums[N_SUMS];
-    uint32        result = 0;
-    uint32        i,
-                j;
-
-    /* ensure that the size is compatible with the algorithm */
-    Assert(sizeof(PGChecksummablePage) == BLCKSZ);
-
-    /* initialize partial checksums to their corresponding offsets */
-    memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
-
-    /* main checksum calculation */
-    for (i = 0; i < (uint32) (BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
-        for (j = 0; j < N_SUMS; j++)
-            CHECKSUM_COMP(sums[j], page->data[i][j]);
-
-    /* finally add in two rounds of zeroes for additional mixing */
-    for (i = 0; i < 2; i++)
-        for (j = 0; j < N_SUMS; j++)
-            CHECKSUM_COMP(sums[j], 0);
-
-    /* xor fold partial checksums together */
-    for (i = 0; i < N_SUMS; i++)
-        result ^= sums[i];
-
-    return result;
-}
-
-/*
- * Compute the checksum for a Postgres page.
- *
- * The page must be adequately aligned (at least on a 4-byte boundary).
- * Beware also that the checksum field of the page is transiently zeroed.
- *
- * The checksum includes the block number (to detect the case where a page is
- * somehow moved to a different location), the page header (excluding the
- * checksum itself), and the page data.
- */
-uint16
-pg_checksum_page(char *page, BlockNumber blkno)
-{
-    PGChecksummablePage *cpage = (PGChecksummablePage *) page;
-    uint16        save_checksum;
-    uint32        checksum;
-
-    /* We only calculate the checksum for properly-initialized pages */
-    Assert(!PageIsNew((Page) page));
-
-    /*
-     * Save pd_checksum and temporarily set it to zero, so that the checksum
-     * calculation isn't affected by the old checksum stored on the page.
-     * Restore it after, because actually updating the checksum is NOT part of
-     * the API of this function.
-     */
-    save_checksum = cpage->phdr.pd_checksum;
-    cpage->phdr.pd_checksum = 0;
-    checksum = pg_checksum_block(cpage);
-    cpage->phdr.pd_checksum = save_checksum;
-
-    /* Mix in the block number to detect transposed pages */
-    checksum ^= blkno;
-
-    /*
-     * Reduce to a uint16 (to fit in the pd_checksum field) with an offset of
-     * one. That avoids checksums of zero, which seems like a good idea.
-     */
-    return (uint16) ((checksum % 65535) + 1);
-}
+#define PG_DECLARE_CHECKSUM_ISA(ISANAME) \
+uint32 \
+pg_checksum_block_##ISANAME(const PGChecksummablePage *page);
+
+#define PG_DEFINE_CHECKSUM_ISA(ISANAME) \
+pg_attribute_target(#ISANAME) \
+uint32 pg_checksum_block_##ISANAME(const PGChecksummablePage *page) \
+
+/* Declare ISA implementations (declarations only in header) */
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+PG_DECLARE_CHECKSUM_ISA(avx2);
+#endif
+PG_DECLARE_CHECKSUM_ISA(default);
+
+uint32 pg_checksum_block_dispatch(const PGChecksummablePage *page);
+extern uint32 (*pg_checksum_block)(const PGChecksummablePage *page);
+extern uint16 pg_checksum_page(char *page, BlockNumber blkno);
diff --git a/src/port/Makefile b/src/port/Makefile
index 4274949dfa4..27423f1058b 100644
--- a/src/port/Makefile
+++ b/src/port/Makefile
@@ -48,6 +48,7 @@ OBJS = \
     pg_numa.o \
     pg_popcount_aarch64.o \
     pg_popcount_avx512.o \
+    pg_checksum_dispatch.o \
     pg_strong_random.o \
     pgcheckdir.o \
     pgmkdirp.o \
diff --git a/src/port/meson.build b/src/port/meson.build
index fc7b059fee5..c4bbe9f2ece 100644
--- a/src/port/meson.build
+++ b/src/port/meson.build
@@ -11,6 +11,7 @@ pgport_sources = [
   'pg_numa.c',
   'pg_popcount_aarch64.c',
   'pg_popcount_avx512.c',
+  'pg_checksum_dispatch.c',
   'pg_strong_random.c',
   'pgcheckdir.c',
   'pgmkdirp.c',
diff --git a/src/port/pg_checksum_dispatch.c b/src/port/pg_checksum_dispatch.c
new file mode 100644
index 00000000000..15f7b8af34f
--- /dev/null
+++ b/src/port/pg_checksum_dispatch.c
@@ -0,0 +1,174 @@
+/*-------------------------------------------------------------------------
+ *
+ * pg_checksum_dispatch.c
+ *        Holds the AVX2 pg_popcount() implementation.
+ *
+ * Copyright (c) 2024-2025, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *        src/port/pg_checksum_dispatch.c
+ *
+ *-------------------------------------------------------------------------
+ */
+#include "c.h"
+#include "storage/checksum_impl.h"
+
+#if defined(HAVE__GET_CPUID) || defined(HAVE__GET_CPUID_COUNT)
+#include <cpuid.h>
+#endif
+
+#ifdef HAVE_XSAVE_INTRINSICS
+#include <immintrin.h>
+#endif
+
+#if defined(HAVE__CPUID) || defined(HAVE__CPUIDEX)
+#include <intrin.h>
+#endif
+
+#include "port/pg_bitutils.h"
+
+/*
+ * Does CPUID say there's support for XSAVE instructions?
+ */
+static inline bool
+xsave_available(void)
+{
+    unsigned int exx[4] = {0, 0, 0, 0};
+
+#if defined(HAVE__GET_CPUID)
+    __get_cpuid(1, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUID)
+    __cpuid(exx, 1);
+#elif defined(__x86_64__)
+#error cpuid instruction not available
+#endif
+    return (exx[2] & (1 << 27)) != 0;       /* osxsave */
+}
+
+/*
+ * Does XGETBV say the YMM registers are enabled?
+ *
+ * NB: Caller is responsible for verifying that xsave_available() returns true
+ * before calling this.
+ */
+#ifdef HAVE_XSAVE_INTRINSICS
+pg_attribute_target("xsave")
+#endif
+static inline bool
+ymm_regs_available(void)
+{
+#ifdef HAVE_XSAVE_INTRINSICS
+    return (_xgetbv(0) & 0x06) == 0x06;
+#else
+    return false;
+#endif
+}
+
+/*
+ * Does CPUID say there's support for AVX-2
+ */
+static inline bool
+avx2_available(void)
+{
+#if defined (USE_AVX2_WITH_RUNTIME_CHECK) && defined(__x86_64__)
+    unsigned int exx[4] = {0, 0, 0, 0};
+    if (!xsave_available() || !ymm_regs_available()) return false;
+
+#if defined(HAVE__GET_CPUID_COUNT)
+    __get_cpuid_count(7, 0, &exx[0], &exx[1], &exx[2], &exx[3]);
+#elif defined(HAVE__CPUIDEX)
+    __cpuidex(exx, 7, 0);
+#else
+#error cpuid instruction not available
+#endif
+    return (exx[1] & (1 << 5)) != 0; /* avx2 */
+#else
+    return false;
+#endif
+}
+
+
+/* default checksum implementation */
+PG_DEFINE_CHECKSUM_ISA(default)
+{
+    uint32 sums[N_SUMS], result = 0;
+    uint32 i, j;
+
+    Assert(sizeof(PGChecksummablePage) == BLCKSZ);
+    memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
+
+    for (i = 0; i < (uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
+        for (j = 0; j < N_SUMS; j++)
+            CHECKSUM_COMP(sums[j], page->data[i][j]);
+
+    for (i = 0; i < 2; i++)
+        for (j = 0; j < N_SUMS; j++)
+            CHECKSUM_COMP(sums[j], 0);
+
+    for (i = 0; i < N_SUMS; i++)
+        result ^= sums[i];
+
+    return result;
+}
+
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+PG_DEFINE_CHECKSUM_ISA(avx2)
+{
+    uint32 sums[N_SUMS], result = 0;
+    uint32 i, j;
+
+    Assert(sizeof(PGChecksummablePage) == BLCKSZ);
+    memcpy(sums, checksumBaseOffsets, sizeof(checksumBaseOffsets));
+
+    for (i = 0; i < (uint32)(BLCKSZ / (sizeof(uint32) * N_SUMS)); i++)
+        for (j = 0; j < N_SUMS; j++)
+            CHECKSUM_COMP(sums[j], page->data[i][j]);
+
+    for (i = 0; i < 2; i++)
+        for (j = 0; j < N_SUMS; j++)
+            CHECKSUM_COMP(sums[j], 0);
+
+    for (i = 0; i < N_SUMS; i++)
+        result ^= sums[i];
+
+    return result;
+}
+#endif
+
+/* Function pointer - external linkage (declared extern in header) */
+uint32 (*pg_checksum_block)(const PGChecksummablePage *page) = pg_checksum_block_dispatch;
+
+/* Dispatch function: simple, safe */
+uint32 pg_checksum_block_dispatch(const PGChecksummablePage *page)
+{
+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+    if (avx2_available())
+    {
+        /* optional: patch pointer so next call goes directly */
+        pg_checksum_block = pg_checksum_block_avx2;
+        return pg_checksum_block_avx2(page);
+    }
+#endif
+    /* fallback */
+    pg_checksum_block = pg_checksum_block_default;
+    return pg_checksum_block_default(page);
+}
+
+
+/* Compute checksum for a Postgres page */
+uint16 pg_checksum_page(char *page, BlockNumber blkno)
+{
+    PGChecksummablePage *cpage = (PGChecksummablePage *) page;
+    uint16 save_checksum;
+    uint32 checksum;
+
+    Assert(!PageIsNew((Page) page));
+
+    save_checksum = cpage->phdr.pd_checksum;
+    cpage->phdr.pd_checksum = 0;
+    checksum = pg_checksum_block(cpage);
+    cpage->phdr.pd_checksum = save_checksum;
+
+    checksum ^= blkno;
+    return (uint16)((checksum % 65535) + 1);
+}
-- 
2.43.0




[PATCH 2/2] Benchmark code for postgres checksums

От
tenistarkim@gmail.com
Дата:
From: Andrew kim <andrew.kim@intel.com>

---
 contrib/meson.build                           |  1 +
 contrib/pg_checksum_bench/meson.build         | 23 +++++++++++++
 .../pg_checksum_bench--1.0.sql                |  8 +++++
 contrib/pg_checksum_bench/pg_checksum_bench.c | 34 +++++++++++++++++++
 .../pg_checksum_bench.control                 |  4 +++
 .../sql/pg_checksum_bench.sql                 | 17 ++++++++++
 6 files changed, 87 insertions(+)
 create mode 100644 contrib/pg_checksum_bench/meson.build
 create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
 create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench.c
 create mode 100644 contrib/pg_checksum_bench/pg_checksum_bench.control
 create mode 100644 contrib/pg_checksum_bench/sql/pg_checksum_bench.sql

diff --git a/contrib/meson.build b/contrib/meson.build
index ed30ee7d639..fe5149aadff 100644
--- a/contrib/meson.build
+++ b/contrib/meson.build
@@ -12,6 +12,7 @@ contrib_doc_args = {
   'install_dir': contrib_doc_dir,
 }
 
+subdir('pg_checksum_bench')
 subdir('amcheck')
 subdir('auth_delay')
 subdir('auto_explain')
diff --git a/contrib/pg_checksum_bench/meson.build b/contrib/pg_checksum_bench/meson.build
new file mode 100644
index 00000000000..32ccd9efa0f
--- /dev/null
+++ b/contrib/pg_checksum_bench/meson.build
@@ -0,0 +1,23 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+pg_checksum_bench_sources = files(
+  'pg_checksum_bench.c',
+)
+
+if host_system == 'windows'
+  pg_checksum_bench_sources += rc_lib_gen.process(win32ver_rc, extra_args: [
+    '--NAME', 'pg_checksum_bench',
+    '--FILEDESC', 'pg_checksum_bench',])
+endif
+
+pg_checksum_bench = shared_module('pg_checksum_bench',
+  pg_checksum_bench_sources,
+  kwargs: contrib_mod_args,
+)
+contrib_targets += pg_checksum_bench
+
+install_data(
+  'pg_checksum_bench--1.0.sql',
+  'pg_checksum_bench.control',
+  kwargs: contrib_data_args,
+)
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
b/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
new file mode 100644
index 00000000000..5f13cbe3c5e
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql
@@ -0,0 +1,8 @@
+/* contrib/pg_checksum_bench/pg_checksum_bench--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+-- \echo Use "CREATE EXTENSION pg_checksum_bench" to load this file. \quit
+
+CREATE FUNCTION drive_pg_checksum(page_count int)
+    RETURNS pg_catalog.void
+    AS 'MODULE_PATHNAME' LANGUAGE C;
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench.c b/contrib/pg_checksum_bench/pg_checksum_bench.c
new file mode 100644
index 00000000000..f40f335ff59
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench.c
@@ -0,0 +1,34 @@
+#include "postgres.h"
+#include "fmgr.h"
+#include "storage/checksum_impl.h"
+
+#include <stdio.h>
+#include <assert.h>
+
+PG_MODULE_MAGIC;
+
+#define REPEATS 1000000
+
+PG_FUNCTION_INFO_V1(drive_pg_checksum);
+Datum
+drive_pg_checksum(PG_FUNCTION_ARGS)
+{
+    int page_count = PG_GETARG_INT32(0);
+
+    PGChecksummablePage * pages = palloc(page_count * sizeof(PGChecksummablePage));
+    srand(0);
+    for (size_t i = 0; i < page_count * sizeof(PGChecksummablePage); i++){
+        char * byte_ptr = (char *) pages;
+        byte_ptr[i] = rand() % 256;
+    }
+
+    for (int i = 0; i < REPEATS; i++){
+        const PGChecksummablePage * test_page = pages + (i % page_count);
+        volatile uint32 result = pg_checksum_block(test_page);
+        (void) result;
+    }
+
+    pfree((void *) pages);
+
+    PG_RETURN_VOID();
+}
diff --git a/contrib/pg_checksum_bench/pg_checksum_bench.control b/contrib/pg_checksum_bench/pg_checksum_bench.control
new file mode 100644
index 00000000000..4a4e2c9363c
--- /dev/null
+++ b/contrib/pg_checksum_bench/pg_checksum_bench.control
@@ -0,0 +1,4 @@
+comment = 'pg_checksum benchmark'
+default_version = '1.0'
+module_pathname = '$libdir/pg_checksum_bench'
+relocatable = true
diff --git a/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql b/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql
new file mode 100644
index 00000000000..4b347699953
--- /dev/null
+++ b/contrib/pg_checksum_bench/sql/pg_checksum_bench.sql
@@ -0,0 +1,17 @@
+CREATE EXTENSION pg_checksum_bench;
+
+SELECT drive_pg_checksum(-1);
+
+\timing on
+
+SELECT drive_pg_checksum(1);
+SELECT drive_pg_checksum(2);
+SELECT drive_pg_checksum(4);
+SELECT drive_pg_checksum(8);
+SELECT drive_pg_checksum(16);
+SELECT drive_pg_checksum(32);
+SELECT drive_pg_checksum(64);
+SELECT drive_pg_checksum(128);
+SELECT drive_pg_checksum(256);
+SELECT drive_pg_checksum(512);
+SELECT drive_pg_checksum(1024);
-- 
2.43.0




Re: Proposal for enabling auto-vectorization for checksum calculations

От
Andrew Kim
Дата:
> I don't know if this is related to the crashes, but it doesn't seem
> like a good idea to #include the function pointer stuff everywhere,
> that should probably go into src/port like the others.
Just a gentle reminder on this patch series — I’ve already rebased it on
the latest tip of master and addressed the earlier review comments:

  * Moved the function pointer definitions into src/port as suggested.
  * Rebased cleanly on the current master branch.

Could someone take another look and share any further feedback?

Thanks a lot for your time and review,
Andrew Kim

Re: Proposal for enabling auto-vectorization for checksum calculations

От
John Naylor
Дата:
On Thu, Sep 11, 2025 at 1:55 PM root <tenistarkim@gmail.com> wrote:
> Thanks for the feedback. This is v5 of the patchset, updated following your comments:
>
> - Moved the function pointer definitions out of common headers and
>   into src/port, consistent with existing practice.

There is no attachment in this thread, so it's not showing up in the
commitfest entry (which will need to be moved to next open
commitfest), so it's not getting CI testing:

https://commitfest.postgresql.org/patch/5726/

Note that the whole series must be attached in a single email, or it
won't get automated testing.

--
John Naylor
Amazon Web Services



Re: Proposal for enabling auto-vectorization for checksum calculations

От
Andrew Kim
Дата:
On Tue, Sep 23, 2025 at 11:32 PM John Naylor johncnaylorls@gmail.com wrote:

There is no attachment in this thread, so it's not showing up in the
commitfest entry (which will need to be moved to next open
commitfest), so it's not getting CI testing:

https://commitfest.postgresql.org/patch/5726/

Note that the whole series must be attached in a single email, or it
won't get automated testing.

Thanks, John. I see the issue now — I’ll attach the entire patch
series in a single email so it shows up properly in the commitfest and
gets CI coverage.

Please find attached v6 of the patchset, updated per your feedback.

Best regards,
Andrew Kim


On Tue, Sep 23, 2025 at 11:32 PM John Naylor <johncnaylorls@gmail.com> wrote:
>
> On Thu, Sep 11, 2025 at 1:55 PM root <tenistarkim@gmail.com> wrote:
> > Thanks for the feedback. This is v5 of the patchset, updated following your comments:
> >
> > - Moved the function pointer definitions out of common headers and
> >   into src/port, consistent with existing practice.
>
> There is no attachment in this thread, so it's not showing up in the
> commitfest entry (which will need to be moved to next open
> commitfest), so it's not getting CI testing:
>
> https://commitfest.postgresql.org/patch/5726/
>
> Note that the whole series must be attached in a single email, or it
> won't get automated testing.
>
> --
> John Naylor
> Amazon Web Services

Вложения

Re: Proposal for enabling auto-vectorization for checksum calculations

От
John Naylor
Дата:
On Thu, Sep 25, 2025 at 4:50 AM Andrew Kim <tenistarkim@gmail.com> wrote:
>
> Thanks, John. I see the issue now — I’ll attach the entire patch
> series in a single email so it shows up properly in the commitfest and
> gets CI coverage.

It's still picking up v4, and the archive link doesn't show any
further replies. Something must have happened with the email
threading, since you weren't on the thread at first.  Please create an
account and edit the entry to point to a more recent message ID:

https://commitfest.postgresql.org/patch/5726/

> Please find attached v6 of the patchset, updated per your feedback.

Thanks. (BTW, we discourage top-posting and prefer to cut to size and
use inline responses)

This is not a complete review, but some architectural thoughts and
some things I've noticed.

The top of the checksum_impl.h has this:

 * This file exists for the benefit of external programs that may wish to
 * check Postgres page checksums.  They can #include this to get the code
 * referenced by storage/checksum.h.  (Note: you may need to redefine
 * Assert() as empty to compile this successfully externally.)

It's going to be a bit tricky to preserve this ability while allowing
the core server and client programs to dispatch to a specialized
implementation, but we should at least try. That means keeping
pg_checksum_block() and pg_checksum_page() where they live now.

I think a good first refactoring patch would be to move
src/backend/storage/checksum.c (which your patch doesn't even touch)
to src/port (and src/include/storage/checksum.h to src/include/port)
and have all callers use that. With that, I imagine only that
checksum.c file would include checksum_impl.h.

If that poses a problem, let us know -- we may have to further juggle
things. If that works without issue, we can proceed with the
specialization. On that, just a few things to note here, although the
next patch doesn't need to worry about any of this yet:

+    #if defined(__has_attribute) && __has_attribute (target)
+    __attribute__((target("avx2")))
+    #endif
+    static int avx2_test(void)
+    {
+      const char buf@<:@sizeof(__m256i)@:>@;
+      __m256i accum = _mm256_loadu_si256((const __m256i *) buf);
+   accum = _mm256_add_epi32(accum, accum);
+      int result = _mm256_extract_epi32(accum, 0);
+      return (int) result;
+    }],

If we're just testing if the target works, we can just use an empty
function, right?

+#define PG_DECLARE_CHECKSUM_ISA(ISANAME) \
+uint32 \
+pg_checksum_block_##ISANAME(const PGChecksummablePage *page);
+
+#define PG_DEFINE_CHECKSUM_ISA(ISANAME) \
+pg_attribute_target(#ISANAME) \
+uint32 pg_checksum_block_##ISANAME(const PGChecksummablePage *page) \

I find this hard to read compared to just using the actual name.

+avx2_available(void)
+{
+#if defined (USE_AVX2_WITH_RUNTIME_CHECK) && defined(__x86_64__)

Why guard on __x86_64__?

+PG_DEFINE_CHECKSUM_ISA(default)
+{
+ uint32 sums[N_SUMS], result = 0;
+ uint32 i, j;
[...]

+#ifdef USE_AVX2_WITH_RUNTIME_CHECK
+PG_DEFINE_CHECKSUM_ISA(avx2)
+{
+ uint32 sums[N_SUMS], result = 0;
+ uint32 i, j;
[...]

With the single src/port file idea above, these would just do "return
pg_checksum_block()" (or pg_checksum_page, whichever makes more
sense).

+ if (avx2_available())
+ {
+ /* optional: patch pointer so next call goes directly */
+ pg_checksum_block = pg_checksum_block_avx2;
+ return pg_checksum_block_avx2(page);
+ }

Not sure what your referring to here by "patching" the pointer, but it
sounds dangerous. Besides, the cost of indirection is basically zero
for multi-kilobyte inputs, so there is not even any motivation to
consider doing differently.

--
John Naylor
Amazon Web Services