Discussion: [PATCH] Add zstd compression for TOAST using extended header format
Hello PG Hackers,
I would like to submit a patch that implements zstd compression for TOAST data using a 20-byte TOAST pointer format, directly addressing the concerns raised in prior discussions [1][2][3].
A bit of background: in the 2022 thread [3], the overall suggestion was to have something extensible for the TOAST header,
i.e. something like:
00 = PGLZ
01 = LZ4
10 = reserved for future emergencies
11 = extended header with additional type byte
This patch implements that idea.
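To make the 2-bit scheme concrete, here is a minimal sketch of how such a method ID can be packed into and read back from va_extinfo. It assumes the layout used by current PostgreSQL headers (low 30 bits hold the external size, top 2 bits hold the method ID). The helper and enum names are made up for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: low 30 bits = external size, top 2 bits = method id. */
#define EXTSIZE_BITS 30
#define EXTSIZE_MASK ((1U << EXTSIZE_BITS) - 1)

/* The four 2-bit values from the list above (names are illustrative). */
enum
{
    CMID_PGLZ = 0,      /* 00 */
    CMID_LZ4 = 1,       /* 01 */
    CMID_RESERVED = 2,  /* 10 */
    CMID_EXTENDED = 3   /* 11: real method lives in an extra type byte */
};

/* Combine an external size and a 2-bit method id into one 32-bit word. */
static uint32_t
extinfo_pack(uint32_t extsize, uint32_t cmid)
{
    assert(extsize <= EXTSIZE_MASK);
    assert(cmid <= CMID_EXTENDED);
    return extsize | (cmid << EXTSIZE_BITS);
}

/* Extract the 2-bit method id from the top of the word. */
static uint32_t
extinfo_cmid(uint32_t extinfo)
{
    return extinfo >> EXTSIZE_BITS;
}

/* Extract the external (stored) size from the low 30 bits. */
static uint32_t
extinfo_extsize(uint32_t extinfo)
{
    return extinfo & EXTSIZE_MASK;
}

/* True when the pointer says "look at the extended header for the method". */
static bool
extinfo_is_extended(uint32_t extinfo)
{
    return extinfo_cmid(extinfo) == CMID_EXTENDED;
}
```

With this encoding, extinfo_pack(123456, CMID_EXTENDED) round-trips to size 123456 and method ID 3, while values packed with CMID_PGLZ or CMID_LZ4 keep their existing meaning.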
The new header format:
struct varatt_external_extended {
    int32   va_rawsize;     /* same as legacy */
    uint32  va_extinfo;     /* cmid=3 signals extended format */
    uint8   va_flags;       /* feature flags */
    uint8   va_data[3];     /* va_data[0] = compression method */
    Oid     va_valueid;     /* same as legacy */
    Oid     va_toastrelid;  /* same as legacy */
};
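As a standalone illustration of this layout, the sketch below mirrors the fields with fixed-width C types, checks at compile time that they add up to 20 bytes, and reads the method byte only when the 2-bit ID is 3. The type name, the helper functions, and the numeric values for va_data[0] are assumptions for the example. The message does not say which value denotes zstd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t Oid;           /* PostgreSQL's Oid is an unsigned 32-bit integer */

/* Same field order as the struct above, using standard fixed-width types. */
typedef struct
{
    int32_t     va_rawsize;
    uint32_t    va_extinfo;
    uint8_t     va_flags;
    uint8_t     va_data[3];
    Oid         va_valueid;
    Oid         va_toastrelid;
} ext_pointer_sketch;

/* 4 + 4 + 1 + 3 + 4 + 4 = 20 bytes, with no padding between fields. */
_Static_assert(sizeof(ext_pointer_sketch) == 20, "extended pointer must be 20 bytes");
_Static_assert(offsetof(ext_pointer_sketch, va_valueid) == 12, "unexpected padding");

#define EXTSIZE_BITS   30
#define CMID_EXTENDED  3u

/* Hypothetical ids for va_data[0]; the actual values are defined by the patch. */
enum { EXT_METHOD_PGLZ = 0, EXT_METHOD_LZ4 = 1, EXT_METHOD_ZSTD = 2 };

/* Copy the pointer out of a byte buffer that may not be suitably aligned. */
static ext_pointer_sketch
ext_pointer_load(const uint8_t *buf)
{
    ext_pointer_sketch p;

    memcpy(&p, buf, sizeof(p));
    return p;
}

/*
 * Return the method byte when the extended format is in use, or -1 when
 * the 2-bit id in va_extinfo already names the method (pglz or lz4).
 */
static int
ext_pointer_method(const ext_pointer_sketch *p)
{
    if ((p->va_extinfo >> EXTSIZE_BITS) != CMID_EXTENDED)
        return -1;
    return p->va_data[0];
}
```

The remaining two bytes of va_data are left untouched here, since the message only assigns a meaning to va_data[0].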
A few notes:
- Zstd only applies to external TOAST, not inline compression. The 2-bit limit in va_tcinfo stays as-is for inline data, where pglz/lz4 work fine anyway. Zstd's wins show up on larger values.
- A GUC use_extended_toast_header controls whether pglz/lz4 also use the 20-byte format (defaults to off for compatibility, can enable it if you want consistency).
- Legacy 16-byte pointers continue to work - we check the vartag to determine which format to read.
The 4 extra bytes per pointer are negligible relative to typical TOAST data sizes, and they give us room to grow.
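The vartag check and the size argument can be sketched roughly as follows. VARTAG_ONDISK = 18 is the existing PostgreSQL tag for on-disk TOAST pointers. The tag name and value used here for the 20-byte format are placeholders, since the message does not state which tag the patch assigns, and both helper functions are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VARTAG_ONDISK           18  /* existing tag: 16-byte legacy pointer */
#define VARTAG_ONDISK_EXTENDED  19  /* placeholder value, not taken from the patch */

/*
 * Size of the pointer payload that follows the tag byte, chosen by tag.
 * Returns 0 for tags this sketch does not handle.
 */
static size_t
toast_pointer_payload_size(uint8_t vartag)
{
    switch (vartag)
    {
        case VARTAG_ONDISK:
            return 16;
        case VARTAG_ONDISK_EXTENDED:
            return 20;
        default:
            return 0;
    }
}

/* Fraction of a stored value's size that the 4 extra pointer bytes add. */
static double
extra_pointer_overhead(size_t stored_size)
{
    assert(stored_size > 0);
    return 4.0 / (double) stored_size;
}
```

For a hypothetical 2000-byte stored value, extra_pointer_overhead(2000) comes to about 0.2%, and the ratio shrinks further as values get larger.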
Regards,
Dharin
Hello,
You may want to consider sending the patch to the pgsql-hackers mailing list.
On 16 Dec 2025, at 12:46 AM, Dharin Shah <dharinshah95@gmail.com> wrote:
<zstd-toast-compression-external.patch>
Thanks Murtuza,
My bad, wrong email :(
Regards,
Dharin
On Tue, Dec 16, 2025 at 6:56 AM Murtuza Zabuawala <murtuza.zabuawala@enterprisedb.com> wrote:
Hello,
You may want to consider sending the patch to the pgsql-hackers mailing list.