Обсуждение: [Bug] Usage of stale dead_items pointer in parallel vacuum
Hello Developers,
We have discovered a bug in the dead_items_reset function in PG v17 specifically near the parallel_vacuum_reset_dead_items function() invocation. The bug is still prevalent in the latest version v18 as well. Although it was introduced in v17, it was detected while running the vacuum.sql regression tests which were introduced in PG v17.5.
In dead_items_reset function, we destroy and recreate the shared TidStore by invoking the parallel_vacuum_reset_dead_items() function. However, after the reallocation, the vacrel->dead_items pointer is not updated to point to the newly allocated TidStore. As a result, the leader process continues to hold a pointer to the destroyed TidStore. As a result, subsequent calls by the leader process to dead_items_add() go through this stale pointer, resulting in a use-after-free.
However this goes unnoticed since the freeing and reallocation of the shared TidStore (vacrel->pvs->dead_items) are so close to each other, that the AllocSet memory context always hands back the exact same chunk of memory that was freed previously, making the stale vacrel->dead_items pointer still appear valid even though it technically points to freed storage.
This issue was reproduced while running the existing regression tests from vacuum.sql with our custom malloc allocator implementation. For your reference, we have previously shared our custom malloc allocator implementation in a separate bug fix. (message ID: 1936493cc38.68cb2ef27266.7456585136086197135@zohocorp.com).The corresponding crash backtrace is attached as well.
Failed regression:
Failed regression:
CREATE TABLE pvactst2 (i INT) WITH (autovacuum_enabled = off);
INSERT INTO pvactst2 SELECT generate_series(1, 1000);
CREATE INDEX ON pvactst2 (i);
CREATE INDEX ON pvactst2 (i);
SET min_parallel_index_scan_size to 0;
SET maintenance_work_mem TO 64;
SET min_parallel_index_scan_size to 0;
SET maintenance_work_mem TO 64;
INSERT INTO pvactst2 SELECT generate_series(1, 1000);
DELETE FROM pvactst2 WHERE i < 1000;
VACUUM (PARALLEL 2) pvactst2;
Please let me know if you have any questions or would like further details.
Thanks & Regards,
Kevin Oommen Anish
Member Technical Staff
ZOHO Corporation
Вложения
On Thu, Oct 2, 2025 at 12:56 AM Kevin Oommen Anish <kevin.o@zohocorp.com> wrote: > Failed regression: > CREATE TABLE pvactst2 (i INT) WITH (autovacuum_enabled = off); > INSERT INTO pvactst2 SELECT generate_series(1, 1000); > CREATE INDEX ON pvactst2 (i); > CREATE INDEX ON pvactst2 (i); > SET min_parallel_index_scan_size to 0; > SET maintenance_work_mem TO 64; > INSERT INTO pvactst2 SELECT generate_series(1, 1000); > DELETE FROM pvactst2 WHERE i < 1000; > VACUUM (PARALLEL 2) pvactst2; I can reproduce the issue and confirm that your patch fixes it. I didn't use your custom malloc allocator but instead applied a redundant palloc0 for TidStore in TidStoreCreateShared(), hoping to get a different chunk of memory (haha). @@ -212,6 +212,7 @@ TidStoreCreateShared(size_t max_bytes, int tranche_id) size_t dsa_init_size = DSA_DEFAULT_INIT_SEGMENT_SIZE; size_t dsa_max_size = DSA_MAX_SEGMENT_SIZE; + ts = palloc0(sizeof(TidStore)); ts = palloc0(sizeof(TidStore)); - Richard
On Thu, Oct 2, 2025 at 11:17 AM Richard Guo <guofenglinux@gmail.com> wrote: > I can reproduce the issue and confirm that your patch fixes it. I > didn't use your custom malloc allocator but instead applied a > redundant palloc0 for TidStore in TidStoreCreateShared(), hoping to > get a different chunk of memory (haha). I'm starting to wonder if we should have something similar to that allocator in core, which is actually here and not in the link upthread: https://www.postgresql.org/message-id/193261e2c4d.3dd3cd7c1842.871636075166132237@zohocorp.com -- John Naylor Amazon Web Services
On Thu, Oct 2, 2025 at 2:09 PM John Naylor <johncnaylorls@gmail.com> wrote: > On Thu, Oct 2, 2025 at 11:17 AM Richard Guo <guofenglinux@gmail.com> wrote: > > I can reproduce the issue and confirm that your patch fixes it. I > > didn't use your custom malloc allocator but instead applied a > > redundant palloc0 for TidStore in TidStoreCreateShared(), hoping to > > get a different chunk of memory (haha). > I'm starting to wonder if we should have something similar to that > allocator in core, which is actually here and not in the link > upthread: +1. This issue is more like a reuse-after-free rather than a use-after-free, and it doesn't seem easy to detect with current tools. This kind of allocator seems to be quite useful -- it has helped catch at least two bugs in our code so far. - Richard
On Wed, Oct 1, 2025 at 10:56 PM Kevin Oommen Anish <kevin.o@zohocorp.com> wrote: > We have discovered a bug in the dead_items_reset function in PG v17 specifically near the parallel_vacuum_reset_dead_itemsfunction() invocation. The bug is still prevalent in the latest version v18 as well. Althoughit was introduced in v17, it was detected while running the vacuum.sql regression tests which were introduced inPG v17.5. I've pushed your fix, thanks for the report! -- John Naylor Amazon Web Services