Thread: Segfault in RI UPDATE CASCADE on partitioned tables with LIKE+ATTACH child (attnum drift)
Hello,

We’re seeing a backend segfault when an ON UPDATE CASCADE fires across
partitions, if the destination partition was created via CREATE TABLE …
LIKE + ATTACH PARTITION (so its physical tuple descriptor differs from
the parent’s due to dropped-column tombstones / attnum drift). Column
names and types match by inspection, but the backend crashes during
tuple materialization in the RI trigger execution.
Minimal Reproducer (self-contained)
DROP SCHEMA IF EXISTS t CASCADE;
CREATE SCHEMA t;
-- Pipelines (partitioned)
CREATE TABLE t.pipelines (
    partition_id int NOT NULL,
    id bigint NOT NULL,
    PRIMARY KEY (partition_id, id)
) PARTITION BY LIST (partition_id);
CREATE TABLE t.pipelines_102 PARTITION OF t.pipelines FOR VALUES IN (102);
CREATE TABLE t.pipelines_50 PARTITION OF t.pipelines FOR VALUES IN (50);
-- Stages (partitioned) with ON UPDATE CASCADE to pipelines.
-- Create a mid column and drop it to leave a tombstone gap in attnums.
CREATE TABLE t.stages (
    partition_id int NOT NULL,
    id bigint NOT NULL,
    tmp_mid int,  -- dropped below, leaves parent attnum gap
    pipeline_id bigint NOT NULL,
    name text,
    status int,
    PRIMARY KEY (partition_id, id),
    FOREIGN KEY (partition_id, pipeline_id)
        REFERENCES t.pipelines(partition_id, id)
        ON UPDATE CASCADE ON DELETE CASCADE
) PARTITION BY LIST (partition_id);
CREATE TABLE t.stages_102 PARTITION OF t.stages FOR VALUES IN (102);
ALTER TABLE t.stages DROP COLUMN tmp_mid;
-- Miscreate destination stage partition via LIKE + ATTACH (no tombstone, different attnums).
CREATE TABLE t.stages_50_like (LIKE t.stages INCLUDING DEFAULTS);
ALTER TABLE t.stages ATTACH PARTITION t.stages_50_like FOR VALUES IN (50);
-- Builds (partitioned), cascades to both stages and pipelines.
CREATE TABLE t.builds (
    partition_id int NOT NULL,
    id bigint NOT NULL,
    stage_id bigint NOT NULL,
    commit_id bigint NOT NULL,
    PRIMARY KEY (partition_id, id),
    FOREIGN KEY (partition_id, stage_id)
        REFERENCES t.stages(partition_id, id)
        ON UPDATE CASCADE ON DELETE CASCADE,
    FOREIGN KEY (partition_id, commit_id)
        REFERENCES t.pipelines(partition_id, id)
        ON UPDATE CASCADE ON DELETE CASCADE
) PARTITION BY LIST (partition_id);
CREATE TABLE t.builds_102 PARTITION OF t.builds FOR VALUES IN (102);
CREATE TABLE t.builds_50 PARTITION OF t.builds FOR VALUES IN (50);
-- Seed rows in source partition 102.
INSERT INTO t.pipelines_102(partition_id, id) VALUES (102, 1);
INSERT INTO t.stages_102 (partition_id, id, pipeline_id, name, status)
    VALUES (102, 10, 1, 's', 0);
INSERT INTO t.builds_102 (partition_id, id, stage_id, commit_id)
    VALUES (102, 100, 10, 1);
-- Crash repro: cascaded UPDATE across partitions
UPDATE t.pipelines
SET partition_id = 50
WHERE partition_id = 102 AND id = 1;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
Postgres logs:
2025-10-16 09:23:08.535 UTC [18630] LOG: client backend (PID 18673) was terminated by signal 11: Segmentation fault
2025-10-16 09:23:08.535 UTC [18630] DETAIL: Failed process was running: UPDATE t.pipelines
SET partition_id = 50
WHERE partition_id = 102 AND id = 1;
Environment
- PostgreSQL:

postgres=# SHOW server_version;
 server_version
----------------
 18.0
(1 row)

postgres=# SHOW server_version_num;
 server_version_num
--------------------
 180000
(1 row)

postgres=# SELECT version();
                                                 version
----------------------------------------------------------------------------------------------------------
 PostgreSQL 18.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5), 64-bit
(1 row)

postgres=# SHOW shared_preload_libraries;
 shared_preload_libraries
--------------------------

(1 row)

postgres=# SELECT extname, extversion FROM pg_extension ORDER BY 1;
 extname | extversion
---------+------------
 plpgsql | 1.0
(1 row)
- OS/Kernel/Libc:
[root@postgres-source ~]# uname -a
Linux postgres-source 5.14.0-427.22.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Jun 19 04:14:38 PDT 2024 x86_64 x86_64 x86_64 GNU/Linux
[root@postgres-source ~]# cat /etc/os-release
NAME="Oracle Linux Server"
VERSION="9.3"
ID="ol"
ID_LIKE="fedora"
VARIANT="Server"
VARIANT_ID="server"
VERSION_ID="9.3"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Oracle Linux Server 9.3"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:oracle:linux:9:3:server"
HOME_URL="https://linux.oracle.com/"
BUG_REPORT_URL="https://github.com/oracle/oracle-linux"
ORACLE_BUGZILLA_PRODUCT="Oracle Linux 9"
ORACLE_BUGZILLA_PRODUCT_VERSION=9.3
ORACLE_SUPPORT_PRODUCT="Oracle Linux"
ORACLE_SUPPORT_PRODUCT_VERSION=9.3
[root@postgres-source ~]# ldd --version
ldd (GNU libc) 2.34
Copyright (C) 2021 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
Written by Roland McGrath and Ulrich Drepper.
Backtrace is in the attachment.
The issue is reproducible on at least PostgreSQL 16, 17, and 18.
Please let me know if I should provide any other information.
—
BR Dmitry
Attachments
On Fri, 17 Oct 2025 at 10:53, Dmitry Fomin <fomin.list@gmail.com> wrote:
> -- Crash repro: cascaded UPDATE across partitions
> UPDATE t.pipelines
> SET partition_id = 50
> WHERE partition_id = 102 AND id = 1;
>
> server closed the connection unexpectedly

Thanks for the detailed report. It seems to have been caused by
ba9a7e3921. For some reason the ResultRelInfo.ri_RootResultRelInfo
isn't set for this partition, which causes ExecGetChildToRootMap() to
think no translation is required.

More to come...

David
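For context: ExecGetChildToRootMap() decides whether any child-to-root
tuple translation is needed purely from ri_RootResultRelInfo. A
paraphrased sketch of that logic follows (the real function lives in
src/backend/executor/execUtils.c and may differ in detail):

/*
 * Paraphrased sketch of ExecGetChildToRootMap(), not the verbatim source.
 * Returns the map for converting a child partition's tuples into the
 * layout of its root partitioned table, or NULL if none is needed.
 */
TupleConversionMap *
ExecGetChildToRootMap(ResultRelInfo *resultRelInfo)
{
    /* Build the map lazily and cache it in the ResultRelInfo. */
    if (!resultRelInfo->ri_ChildToRootMapValid)
    {
        ResultRelInfo *rootRelInfo = resultRelInfo->ri_RootResultRelInfo;

        if (rootRelInfo)
            resultRelInfo->ri_ChildToRootMap =
                convert_tuples_by_name(RelationGetDescr(resultRelInfo->ri_RelationDesc),
                                       RelationGetDescr(rootRelInfo->ri_RelationDesc));
        else
            /*
             * No root recorded: the rel is assumed not to be a child, so
             * no translation is attempted. A cached-but-rootless entry
             * for a partition with attnum drift therefore hands an
             * untranslated slot to code expecting the root's layout.
             */
            resultRelInfo->ri_ChildToRootMap = NULL;

        resultRelInfo->ri_ChildToRootMapValid = true;
    }
    return resultRelInfo->ri_ChildToRootMap;
}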
On Fri, 17 Oct 2025 at 14:21, David Rowley <dgrowleyml@gmail.com> wrote:
> Thanks for the detailed report. It seems to have been caused by
> ba9a7e3921. For some reason the ResultRelInfo.ri_RootResultRelInfo
> isn't set for this partition, which causes ExecGetChildToRootMap() to
> think no translation is required.

The problem is with the ResultRelInfo caching that's in
ExecGetTriggerResultRel(). The code there tries looking into the
estate's es_opened_result_relations, es_tuple_routing_result_relations
and es_trig_target_relations Lists to see if the ResultRelInfo was
created before. In the problem case the ResultRelInfo is found in
es_trig_target_relations, and unfortunately it's been set up by some
code in afterTriggerInvokeEvents() which passes a NULL rootRelInfo.
This means that when ExecGetTriggerResultRel() is called again, this
time passing the correct rootRelInfo, the cached one that has the NULL
ri_RootResultRelInfo is found and returned.

This results in the ExecGetChildToRootMap() code doing the wrong thing
because it sees a NULL ri_RootResultRelInfo and therefore does not
translate the slot into the format of the partitioned table.

I've attached a patch which fixes the problem. I'm just not sure it's
the right fix. I suspect the real problem is down to the fact that
ExecGetTriggerResultRel() is passed a NULL rootRelInfo in the first
place. I just don't see a good way to figure out what the parent table
should be so we know to create a parent ResultRelInfo, as the trigger
that is firing is for the partition, not the partitioned table. I
don't see any way to figure out that the trigger is being fired
because it's cascading an update of its parent partitioned table...
At a guess, it feels like there might be some fields missing in
AfterTriggerShared to figure this out.

Any thoughts on this, Amit?

David
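For context: the rootless entry is created because the after-trigger
queue records only the target relation's OID, so there is no parent
ResultRelInfo available at that point. A paraphrased sketch of the
lookup in afterTriggerInvokeEvents() (src/backend/commands/trigger.c;
details may differ):

/*
 * Paraphrased sketch, not the verbatim source. The queued event names
 * the target relation only by OID (evtshared->ats_relid), so the
 * lookup necessarily passes a NULL rootRelInfo.
 */
if (rInfo == NULL ||
    RelationGetRelid(rInfo->ri_RelationDesc) != evtshared->ats_relid)
{
    /*
     * With rootRelInfo == NULL, a cache miss builds a "rootless"
     * ResultRelInfo and stores it in es_trig_target_relations, where a
     * later call passing a non-NULL rootRelInfo can then find it.
     */
    rInfo = ExecGetTriggerResultRel(estate, evtshared->ats_relid, NULL);
}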
Attachments
On Fri, Oct 17, 2025 at 6:08 PM David Rowley <dgrowleyml@gmail.com> wrote:
> On Fri, 17 Oct 2025 at 14:21, David Rowley <dgrowleyml@gmail.com> wrote:
> > Thanks for the detailed report. It seems to have been caused by
> > ba9a7e3921. For some reason the ResultRelInfo.ri_RootResultRelInfo
> > isn't set for this partition, which causes ExecGetChildToRootMap() to
> > think no translation is required.
>
> The problem is with the ResultRelInfo caching that's in
> ExecGetTriggerResultRel(). The code there tries looking into the
> estate's es_opened_result_relations, es_tuple_routing_result_relations
> and es_trig_target_relations Lists to see if the ResultRelInfo was
> created before. In the problem case the ResultRelInfo is found in
> es_trig_target_relations, and unfortunately it's been set up by some
> code in afterTriggerInvokeEvents() which passes a NULL rootRelInfo.
> This means that when ExecGetTriggerResultRel() is called again, this
> time passing the correct rootRelInfo, the cached one that has the NULL
> ri_RootResultRelInfo is found and returned.
>
> This results in the ExecGetChildToRootMap() code doing the wrong thing
> because it sees a NULL ri_RootResultRelInfo and therefore does not
> translate the slot into the format of the partitioned table.
>
> I've attached a patch which fixes the problem. I'm just not sure it's
> the right fix. I suspect the real problem is down to the fact that
> ExecGetTriggerResultRel() is passed a NULL rootRelInfo in the first
> place. I just don't see a good way to figure out what the parent table
> should be so we know to create a parent ResultRelInfo, as the trigger
> that is firing is for the partition, not the partitioned table. I
> don't see any way to figure out that the trigger is being fired
> because it's cascading an update of its parent partitioned table...
> At a guess, it feels like there might be some fields missing in
> AfterTriggerShared to figure this out.
>
> Any thoughts on this, Amit?

Thanks for the off-list heads-up, David.

I'll need to play around with Dmitry's test case a bit before I can be
sure, but I agree with your suspicion -- something's not right if a
result rel without a root rel set ends up in a path where tuple
conversion matters.

--
Thanks, Amit Langote
On Fri, Oct 17, 2025 at 7:36 PM Amit Langote <amitlangote09@gmail.com> wrote:
> On Fri, Oct 17, 2025 at 6:08 PM David Rowley <dgrowleyml@gmail.com> wrote:
> > On Fri, 17 Oct 2025 at 14:21, David Rowley <dgrowleyml@gmail.com> wrote:
> > > Thanks for the detailed report. It seems to have been caused by
> > > ba9a7e3921. For some reason the ResultRelInfo.ri_RootResultRelInfo
> > > isn't set for this partition, which causes ExecGetChildToRootMap()
> > > to think no translation is required.
> >
> > The problem is with the ResultRelInfo caching that's in
> > ExecGetTriggerResultRel(). The code there tries looking into the
> > estate's es_opened_result_relations,
> > es_tuple_routing_result_relations and es_trig_target_relations Lists
> > to see if the ResultRelInfo was created before. In the problem case
> > the ResultRelInfo is found in es_trig_target_relations, and
> > unfortunately it's been set up by some code in
> > afterTriggerInvokeEvents() which passes a NULL rootRelInfo. This
> > means that when ExecGetTriggerResultRel() is called again, this time
> > passing the correct rootRelInfo, the cached one that has the NULL
> > ri_RootResultRelInfo is found and returned.
> >
> > This results in the ExecGetChildToRootMap() code doing the wrong
> > thing because it sees a NULL ri_RootResultRelInfo and therefore does
> > not translate the slot into the format of the partitioned table.
> >
> > I've attached a patch which fixes the problem. I'm just not sure it's
> > the right fix. I suspect the real problem is down to the fact that
> > ExecGetTriggerResultRel() is passed a NULL rootRelInfo in the first
> > place. I just don't see a good way to figure out what the parent
> > table should be so we know to create a parent ResultRelInfo, as the
> > trigger that is firing is for the partition, not the partitioned
> > table. I don't see any way to figure out that the trigger is being
> > fired because it's cascading an update of its parent partitioned
> > table... At a guess, it feels like there might be some fields missing
> > in AfterTriggerShared to figure this out.
> >
> > Any thoughts on this, Amit?
>
> Thanks for the off-list heads-up, David.
>
> I'll need to play around with Dmitry's test case a bit before I can be
> sure, but I agree with your suspicion -- something's not right if a
> result rel without a root rel set ends up in a path where tuple
> conversion matters.

What seems to be happening is that trigger firings of the FK child
tables run under the parent table’s UPDATE EState, because the child
table’s trigger events are queued under SPI in ri_triggers.c. That code
doesn’t execute the child’s triggers directly -- it defers them to run
later under the parent’s context. As a result, the EState used at
execution time lacks child ResultRelInfos with a root rel set, and new
ones end up being created and added to es_trig_target_relations.

In this case, the first event that creates a rootless entry in
es_trig_target_relations is the INSERT side of the cross-partition
update on stages, fired for its FK into pipelines. The later UPDATE
event is for builds, whose FK references stages. When that UPDATE runs,
ExecGetTriggerResultRel() looks up the destination result rel for
stages and finds the rootless one that was built for the earlier
INSERT. Because its ri_RootResultRelInfo is NULL,
ExecGetChildToRootMap() skips tuple translation and crashes.

That seems to explain the “something off” we both suspected.

For the fix, I tried a slightly different approach from your patch.
Instead of updating the cached entry to always set
ri_RootResultRelInfo = rootRelInfo on a cache hit, I made
ExecGetTriggerResultRel() avoid reusing a rootless ResultRelInfo when a
rooted one is expected. Concretely, I added assertions on the two
primary lists and tightened the es_trig_target_relations lookup to only
return an entry when ri_RootResultRelInfo == rootRelInfo or the
caller’s rootRelInfo is NULL. That prevents the rootless destination
built for the INSERT on stages -> pipelines from being reused for the
later UPDATE from builds -> stages.

This seems a bit safer since it keeps cached entries immutable and
surfaces mismatches via asserts rather than mutating shared state.

Thoughts?

--
Thanks, Amit Langote
Attachments
On Tue, 21 Oct 2025 at 02:48, Amit Langote <amitlangote09@gmail.com> wrote:
> For the fix, I tried a slightly different approach from your patch.
> Instead of updating the cached entry to always set
> ri_RootResultRelInfo = rootRelInfo on a cache hit, I made
> ExecGetTriggerResultRel() avoid reusing a rootless ResultRelInfo when
> a rooted one is expected. Concretely, I added assertions on the two
> primary lists and tightened the es_trig_target_relations lookup to
> only return an entry when ri_RootResultRelInfo == rootRelInfo or the
> caller’s rootRelInfo is NULL. That prevents the rootless destination
> built for the INSERT on stages -> pipelines from being reused for the
> later UPDATE from builds -> stages.
 foreach(l, estate->es_trig_target_relations)
 {
     rInfo = (ResultRelInfo *) lfirst(l);
-    if (RelationGetRelid(rInfo->ri_RelationDesc) == relid)
+    if (RelationGetRelid(rInfo->ri_RelationDesc) == relid &&
+        (rInfo->ri_RootResultRelInfo == rootRelInfo ||
+         rootRelInfo == NULL))
I don't understand the || rootRelInfo == NULL part. This would break
if we first created the ResultRelInfo with a parent then asked for
another one with no parent.
> This seems a bit safer since it keeps cached entries immutable and
> surfaces mismatches via asserts rather than mutating shared state.
I think creating separate ResultRelInfos is probably a safer bet.
That's what I ended up doing in [1] after posting my initial patch
(which was intended to highlight the problem area.)
David
[1] https://github.com/postgres/postgres/compare/master...david-rowley:postgres:fix_resultrelinfo_caching
On Tue, Oct 21, 2025 at 4:28 AM David Rowley <dgrowleyml@gmail.com> wrote:
> On Tue, 21 Oct 2025 at 02:48, Amit Langote <amitlangote09@gmail.com> wrote:
> > For the fix, I tried a slightly different approach from your patch.
> > Instead of updating the cached entry to always set
> > ri_RootResultRelInfo = rootRelInfo on a cache hit, I made
> > ExecGetTriggerResultRel() avoid reusing a rootless ResultRelInfo when
> > a rooted one is expected. Concretely, I added assertions on the two
> > primary lists and tightened the es_trig_target_relations lookup to
> > only return an entry when ri_RootResultRelInfo == rootRelInfo or the
> > caller’s rootRelInfo is NULL. That prevents the rootless destination
> > built for the INSERT on stages -> pipelines from being reused for the
> > later UPDATE from builds -> stages.
>
> foreach(l, estate->es_trig_target_relations)
> {
> rInfo = (ResultRelInfo *) lfirst(l);
> - if (RelationGetRelid(rInfo->ri_RelationDesc) == relid)
> + if (RelationGetRelid(rInfo->ri_RelationDesc) == relid &&
> + (rInfo->ri_RootResultRelInfo == rootRelInfo ||
> + rootRelInfo == NULL))
>
> I don't understand the || rootRelInfo == NULL part. This would break
> if we first created the ResultRelInfo with a parent then asked for
> another one with no parent.
You are right, the || rootRelInfo == NULL check in
es_trig_target_relations was too permissive; it could indeed return a
rooted entry when the caller explicitly passed NULL. I’ve updated the
patch so that es_trig_target_relations now requires strict equality of
ri_RootResultRelInfo.
For es_opened_result_relations and es_tuple_routing_result_relations,
I kept the looser asserts. These lookups can be made with a NULL
rootRelInfo, as in the initial call within afterTriggerInvokeEvents()
that isn’t assuming a child context. In that case, returning a rooted
entry is harmless since it’s only used to fetch trigger metadata, not
for tuple translation. I didn’t turn those asserts into runtime
conditions, because doing so would prevent legitimate reuse in those
safe contexts.
> > This seems a bit safer since it keeps cached entries immutable and
> > surfaces mismatches via asserts rather than mutating shared state.
>
> I think creating separate ResultRelInfos is probably a safer bet.
> That's what I ended up doing in [1] after posting my initial patch
> (which was intended to highlight the problem area.)
>
> [1] https://github.com/postgres/postgres/compare/master...david-rowley:postgres:fix_resultrelinfo_caching
Ah, good to know. I borrowed a bit from your comment in that commit
to update the function header, so it now briefly explains the
rootRelInfo matching rules and the distinction between the lists.
Updated patch attached.
--
Thanks, Amit Langote
Attachments
On Tue, 21 Oct 2025 at 14:28, Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Tue, Oct 21, 2025 at 4:28 AM David Rowley <dgrowleyml@gmail.com> wrote:
> > I don't understand the || rootRelInfo == NULL part. This would break
> > if we first created the ResultRelInfo with a parent then asked for
> > another one with no parent.
>
> You are right, the || rootRelInfo == NULL check in
> es_trig_target_relations was too permissive; it could indeed return a
> rooted entry when the caller explicitly passed NULL. I’ve updated the
> patch so that es_trig_target_relations now requires strict equality of
> ri_RootResultRelInfo.

Now I don't understand why you left the Asserts with the rootRelInfo
== NULL check. If the ri_RootResultRelInfo changes compared to the
cached one in the es_opened_result_relations or
es_tuple_routing_result_relations then we're susceptible to the very
same bug we're aiming to fix here.

As far as I see it, there's two options:

1) Use Assert(rInfo->ri_RootResultRelInfo == rootRelInfo) for the
es_opened_result_relations and es_tuple_routing_result_relations; or
2) Treat all 3 lists the same and put "&& rInfo->ri_RootResultRelInfo
== rootRelInfo" into the "if" check.

I was a bit uncertain if #1 could be done, as I wasn't sure if the
ResultRelInfo we need for a partition movement originating from a
trigger could already be in the es_opened_result_relations list, but I
played around for a bit and came up with the following, which will
Assert-fail if we add the Asserts for #1:

begin;
create table pt (a int primary key,
                 b int references pt (a) on update cascade on delete cascade)
  partition by list(a);
create table pt1 partition of pt for values in(1);
create table pt2 partition of pt for values in(2);
insert into pt values(1,null);
update pt set b = 1;
update pt set b = 2;
rollback;

I'm pretty certain that #2 is the correct answer. I've attached a v4
patch which does #2, adds tests and fixes the outdated header comment
in ExecGetTriggerResultRel().

David
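To make option #2 concrete, here is the shape of the guard applied
uniformly to the caches in ExecGetTriggerResultRel() -- a sketch
paraphrased from the v4 patch, not the committed source:

/*
 * Sketch of option #2: a cached entry is only reused on an exact
 * (relid, rootRelInfo) match; otherwise a new ResultRelInfo is built.
 */
foreach(l, estate->es_opened_result_relations)
{
    rInfo = (ResultRelInfo *) lfirst(l);
    if (RelationGetRelid(rInfo->ri_RelationDesc) == relid &&
        rInfo->ri_RootResultRelInfo == rootRelInfo)
        return rInfo;
}
/*
 * ... with the same added "rInfo->ri_RootResultRelInfo == rootRelInfo"
 * condition in the loops over es_tuple_routing_result_relations and
 * es_trig_target_relations, falling through to create a fresh
 * ResultRelInfo when nothing matches.
 */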
Attachments
On Tue, Oct 21, 2025 at 2:20 PM David Rowley <dgrowleyml@gmail.com> wrote:
> On Tue, 21 Oct 2025 at 14:28, Amit Langote <amitlangote09@gmail.com> wrote:
> >
> > On Tue, Oct 21, 2025 at 4:28 AM David Rowley <dgrowleyml@gmail.com> wrote:
> > > I don't understand the || rootRelInfo == NULL part. This would
> > > break if we first created the ResultRelInfo with a parent then
> > > asked for another one with no parent.
> >
> > You are right, the || rootRelInfo == NULL check in
> > es_trig_target_relations was too permissive; it could indeed return
> > a rooted entry when the caller explicitly passed NULL. I’ve updated
> > the patch so that es_trig_target_relations now requires strict
> > equality of ri_RootResultRelInfo.
>
> Now I don't understand why you left the Asserts with the rootRelInfo
> == NULL check. If the ri_RootResultRelInfo changes compared to the
> cached one in the es_opened_result_relations or
> es_tuple_routing_result_relations then we're susceptible to the very
> same bug we're aiming to fix here.

Hmm, I had assumed non-NULL mismatches could not occur within one
EState, but with multiple ModifyTableStates (e.g., a CTE update plus
the main update) the same child relid can be opened under different
parent ResultRelInfos. For example:

WITH cte AS (UPDATE parted_cp_update_bug_2_p1 SET a = a)
UPDATE parted_cp_update_bug_1 SET a = 2 WHERE a = 1;

In that case a cached entry’s ri_RootResultRelInfo can differ from the
rootRelInfo passed in, so returning a relid match is unsafe (crashes
under v3!). I agree the assert is not appropriate even ignoring the
NULL clause. IOW, I agree with folding ri_RootResultRelInfo ==
rootRelInfo into the condition across all three lists (your #2).

> I've attached a v4 patch which does #2, adds tests and fixes the
> outdated header comment in ExecGetTriggerResultRel().

Not sure if you missed it, but I had added tests in v2/v3 mirroring
Dmitry's case.

In the attached v5, I’ve updated the test case to use the no-op CTE I
mentioned above, trimmed the test case code a bit, and modified it to
use a new schema like the surrounding tests. I also updated the comment
in ExecGetTriggerResultRel() as you did, and moved the additional-check
explanation into the header comment.

--
Thanks, Amit Langote
Attachments
On Tue, 21 Oct 2025 at 22:12, Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Tue, Oct 21, 2025 at 2:20 PM David Rowley <dgrowleyml@gmail.com> wrote:
> > I've attached a v4 patch which does #2, adds tests and fixes the
> > outdated header comment in ExecGetTriggerResultRel().
>
> Not sure if you missed it, but I had added tests in v2/v3 mirroring
> Dmitry's case.

I did miss that. You didn't mention it in the email and I didn't look
down far enough to see them.

> In the attached v5, I’ve updated the test case to use the no-op CTE I
> mentioned above, trimmed the test case code a bit, and modified it to
> use a new schema like the surrounding tests. I also updated the
> comment in ExecGetTriggerResultRel() as you did, and moved the
> additional-check explanation into the header comment.

Thanks. The CTE test seems worthwhile since it can reuse the same
tables and results in using the es_opened_result_relations List. I've
included it. I also tidied up the tests a bit more. I looked at your
v5 test case and saw it hadn't gone through the same simplification
process as I put my version through (making tables non-partitioned
when they don't need to be to trigger the issue), so what's in v6 is
my version.

I've attached v6. If you have other proposed changes, would you be
able to send them as a diff based on my patch? We seem to have got
some split-brain and the patches have diverged a little and that
resulted in some of my v4 changes being lost.

I also checked to see if there were any changes around what's shown in
the Triggers portion of EXPLAIN ANALYZE for Dmitry's query. Both
patched and unpatched EXPLAIN ANALYZE are the same. I'm not yet clear
on if it's possible something could be different there now for some
other cases that cause additional ResultRelInfos to be added to the
lists.

David
Attachments
On Tue, Oct 21, 2025 at 8:22 PM David Rowley <dgrowleyml@gmail.com> wrote:
> On Tue, 21 Oct 2025 at 22:12, Amit Langote <amitlangote09@gmail.com> wrote:
> >
> > On Tue, Oct 21, 2025 at 2:20 PM David Rowley <dgrowleyml@gmail.com> wrote:
> > > I've attached a v4 patch which does #2, adds tests and fixes the
> > > outdated header comment in ExecGetTriggerResultRel().
> >
> > Not sure if you missed it, but I had added tests in v2/v3 mirroring
> > Dmitry's case.
>
> I did miss that. You didn't mention it in the email and I didn't look
> down far enough to see them.
Ok, sorry about that.
> > In the attached v5, I’ve updated the test case to use the no-op CTE I
> > mentioned above, trimmed the test case code a bit, and modified it to
> > use a new schema like the surrounding tests. I also updated the
> > comment in ExecGetTriggerResultRel() as you did, and moved the
> > additional-check explanation into the header comment.
>
> Thanks. The CTE test seems worthwhile since it can reuse the same
> tables and results in using the es_opened_result_relations List. I've
> included it. I also tidied up the tests a bit more. I looked at your
> v5 test case and saw it hadn't gone through the same simplification
> process as I put my version through (making tables non-partitioned
> when they don't need to be to trigger the issue), so what's in v6 is
> my version.
Ah, yes, the 3rd table doesn't need to be partitioned.
> I've attached v6. If you have other proposed changes, would you be
> able to send them as a diff based on my patch? We seem to have got
> some split-brain
Sorry about that.
> and the patches have diverged a little and that
> resulted in some of my v4 changes being lost.
Besides test suite being entirely different in my v5, the only change
from your v4 was rewording and moving the new comment in
ExecGetTriggerResultRel() into its header comment:
@@ -1337,6 +1337,12 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo,
  * also provides a way for EXPLAIN ANALYZE to report the runtimes of such
  * triggers.) So we make additional ResultRelInfo's as needed, and save them
  * in es_trig_target_relations.
+ *
+ * When reusing cached entries from any of these lists, require an exact
+ * match of ri_RootResultRelInfo and the rootRelInfo passed by the caller.
+ * This avoids returning an entry created for a different parent context
+ * and ensures child->parent translation is neither skipped nor applied
+ * incorrectly. If no exact match exists, build a new ResultRelInfo.
  */
 ResultRelInfo *
 ExecGetTriggerResultRel(EState *estate, Oid relid,
@@ -1347,14 +1353,6 @@ ExecGetTriggerResultRel(EState *estate, Oid relid,
     Relation rel;
     MemoryContext oldcontext;

-    /*
-     * Before creating a new ResultRelInfo, check if we've already made and
-     * cached one for this relation. We must ensure that the given
-     * 'rootRelInfo' matches the one stored in the cached ResultRelInfo as
-     * trigger handling for partitions can result in mixed requirements for
-     * what ri_RootResultRelInfo is set to.
-     */
-
Please take it if you think that's a good change. That said, I don't
have any issue with your v6, code or tests.
> I also checked to see if there were any changes around what's shown in
> the Triggers portion of EXPLAIN ANALYZE for Dmitry's query. Both
> patched and unpatched EXPLAIN ANALYZE are the same. I'm not yet clear
> on if it's possible something could be different there now for some
> other cases that cause additional ResultRelInfos to be added to the
> lists.
Good point.
I made report_triggers() print which list each ResultRelInfo came from
(with the attached). It looks like the entries that could have been
affected are always in es_trig_target_relations anyway, so the change
in ExecGetTriggerResultRel() doesn’t alter what EXPLAIN shows under
“Triggers.” I was still a bit surprised, since I expected there might
now be an extra entry after we stopped reusing a child entry with a
mismatching ri_RootResultRelInfo.
I do see child relations listed under “Triggers,” but only when they
have trigger events of their own, not when they’re processed as child
relations of a root after a CP_UPDATE event. In those cases,
afterTriggerInvokeEvents() only passes the instrumentation for the
root rel to AfterTriggerExecute(), so the child entries don’t have
their ri_TrigInstrument updated and don’t appear in the EXPLAIN
output.
--
Thanks, Amit Langote
Attachments
On Wed, 22 Oct 2025 at 02:49, Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Tue, Oct 21, 2025 at 8:22 PM David Rowley <dgrowleyml@gmail.com> wrote:
> > On Tue, 21 Oct 2025 at 22:12, Amit Langote <amitlangote09@gmail.com> wrote:
> > >
> > > On Tue, Oct 21, 2025 at 2:20 PM David Rowley <dgrowleyml@gmail.com> wrote:
> > > > I've attached a v4 patch which does #2, adds tests and fixes the
> > > > outdated header comment in ExecGetTriggerResultRel().
> > >
> > > Not sure if you missed it, but I had added tests in v2/v3 mirroring
> > > Dmitry's case.
> >
> > I did miss that. You didn't mention it in the email and I didn't look
> > down far enough to see them.
>
> Ok, sorry about that.
>
> > > In the attached v5, I’ve updated the test case to use the no-op CTE
> > > I mentioned above, trimmed the test case code a bit, and modified
> > > it to use a new schema like the surrounding tests. I also updated
> > > the comment in ExecGetTriggerResultRel() as you did, and moved the
> > > additional-check explanation into the header comment.
> >
> > Thanks. The CTE test seems worthwhile since it can reuse the same
> > tables and results in using the es_opened_result_relations List. I've
> > included it. I also tidied up the tests a bit more. I looked at your
> > v5 test case and saw it hadn't gone through the same simplification
> > process as I put my version through (making tables non-partitioned
> > when they don't need to be to trigger the issue), so what's in v6 is
> > my version.
>
> Ah, yes, the 3rd table doesn't need to be partitioned.
>
> > I've attached v6. If you have other proposed changes, would you be
> > able to send them as a diff based on my patch? We seem to have got
> > some split-brain
>
> Sorry about that.
>
> > and the patches have diverged a little and that
> > resulted in some of my v4 changes being lost.
>
> Besides test suite being entirely different in my v5, the only change
> from your v4 was rewording and moving the new comment in
> ExecGetTriggerResultRel() into its header comment:
>
> @@ -1337,6 +1337,12 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo,
>   * also provides a way for EXPLAIN ANALYZE to report the runtimes of such
>   * triggers.) So we make additional ResultRelInfo's as needed, and save them
>   * in es_trig_target_relations.
> + *
> + * When reusing cached entries from any of these lists, require an exact
> + * match of ri_RootResultRelInfo and the rootRelInfo passed by the caller.
> + * This avoids returning an entry created for a different parent context
> + * and ensures child->parent translation is neither skipped nor applied
> + * incorrectly. If no exact match exists, build a new ResultRelInfo.
>   */
>  ResultRelInfo *
>  ExecGetTriggerResultRel(EState *estate, Oid relid,
> @@ -1347,14 +1353,6 @@ ExecGetTriggerResultRel(EState *estate, Oid relid,
>      Relation rel;
>      MemoryContext oldcontext;
>
> -    /*
> -     * Before creating a new ResultRelInfo, check if we've already made and
> -     * cached one for this relation. We must ensure that the given
> -     * 'rootRelInfo' matches the one stored in the cached ResultRelInfo as
> -     * trigger handling for partitions can result in mixed requirements for
> -     * what ri_RootResultRelInfo is set to.
> -     */
> -
>
> Please take it if you think that's a good change. That said, I don't
> have any issue with your v6, code or tests.

The reason I like the comment being within the function body is that I
didn't see any need to tell the caller to care about this. As far as
the caller should be concerned, the function should return a correctly
set up ResultRelInfo. The inner details of how the caching works feel
like implementation details, and therefore comments about that feel
better placed inside the function body.

> > I also checked to see if there were any changes around what's shown
> > in the Triggers portion of EXPLAIN ANALYZE for Dmitry's query. Both
> > patched and unpatched EXPLAIN ANALYZE are the same. I'm not yet clear
> > on if it's possible something could be different there now for some
> > other cases that cause additional ResultRelInfos to be added to the
> > lists.
>
> Good point.
>
> I made report_triggers() print which list each ResultRelInfo came from
> (with the attached). It looks like the entries that could have been
> affected are always in es_trig_target_relations anyway, so the change
> in ExecGetTriggerResultRel() doesn’t alter what EXPLAIN shows under
> “Triggers.” I was still a bit surprised, since I expected there might
> now be an extra entry after we stopped reusing a child entry with a
> mismatching ri_RootResultRelInfo.
>
> I do see child relations listed under “Triggers,” but only when they
> have trigger events of their own, not when they’re processed as child
> relations of a root after a CP_UPDATE event. In those cases,
> afterTriggerInvokeEvents() only passes the instrumentation for the
> root rel to AfterTriggerExecute(), so the child entries don’t have
> their ri_TrigInstrument updated and don’t appear in the EXPLAIN
> output.

That's good to know, so it sounds like we're safe from listing any
additional ResultRelInfos there.

I'm happy to go with v6.

David
On Wed, Oct 22, 2025 at 5:29 AM David Rowley <dgrowleyml@gmail.com> wrote:
> On Wed, 22 Oct 2025 at 02:49, Amit Langote <amitlangote09@gmail.com> wrote:
> > Besides test suite being entirely different in my v5, the only change
> > from your v4 was rewording and moving the new comment in
> > ExecGetTriggerResultRel() into its header comment:
> >
> > @@ -1337,6 +1337,12 @@ InitResultRelInfo(ResultRelInfo *resultRelInfo,
> >   * also provides a way for EXPLAIN ANALYZE to report the runtimes of such
> >   * triggers.) So we make additional ResultRelInfo's as needed, and save them
> >   * in es_trig_target_relations.
> > + *
> > + * When reusing cached entries from any of these lists, require an exact
> > + * match of ri_RootResultRelInfo and the rootRelInfo passed by the caller.
> > + * This avoids returning an entry created for a different parent context
> > + * and ensures child->parent translation is neither skipped nor applied
> > + * incorrectly. If no exact match exists, build a new ResultRelInfo.
> >   */
> >  ResultRelInfo *
> >  ExecGetTriggerResultRel(EState *estate, Oid relid,
> > @@ -1347,14 +1353,6 @@ ExecGetTriggerResultRel(EState *estate, Oid relid,
> >      Relation rel;
> >      MemoryContext oldcontext;
> >
> > -    /*
> > -     * Before creating a new ResultRelInfo, check if we've already made and
> > -     * cached one for this relation. We must ensure that the given
> > -     * 'rootRelInfo' matches the one stored in the cached ResultRelInfo as
> > -     * trigger handling for partitions can result in mixed requirements for
> > -     * what ri_RootResultRelInfo is set to.
> > -     */
> > -
> >
> > Please take it if you think that's a good change. That said, I don't
> > have any issue with your v6, code or tests.
>
> The reason I like the comment being within the function body is that I
> didn't see any need to tell the caller to care about this. As far as
> the caller should be concerned, the function should return a correctly
> set up ResultRelInfo. The inner details of how the caching works feel
> like implementation details, and therefore comments about that feel
> better placed inside the function body.

Makes sense.

> > > I also checked to see if there were any changes around what's shown
> > > in the Triggers portion of EXPLAIN ANALYZE for Dmitry's query. Both
> > > patched and unpatched EXPLAIN ANALYZE are the same. I'm not yet
> > > clear on if it's possible something could be different there now
> > > for some other cases that cause additional ResultRelInfos to be
> > > added to the lists.
> >
> > Good point.
> >
> > I made report_triggers() print which list each ResultRelInfo came
> > from (with the attached). It looks like the entries that could have
> > been affected are always in es_trig_target_relations anyway, so the
> > change in ExecGetTriggerResultRel() doesn’t alter what EXPLAIN shows
> > under “Triggers.” I was still a bit surprised, since I expected there
> > might now be an extra entry after we stopped reusing a child entry
> > with a mismatching ri_RootResultRelInfo.
> >
> > I do see child relations listed under “Triggers,” but only when they
> > have trigger events of their own, not when they’re processed as child
> > relations of a root after a CP_UPDATE event. In those cases,
> > afterTriggerInvokeEvents() only passes the instrumentation for the
> > root rel to AfterTriggerExecute(), so the child entries don’t have
> > their ri_TrigInstrument updated and don’t appear in the EXPLAIN
> > output.
>
> That's good to know, so it sounds like we're safe from listing any
> additional ResultRelInfos there.

I think so.

> I'm happy to go with v6.

It looks good to me as well. I also thought about whether the test case
could be simplified further but didn’t come up with anything that still
reproduces the failure cleanly.

I then tried to understand why we end up with an INSERT event on
stages_50_like (queued due to the FK from stages into pipelines) during
the cascaded update on its parent, which creates the rootless
ResultRelInfo in es_trig_target_relations. That same entry was later
reused for the parent’s CP_UPDATE event (queued due to the FK from
builds into stages), leading to the crash. I was curious why we queue
that INSERT event instead of a CP_UPDATE on the parent, since the
latter would have avoided the problem. When I hacked the code to do
that, the crash disappeared -- but that’s not a valid fix, since
FK-side triggers can’t be used with partitioned tables; they assume
heap relations, so queuing CP_UPDATE events for FK-side triggers
doesn’t work in general.

Anyway, just thought to share that here for the record, since it helped
me understand how that rootless ResultRelInfo ends up being created in
the first place.

--
Thanks, Amit Langote
On Fri, 24 Oct 2025 at 01:08, Amit Langote <amitlangote09@gmail.com> wrote:
>
> On Wed, Oct 22, 2025 at 5:29 AM David Rowley <dgrowleyml@gmail.com> wrote:
> > I'm happy to go with v6.
>
> It looks good to me as well.

Thanks for checking. Pushed.

David
On Sun, Oct 26, 2025 at 7:04 AM David Rowley <dgrowleyml@gmail.com> wrote:
> On Fri, 24 Oct 2025 at 01:08, Amit Langote <amitlangote09@gmail.com> wrote:
> >
> > On Wed, Oct 22, 2025 at 5:29 AM David Rowley <dgrowleyml@gmail.com> wrote:
> > > I'm happy to go with v6.
> >
> > It looks good to me as well.
>
> Thanks for checking. Pushed.

Thanks David.

--
Thanks, Amit Langote