Re: Adding OLD/NEW support to RETURNING
От | Dean Rasheed |
---|---|
Тема | Re: Adding OLD/NEW support to RETURNING |
Дата | |
Msg-id | CAEZATCWDGAd0mQgtXjgnEJ_Hf0v3hoMpQty7=nn9CFG3fc1HRA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Adding OLD/NEW support to RETURNING (Dean Rasheed <dean.a.rasheed@gmail.com>) |
Список | pgsql-hackers |
On Mon, 18 Mar 2024 at 10:48, Dean Rasheed <dean.a.rasheed@gmail.com> wrote: > > Rebased version attached, on top of c649fa24a4 (MERGE ... RETURNING support). > I have been doing more testing of this and I realised that there was a problem -- the previous patch worked fine when updating a regular table, so that old/new.colname is just a Var, but when updating an auto-updatable view, "colname" could end up being replaced by an arbitrary expression. In the cases I had tested before, that appeared to work OK, but actually it wasn't right in all cases where the result should have been NULL, due to the old/new row being absent (e.g., the old row in an INSERT). After thinking about that for a while, the best solution seemed to be to add a new executable node, which I've called ReturningExpr. This evaluates the old/new expression if the old/new row exists, but skips it and returns NULL if the old/new row doesn't exist. The simplest example is a query like this, which now returns what I would expect: DROP TABLE IF EXISTS tt CASCADE; CREATE TABLE tt (a int PRIMARY KEY, b text); INSERT INTO tt VALUES (1, 'R1'); CREATE VIEW tv AS SELECT a, b, 'Const' c FROM tt; INSERT INTO tv VALUES (1, 'Row 1'), (2, 'Row 2') ON CONFLICT (a) DO UPDATE SET b = excluded.b RETURNING old.*, new.*; a | b | c | a | b | c ---+----+-------+---+-------+------- 1 | R1 | Const | 1 | Row 1 | Const | | | 2 | Row 2 | Const (2 rows) (Previously that was returning old.c = 'Const' in both rows, because the Const node has no old/new qualification.) In EXPLAIN, I opted to display this as "old/new.(expression)", to make it clear that the expression is being evaluated in the context of the old/new row, even if it doesn't directly refer to old/new values from the table. So, for example, the plan for the above query looks like this: QUERY PLAN -------------------------------------------------------------------------------- Insert on public.tt Output: old.a, old.b, old.('Const'::text), new.a, new.b, new.('Const'::text) Conflict Resolution: UPDATE Conflict Arbiter Indexes: tt_pkey -> Values Scan on "*VALUES*" Output: "*VALUES*".column1, "*VALUES*".column2 (It can't output "old.c" or "new.c" because all knowledge of the view column "c" is gone by the time it has been through the rewriter, and in any case, the details of the expression being evaluated are likely to be useful in general.) Things get more complicated when subqueries are involved. For example, given this view definition: CREATE VIEW tv AS SELECT a, b, (SELECT concat('b=',b)) c FROM tt; the INSERT above produces this: a | b | c | a | b | c ---+----+------+---+-------+--------- 1 | R1 | b=R1 | 1 | Row 1 | b=Row 1 | | | 2 | Row 2 | b=Row 2 (2 rows) which is as expected. This uses the following query plan: QUERY PLAN ---------------------------------------------------------------------------- Insert on public.tt Output: old.a, old.b, old.((SubPlan 1)), new.a, new.b, new.((SubPlan 2)) Conflict Resolution: UPDATE Conflict Arbiter Indexes: tt_pkey -> Values Scan on "*VALUES*" Output: "*VALUES*".column1, "*VALUES*".column2 SubPlan 1 -> Result Output: concat('b=', old.b) SubPlan 2 -> Result Output: concat('b=', new.b) In this case "b" in the view subquery becomes "old.b" in SubPlan 1 and "new.b" in SubPlan 2 (each with varlevelsup = 1, and therefore evaluated as input params to the subplans). The concat() result would normally always be non-NULL, but it (or rather the SubLink subquery containing it) is wrapped in a ReturningExpr. As a result, SubPlan 1 is skipped in the second row, for which old does not exist, and ends up only being executed once in that query, whereas SubPlan 2 is executed twice. Things get even more fiddly when the old/new expression itself appears in a subquery. For example, given the following query: INSERT INTO tv VALUES (1, 'Row 1'), (2, 'Row 2') ON CONFLICT (a) DO UPDATE SET b = excluded.b RETURNING old.a, old.b, (SELECT old.c), new.*; the result is the same, but the query plan is now QUERY PLAN ---------------------------------------------------------------------- Insert on public.tt Output: old.a, old.b, (SubPlan 2), new.a, new.b, new.((SubPlan 3)) Conflict Resolution: UPDATE Conflict Arbiter Indexes: tt_pkey -> Values Scan on "*VALUES*" Output: "*VALUES*".column1, "*VALUES*".column2 SubPlan 1 -> Result Output: concat('b=', old.b) SubPlan 2 -> Result Output: (old.((SubPlan 1))) SubPlan 3 -> Result Output: concat('b=', new.b) The ReturningExpr nodes belong to the query level containing the RETURNING list (hence they have a "levelsup" field, like Var, PlaceHolderVar, etc.). So in this example, one of the ReturningExpr nodes is in SubPlan 2, with "levelsup" = 1, wrapping SubPlan 1, i.e., it only executes SubPlan 1 if the old row exists. Although that all sounds quite complicated, all the individual pieces are quite simple. Attached is an updated patch in which I have also tidied up a few other things, but I haven't read your latest review comments yet. I'll respond to those separately. Regards, Dean
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Bertrand DrouvotДата:
Сообщение: Re: Introduce XID age and inactive timeout based replication slot invalidation
Следующее
От: Bertrand DrouvotДата:
Сообщение: Re: Introduce XID age and inactive timeout based replication slot invalidation