Re: Adding OLD/NEW support to RETURNING

Поиск
Список
Период
Сортировка
От Dean Rasheed
Тема Re: Adding OLD/NEW support to RETURNING
Дата
Msg-id CAEZATCWDGAd0mQgtXjgnEJ_Hf0v3hoMpQty7=nn9CFG3fc1HRA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Adding OLD/NEW support to RETURNING  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Список pgsql-hackers
On Mon, 18 Mar 2024 at 10:48, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
>
> Rebased version attached, on top of c649fa24a4 (MERGE ... RETURNING support).
>

I have been doing more testing of this and I realised that there was a
problem -- the previous patch worked fine when updating a regular
table, so that old/new.colname is just a Var, but when updating an
auto-updatable view, "colname" could end up being replaced by an
arbitrary expression. In the cases I had tested before, that appeared
to work OK, but actually it wasn't right in all cases where the result
should have been NULL, due to the old/new row being absent (e.g., the
old row in an INSERT).

After thinking about that for a while, the best solution seemed to be
to add a new executable node, which I've called ReturningExpr. This
evaluates the old/new expression if the old/new row exists, but skips
it and returns NULL if the old/new row doesn't exist. The simplest
example is a query like this, which now returns what I would expect:

DROP TABLE IF EXISTS tt CASCADE;
CREATE TABLE tt (a int PRIMARY KEY, b text);
INSERT INTO tt VALUES (1, 'R1');
CREATE VIEW tv AS SELECT a, b, 'Const' c FROM tt;

INSERT INTO tv VALUES (1, 'Row 1'), (2, 'Row 2')
  ON CONFLICT (a) DO UPDATE SET b = excluded.b
  RETURNING old.*, new.*;

 a | b  |   c   | a |   b   |   c
---+----+-------+---+-------+-------
 1 | R1 | Const | 1 | Row 1 | Const
   |    |       | 2 | Row 2 | Const
(2 rows)

(Previously that was returning old.c = 'Const' in both rows, because
the Const node has no old/new qualification.)

In EXPLAIN, I opted to display this as "old/new.(expression)", to make
it clear that the expression is being evaluated in the context of the
old/new row, even if it doesn't directly refer to old/new values from
the table. So, for example, the plan for the above query looks like
this:

                                   QUERY PLAN
--------------------------------------------------------------------------------
 Insert on public.tt
   Output: old.a, old.b, old.('Const'::text), new.a, new.b, new.('Const'::text)
   Conflict Resolution: UPDATE
   Conflict Arbiter Indexes: tt_pkey
   ->  Values Scan on "*VALUES*"
         Output: "*VALUES*".column1, "*VALUES*".column2

(It can't output "old.c" or "new.c" because all knowledge of the view
column "c" is gone by the time it has been through the rewriter, and
in any case, the details of the expression being evaluated are likely
to be useful in general.)

Things get more complicated when subqueries are involved. For example,
given this view definition:

CREATE VIEW tv AS SELECT a, b, (SELECT concat('b=',b)) c FROM tt;

the INSERT above produces this:

 a | b  |  c   | a |   b   |    c
---+----+------+---+-------+---------
 1 | R1 | b=R1 | 1 | Row 1 | b=Row 1
   |    |      | 2 | Row 2 | b=Row 2
(2 rows)

which is as expected. This uses the following query plan:

                                 QUERY PLAN
----------------------------------------------------------------------------
 Insert on public.tt
   Output: old.a, old.b, old.((SubPlan 1)), new.a, new.b, new.((SubPlan 2))
   Conflict Resolution: UPDATE
   Conflict Arbiter Indexes: tt_pkey
   ->  Values Scan on "*VALUES*"
         Output: "*VALUES*".column1, "*VALUES*".column2
   SubPlan 1
     ->  Result
           Output: concat('b=', old.b)
   SubPlan 2
     ->  Result
           Output: concat('b=', new.b)

In this case "b" in the view subquery becomes "old.b" in SubPlan 1 and
"new.b" in SubPlan 2 (each with varlevelsup = 1, and therefore
evaluated as input params to the subplans). The concat() result would
normally always be non-NULL, but it (or rather the SubLink subquery
containing it) is wrapped in a ReturningExpr. As a result, SubPlan 1
is skipped in the second row, for which old does not exist, and ends
up only being executed once in that query, whereas SubPlan 2 is
executed twice.

Things get even more fiddly when the old/new expression itself appears
in a subquery. For example, given the following query:

INSERT INTO tv VALUES (1, 'Row 1'), (2, 'Row 2')
  ON CONFLICT (a) DO UPDATE SET b = excluded.b
  RETURNING old.a, old.b, (SELECT old.c), new.*;

the result is the same, but the query plan is now

                              QUERY PLAN
----------------------------------------------------------------------
 Insert on public.tt
   Output: old.a, old.b, (SubPlan 2), new.a, new.b, new.((SubPlan 3))
   Conflict Resolution: UPDATE
   Conflict Arbiter Indexes: tt_pkey
   ->  Values Scan on "*VALUES*"
         Output: "*VALUES*".column1, "*VALUES*".column2
   SubPlan 1
     ->  Result
           Output: concat('b=', old.b)
   SubPlan 2
     ->  Result
           Output: (old.((SubPlan 1)))
   SubPlan 3
     ->  Result
           Output: concat('b=', new.b)

The ReturningExpr nodes belong to the query level containing the
RETURNING list (hence they have a "levelsup" field, like Var,
PlaceHolderVar, etc.). So in this example, one of the ReturningExpr
nodes is in SubPlan 2, with "levelsup" = 1, wrapping SubPlan 1, i.e.,
it only executes SubPlan 1 if the old row exists.

Although that all sounds quite complicated, all the individual pieces
are quite simple.

Attached is an updated patch in which I have also tidied up a few
other things, but I haven't read your latest review comments yet. I'll
respond to those separately.

Regards,
Dean

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Bertrand Drouvot
Дата:
Сообщение: Re: Introduce XID age and inactive timeout based replication slot invalidation
Следующее
От: Bertrand Drouvot
Дата:
Сообщение: Re: Introduce XID age and inactive timeout based replication slot invalidation