Обсуждение: MERGE bug report

Поиск
Список
Период
Сортировка

MERGE bug report

От
"Joe Wildish"
Дата:
Hello Hackers,

Reporting a bug with the new MERGE statement. Tested against 75edb919613ee835e7680e40137e494c7856bcf9.

psql output as follows:

...
psql:merge.sql:33: ERROR:  variable not found in subplan target lists
ROLLBACK
[local] joe@joe=# \errverbose
ERROR:  XX000: variable not found in subplan target lists
LOCATION:  fix_join_expr_mutator, setrefs.c:2800

Stack trace:

fix_join_expr_mutator setrefs.c:2800
expression_tree_mutator nodeFuncs.c:3348
fix_join_expr_mutator setrefs.c:2853
expression_tree_mutator nodeFuncs.c:2992
fix_join_expr_mutator setrefs.c:2853
expression_tree_mutator nodeFuncs.c:3348
fix_join_expr_mutator setrefs.c:2853
fix_join_expr setrefs.c:2753
set_plan_refs setrefs.c:1085
set_plan_references setrefs.c:315
standard_planner planner.c:498
planner planner.c:277
pg_plan_query postgres.c:883
pg_plan_queries postgres.c:975
exec_simple_query postgres.c:1169
PostgresMain postgres.c:4520
BackendRun postmaster.c:4593
BackendStartup postmaster.c:4321
ServerLoop postmaster.c:1801
PostmasterMain postmaster.c:1473
main main.c:202
__libc_start_main 0x00007fc4ccc0b1e2
_start 0x000000000048804e

Reproducer script:

BEGIN;
DROP TABLE IF EXISTS item, incoming, source CASCADE;

CREATE TABLE item
  (order_id    INTEGER NOT NULL,
   item_id     INTEGER NOT NULL,
   quantity    INTEGER NOT NULL,
   price       NUMERIC NOT NULL,
   CONSTRAINT pk_item PRIMARY KEY (order_id, item_id));

INSERT INTO item VALUES (100, 1, 4, 100.00), (100, 2, 9, 199.00);

CREATE TABLE incoming (order_id, item_id, quantity, price)
  AS (VALUES (100, 1, 4, 100.00), (100, 3, 1, 200.00));

CREATE TABLE source (order_id, item_id, quantity, price) AS
  (SELECT order_id, item_id, incoming.quantity, incoming.price
     FROM item LEFT JOIN incoming USING (order_id, item_id));

MERGE INTO item a
USING source b
   ON (a.order_id, a.item_id) =
      (b.order_id, b.item_id)
 WHEN NOT MATCHED
 THEN INSERT (order_id, item_id, quantity, price)
      VALUES (order_id, item_id, quantity, price)
 WHEN MATCHED
  AND a.* IS DISTINCT FROM b.*
 THEN UPDATE SET (quantity, price) = (b.quantity, b.price)
 WHEN MATCHED
  AND (b.quantity IS NULL AND b.price IS NULL)
 THEN DELETE;
COMMIT;

It seems related to the use of a.* and b.*

Sorry I can't be more specific. Error manifests when planning occurs and that is well outside of my code base
knowledge.

Hope this helps.

Cheers,
-Joe



Re: MERGE bug report

От
Zhihong Yu
Дата:


On Tue, Apr 5, 2022 at 3:18 PM Joe Wildish <joe@lateraljoin.com> wrote:
Hello Hackers,

Reporting a bug with the new MERGE statement. Tested against 75edb919613ee835e7680e40137e494c7856bcf9.

psql output as follows:

...
psql:merge.sql:33: ERROR:  variable not found in subplan target lists
ROLLBACK
[local] joe@joe=# \errverbose
ERROR:  XX000: variable not found in subplan target lists
LOCATION:  fix_join_expr_mutator, setrefs.c:2800

Stack trace:

fix_join_expr_mutator setrefs.c:2800
expression_tree_mutator nodeFuncs.c:3348
fix_join_expr_mutator setrefs.c:2853
expression_tree_mutator nodeFuncs.c:2992
fix_join_expr_mutator setrefs.c:2853
expression_tree_mutator nodeFuncs.c:3348
fix_join_expr_mutator setrefs.c:2853
fix_join_expr setrefs.c:2753
set_plan_refs setrefs.c:1085
set_plan_references setrefs.c:315
standard_planner planner.c:498
planner planner.c:277
pg_plan_query postgres.c:883
pg_plan_queries postgres.c:975
exec_simple_query postgres.c:1169
PostgresMain postgres.c:4520
BackendRun postmaster.c:4593
BackendStartup postmaster.c:4321
ServerLoop postmaster.c:1801
PostmasterMain postmaster.c:1473
main main.c:202
__libc_start_main 0x00007fc4ccc0b1e2
_start 0x000000000048804e

Reproducer script:

BEGIN;
DROP TABLE IF EXISTS item, incoming, source CASCADE;

CREATE TABLE item
  (order_id    INTEGER NOT NULL,
   item_id     INTEGER NOT NULL,
   quantity    INTEGER NOT NULL,
   price       NUMERIC NOT NULL,
   CONSTRAINT pk_item PRIMARY KEY (order_id, item_id));

INSERT INTO item VALUES (100, 1, 4, 100.00), (100, 2, 9, 199.00);

CREATE TABLE incoming (order_id, item_id, quantity, price)
  AS (VALUES (100, 1, 4, 100.00), (100, 3, 1, 200.00));

CREATE TABLE source (order_id, item_id, quantity, price) AS
  (SELECT order_id, item_id, incoming.quantity, incoming.price
     FROM item LEFT JOIN incoming USING (order_id, item_id));

MERGE INTO item a
USING source b
   ON (a.order_id, a.item_id) =
      (b.order_id, b.item_id)
 WHEN NOT MATCHED
 THEN INSERT (order_id, item_id, quantity, price)
      VALUES (order_id, item_id, quantity, price)
 WHEN MATCHED
  AND a.* IS DISTINCT FROM b.*
 THEN UPDATE SET (quantity, price) = (b.quantity, b.price)
 WHEN MATCHED
  AND (b.quantity IS NULL AND b.price IS NULL)
 THEN DELETE;
COMMIT;

It seems related to the use of a.* and b.*

Sorry I can't be more specific. Error manifests when planning occurs and that is well outside of my code base knowledge.

Hope this helps.

Cheers,
-Joe
Hi,
It seems all the calls to fix_join_expr_mutator() are within setrefs.c

I haven't found where in nodeFuncs.c fix_join_expr_mutator is called.

I am on commit 75edb919613ee835e7680e40137e494c7856bcf9 . 

Re: MERGE bug report

От
Zhihong Yu
Дата:


On Tue, Apr 5, 2022 at 3:35 PM Zhihong Yu <zyu@yugabyte.com> wrote:


On Tue, Apr 5, 2022 at 3:18 PM Joe Wildish <joe@lateraljoin.com> wrote:
Hello Hackers,

Reporting a bug with the new MERGE statement. Tested against 75edb919613ee835e7680e40137e494c7856bcf9.

psql output as follows:

...
psql:merge.sql:33: ERROR:  variable not found in subplan target lists
ROLLBACK
[local] joe@joe=# \errverbose
ERROR:  XX000: variable not found in subplan target lists
LOCATION:  fix_join_expr_mutator, setrefs.c:2800

Stack trace:

fix_join_expr_mutator setrefs.c:2800
expression_tree_mutator nodeFuncs.c:3348
fix_join_expr_mutator setrefs.c:2853
expression_tree_mutator nodeFuncs.c:2992
fix_join_expr_mutator setrefs.c:2853
expression_tree_mutator nodeFuncs.c:3348
fix_join_expr_mutator setrefs.c:2853
fix_join_expr setrefs.c:2753
set_plan_refs setrefs.c:1085
set_plan_references setrefs.c:315
standard_planner planner.c:498
planner planner.c:277
pg_plan_query postgres.c:883
pg_plan_queries postgres.c:975
exec_simple_query postgres.c:1169
PostgresMain postgres.c:4520
BackendRun postmaster.c:4593
BackendStartup postmaster.c:4321
ServerLoop postmaster.c:1801
PostmasterMain postmaster.c:1473
main main.c:202
__libc_start_main 0x00007fc4ccc0b1e2
_start 0x000000000048804e

Reproducer script:

BEGIN;
DROP TABLE IF EXISTS item, incoming, source CASCADE;

CREATE TABLE item
  (order_id    INTEGER NOT NULL,
   item_id     INTEGER NOT NULL,
   quantity    INTEGER NOT NULL,
   price       NUMERIC NOT NULL,
   CONSTRAINT pk_item PRIMARY KEY (order_id, item_id));

INSERT INTO item VALUES (100, 1, 4, 100.00), (100, 2, 9, 199.00);

CREATE TABLE incoming (order_id, item_id, quantity, price)
  AS (VALUES (100, 1, 4, 100.00), (100, 3, 1, 200.00));

CREATE TABLE source (order_id, item_id, quantity, price) AS
  (SELECT order_id, item_id, incoming.quantity, incoming.price
     FROM item LEFT JOIN incoming USING (order_id, item_id));

MERGE INTO item a
USING source b
   ON (a.order_id, a.item_id) =
      (b.order_id, b.item_id)
 WHEN NOT MATCHED
 THEN INSERT (order_id, item_id, quantity, price)
      VALUES (order_id, item_id, quantity, price)
 WHEN MATCHED
  AND a.* IS DISTINCT FROM b.*
 THEN UPDATE SET (quantity, price) = (b.quantity, b.price)
 WHEN MATCHED
  AND (b.quantity IS NULL AND b.price IS NULL)
 THEN DELETE;
COMMIT;

It seems related to the use of a.* and b.*

Sorry I can't be more specific. Error manifests when planning occurs and that is well outside of my code base knowledge.

Hope this helps.

Cheers,
-Joe
Hi,
It seems all the calls to fix_join_expr_mutator() are within setrefs.c

I haven't found where in nodeFuncs.c fix_join_expr_mutator is called.

I am on commit 75edb919613ee835e7680e40137e494c7856bcf9 . 

Pardon - I typed too fast:

The call to fix_join_expr_mutator() is on this line (3348):

                    resultlist = lappend(resultlist,
                                         mutator((Node *) lfirst(temp),
                                                 context)); 

Re: MERGE bug report

От
Richard Guo
Дата:

On Wed, Apr 6, 2022 at 6:18 AM Joe Wildish <joe@lateraljoin.com> wrote:
Hello Hackers,

Reporting a bug with the new MERGE statement. Tested against 75edb919613ee835e7680e40137e494c7856bcf9.

psql output as follows:

...
psql:merge.sql:33: ERROR:  variable not found in subplan target lists
ROLLBACK
[local] joe@joe=# \errverbose
ERROR:  XX000: variable not found in subplan target lists
LOCATION:  fix_join_expr_mutator, setrefs.c:2800

Stack trace:

fix_join_expr_mutator setrefs.c:2800
expression_tree_mutator nodeFuncs.c:3348
fix_join_expr_mutator setrefs.c:2853
expression_tree_mutator nodeFuncs.c:2992
fix_join_expr_mutator setrefs.c:2853
expression_tree_mutator nodeFuncs.c:3348
fix_join_expr_mutator setrefs.c:2853
fix_join_expr setrefs.c:2753
set_plan_refs setrefs.c:1085
set_plan_references setrefs.c:315
standard_planner planner.c:498
planner planner.c:277
pg_plan_query postgres.c:883
pg_plan_queries postgres.c:975
exec_simple_query postgres.c:1169
PostgresMain postgres.c:4520
BackendRun postmaster.c:4593
BackendStartup postmaster.c:4321
ServerLoop postmaster.c:1801
PostmasterMain postmaster.c:1473
main main.c:202
__libc_start_main 0x00007fc4ccc0b1e2
_start 0x000000000048804e

Reproducer script:

BEGIN;
DROP TABLE IF EXISTS item, incoming, source CASCADE;

CREATE TABLE item
  (order_id    INTEGER NOT NULL,
   item_id     INTEGER NOT NULL,
   quantity    INTEGER NOT NULL,
   price       NUMERIC NOT NULL,
   CONSTRAINT pk_item PRIMARY KEY (order_id, item_id));

INSERT INTO item VALUES (100, 1, 4, 100.00), (100, 2, 9, 199.00);

CREATE TABLE incoming (order_id, item_id, quantity, price)
  AS (VALUES (100, 1, 4, 100.00), (100, 3, 1, 200.00));

CREATE TABLE source (order_id, item_id, quantity, price) AS
  (SELECT order_id, item_id, incoming.quantity, incoming.price
     FROM item LEFT JOIN incoming USING (order_id, item_id));

MERGE INTO item a
USING source b
   ON (a.order_id, a.item_id) =
      (b.order_id, b.item_id)
 WHEN NOT MATCHED
 THEN INSERT (order_id, item_id, quantity, price)
      VALUES (order_id, item_id, quantity, price)
 WHEN MATCHED
  AND a.* IS DISTINCT FROM b.*
 THEN UPDATE SET (quantity, price) = (b.quantity, b.price)
 WHEN MATCHED
  AND (b.quantity IS NULL AND b.price IS NULL)
 THEN DELETE;
COMMIT;

It seems related to the use of a.* and b.*

That's right. The varattno is set to zero for whole-row Var. And in this
case these whole-row Vars are not included in the targetlist.

Attached is an attempt for the fix.

Thanks
Richard 
Вложения

Re: MERGE bug report

От
Alvaro Herrera
Дата:
On 2022-Apr-06, Richard Guo wrote:

> That's right. The varattno is set to zero for whole-row Var. And in this
> case these whole-row Vars are not included in the targetlist.
> 
> Attached is an attempt for the fix.

Wow, this is very interesting.  I was surprised that this patch was
necessary at all -- I mean, if wholerow refs don't work, then why do
references to any other columns work?  The answer is that parse_merge.c
is already setting up the subplan's targetlist by expanding all vars of
the source relation.  I then remembered than in Simon's (or Pavan's)
original coding, parse_merge.c had a hack to include a var with the
source's wholerow in that targetlist, which I had later removed ...

I eventually realized that there's no need for parse_merge.c to expand
the source rel at all, and indeed it's wasteful: we can just let
preprocess_targetlist include the vars that are referenced by either
quals or each action's targetlist instead.  That led me to the attached
patch, which is not commit-quality yet but it should show what I have in
mind.

I added a test query to tickle this problematic case.

Another point, not completely connected to this bug but appearing in the
same function, is that we have some redundant code: we can just let the
stanza for UPDATE/DELETE do the identity columns dance.  This saves a
few lines in the MERGE-specific stanza there, which was doing exactly
the same thing.  (There's a difference in the "inh" test, but I think
that was just outdated.)

I also discovered that the comment for fix_join_expr needed an update,
since it doesn't mention MERGE, and it does mention all other situations
in which it is used.  Added that too.


This patch is a comment about "aggregates, window functions and
placeholder vars".  This was relevant and correct when only the qual of
each action was being handled (i.e., Richard's patch).  Now that we're
also handling the action's targetlist, I think I need to put the PVC
flags back.  But no tests broke, which probably means we also need some
additional tests cases.

-- 
Álvaro Herrera         PostgreSQL Developer  —  https://www.EnterpriseDB.com/

Вложения

Re: MERGE bug report

От
Richard Guo
Дата:

On Sat, Apr 9, 2022 at 5:26 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2022-Apr-06, Richard Guo wrote:

> That's right. The varattno is set to zero for whole-row Var. And in this
> case these whole-row Vars are not included in the targetlist.
>
> Attached is an attempt for the fix.

Wow, this is very interesting.  I was surprised that this patch was
necessary at all -- I mean, if wholerow refs don't work, then why do
references to any other columns work?  The answer is that parse_merge.c
is already setting up the subplan's targetlist by expanding all vars of
the source relation.  I then remembered than in Simon's (or Pavan's)
original coding, parse_merge.c had a hack to include a var with the
source's wholerow in that targetlist, which I had later removed ...

At first I was wondering whether we need to also include vars used in
each action's targetlist, just as what we did for each action's qual.
Then later I realized parse_merge.c already did that. But now it looks
much better to process them two in preprocess_targetlist.
 

I eventually realized that there's no need for parse_merge.c to expand
the source rel at all, and indeed it's wasteful: we can just let
preprocess_targetlist include the vars that are referenced by either
quals or each action's targetlist instead.  That led me to the attached
patch, which is not commit-quality yet but it should show what I have in
mind.

This patch looks in a good shape to me.

A minor comment is that we can use list_concat_copy(list1, list2)
instead of list_concat(list_copy(list1), list2) for better efficiency.

Thanks
Richard

Re: MERGE bug report

От
Alvaro Herrera
Дата:
On 2022-Apr-11, Richard Guo wrote:

> At first I was wondering whether we need to also include vars used in
> each action's targetlist, just as what we did for each action's qual.
> Then later I realized parse_merge.c already did that. But now it looks
> much better to process them two in preprocess_targetlist.

Yeah.  I pushed that.

However, now EXPLAIN VERBOSE doesn't show the columns from the source
relation in the Output line --- I think only those that are used as join
quals are shown, thanks to distribute_quals_to_rels.  I think it would
be better to fix this.  Maybe expanding the source target list earlier
is called for, after all.  I looked at transformUpdateStmt and siblings
for inspiration, but came out blank.

> A minor comment is that we can use list_concat_copy(list1, list2)
> instead of list_concat(list_copy(list1), list2) for better efficiency.

Thanks for that tip.

-- 
Álvaro Herrera               48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
"La vida es para el que se aventura"