Обсуждение: Bug in query rewriter - hasModifyingCTE not getting set

Поиск
Список
Период
Сортировка

Bug in query rewriter - hasModifyingCTE not getting set

От
Greg Nancarrow
Дата:
Hi Hackers,

I found a bug in the query rewriter. If a query that has a modifying
CTE is re-written, the hasModifyingCTE flag is not getting set in the
re-written query. This bug can result in the query being allowed to
execute in parallel-mode, which results in an error.

I originally found the problem using INSERT (which doesn't actually
affect the current Postgres code, as it doesn't support INSERT in
parallel mode) but a colleague of mine (Hou, Zhijie) managed to
reproduce it using SELECT as well (see example below), and helped to
minimize the patch size.

I've attached the patch with the suggested fix (reviewed by Amit Langote).


The following reproduces the issue (adapted from a test case in the
"with" regression tests):

drop table if exists test_data1;
create table test_data1(a int, b int) ;
insert into test_data1 select generate_series(1,1000), generate_series(1,1000);
set force_parallel_mode=on;
CREATE TEMP TABLE bug6051 AS
select i from generate_series(1,3) as i;
SELECT * FROM bug6051;
CREATE RULE bug6051_ins AS ON INSERT TO bug6051 DO INSTEAD select a as
i from test_data1;
WITH t1 AS ( DELETE FROM bug6051 RETURNING * ) INSERT INTO bug6051
SELECT * FROM t1;

produces the error:

    ERROR:  cannot assign XIDs during a parallel operation


Regards,
Greg Nancarrow
Fujitsu Australia

Вложения

Re: Bug in query rewriter - hasModifyingCTE not getting set

От
Tom Lane
Дата:
Greg Nancarrow <gregn4422@gmail.com> writes:
> I found a bug in the query rewriter. If a query that has a modifying
> CTE is re-written, the hasModifyingCTE flag is not getting set in the
> re-written query.

Ugh.

> I've attached the patch with the suggested fix (reviewed by Amit Langote).

I think either the bit about rule_action is unnecessary, or most of
the code immediately above this is wrong, because it's only updating
flags in sub_action.  Why do you think it's necessary to change
rule_action in addition to sub_action?

            regards, tom lane



Re: Bug in query rewriter - hasModifyingCTE not getting set

От
Tom Lane
Дата:
After poking around a bit more, I notice that the hasRecursive flag
really ought to get propagated as well, since that's also an attribute
of the CTE list.  That omission doesn't seem to have any ill effect
today, since nothing in planning or execution looks at that flag, but
someday it might.  So what I think we should do is as attached.
(I re-integrated your example into with.sql, too.)

Given the very limited time remaining before the release wrap, I'm
going to go ahead and push this.

            regards, tom lane

diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index 0672f497c6..ba6f8a0507 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -536,6 +536,9 @@ rewriteRuleAction(Query *parsetree,
          *
          * This could possibly be fixed by using some sort of internally
          * generated ID, instead of names, to link CTE RTEs to their CTEs.
+         * However, decompiling the results would be quite confusing; note the
+         * merge of hasRecursive flags below, which could change the apparent
+         * semantics of such redundantly-named CTEs.
          */
         foreach(lc, parsetree->cteList)
         {
@@ -557,6 +560,9 @@ rewriteRuleAction(Query *parsetree,
         /* OK, it's safe to combine the CTE lists */
         sub_action->cteList = list_concat(sub_action->cteList,
                                           copyObject(parsetree->cteList));
+        /* ... and don't forget about the associated flags */
+        sub_action->hasRecursive |= parsetree->hasRecursive;
+        sub_action->hasModifyingCTE |= parsetree->hasModifyingCTE;
     }

     /*
diff --git a/src/test/regress/expected/with.out b/src/test/regress/expected/with.out
index c519a61c4f..5e4ea29243 100644
--- a/src/test/regress/expected/with.out
+++ b/src/test/regress/expected/with.out
@@ -2277,6 +2277,33 @@ SELECT * FROM bug6051_2;
  3
 (3 rows)

+-- silly example to verify that hasModifyingCTE flag is propagated
+CREATE TEMP TABLE bug6051_3 AS
+  select a from generate_series(11,13) as a;
+CREATE RULE bug6051_3_ins AS ON INSERT TO bug6051_3 DO INSTEAD
+  SELECT i FROM bug6051_2;
+BEGIN; SET LOCAL force_parallel_mode = on;
+WITH t1 AS ( DELETE FROM bug6051_3 RETURNING * )
+  INSERT INTO bug6051_3 SELECT * FROM t1;
+ i
+---
+ 1
+ 2
+ 3
+ 1
+ 2
+ 3
+ 1
+ 2
+ 3
+(9 rows)
+
+COMMIT;
+SELECT * FROM bug6051_3;
+ a
+---
+(0 rows)
+
 -- a truly recursive CTE in the same list
 WITH RECURSIVE t(a) AS (
     SELECT 0
diff --git a/src/test/regress/sql/with.sql b/src/test/regress/sql/with.sql
index f4ba0d8e39..0ffa44afa7 100644
--- a/src/test/regress/sql/with.sql
+++ b/src/test/regress/sql/with.sql
@@ -1070,6 +1070,22 @@ INSERT INTO bug6051 SELECT * FROM t1;
 SELECT * FROM bug6051;
 SELECT * FROM bug6051_2;

+-- silly example to verify that hasModifyingCTE flag is propagated
+CREATE TEMP TABLE bug6051_3 AS
+  select a from generate_series(11,13) as a;
+
+CREATE RULE bug6051_3_ins AS ON INSERT TO bug6051_3 DO INSTEAD
+  SELECT i FROM bug6051_2;
+
+BEGIN; SET LOCAL force_parallel_mode = on;
+
+WITH t1 AS ( DELETE FROM bug6051_3 RETURNING * )
+  INSERT INTO bug6051_3 SELECT * FROM t1;
+
+COMMIT;
+
+SELECT * FROM bug6051_3;
+
 -- a truly recursive CTE in the same list
 WITH RECURSIVE t(a) AS (
     SELECT 0

Re: Bug in query rewriter - hasModifyingCTE not getting set

От
Greg Nancarrow
Дата:
On Sun, Feb 7, 2021 at 10:03 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Greg Nancarrow <gregn4422@gmail.com> writes:
> > I found a bug in the query rewriter. If a query that has a modifying
> > CTE is re-written, the hasModifyingCTE flag is not getting set in the
> > re-written query.
>
> Ugh.
>
> > I've attached the patch with the suggested fix (reviewed by Amit Langote).
>
> I think either the bit about rule_action is unnecessary, or most of
> the code immediately above this is wrong, because it's only updating
> flags in sub_action.  Why do you think it's necessary to change
> rule_action in addition to sub_action?
>

I believe that the bit about rule_action IS necessary, as it's needed
for the case of INSERT...SELECT, so that hasModifyingCTE is set on the
rewritten INSERT (see comment above the call to
getInsertSelectQuery(), and the "KLUDGE ALERT" comment within that
function).

In the current Postgres code, it doesn't let INSERT run in
parallel-mode (only SELECT), but in the debugger you can clearly see
that for an INSERT with a subquery that uses a modifying CTE, the
hasModifyingCTE flag is not getting set on the rewritten INSERT query
by the query rewriter. As I've been working on parallel INSERT, I
found the issue first for INSERT (one test failure in the "with" tests
when force_parallel_mode=regress).

Here's some silly SQL (very similar to existing test case in the
"with" tests) to reproduce the issue for INSERT (as I said, it won't
give an error like the SELECT case, as currently INSERT is not allowed
in parallel-mode anyway, but the issue can be seen in the debugger):

set force_parallel_mode=on;
CREATE TABLE bug6051 AS
  select i from generate_series(1,3) as i;
SELECT * FROM bug6051;
CREATE TABLE bug6051_2 (i int);
CREATE RULE bug6051_ins AS ON INSERT TO bug6051 DO INSTEAD
 INSERT INTO bug6051_2
 SELECT NEW.i;
WITH t1 AS ( DELETE FROM bug6051 RETURNING * )
INSERT INTO bug6051 SELECT * FROM t1;


Regards,
Greg Nancarrow
Fujitsu Australia



Re: Bug in query rewriter - hasModifyingCTE not getting set

От
Tom Lane
Дата:
Greg Nancarrow <gregn4422@gmail.com> writes:
> On Sun, Feb 7, 2021 at 10:03 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> I think either the bit about rule_action is unnecessary, or most of
>> the code immediately above this is wrong, because it's only updating
>> flags in sub_action.  Why do you think it's necessary to change
>> rule_action in addition to sub_action?

> I believe that the bit about rule_action IS necessary, as it's needed
> for the case of INSERT...SELECT, so that hasModifyingCTE is set on the
> rewritten INSERT (see comment above the call to
> getInsertSelectQuery(), and the "KLUDGE ALERT" comment within that
> function).

Hm.  So after looking at this more, the problem is that the rewrite
is producing something equivalent to

INSERT INTO bug6051_2
(WITH t1 AS (DELETE FROM bug6051 RETURNING *) SELECT * FROM t1);

If you try to do that directly, the parser will give you the raspberry:

ERROR:  WITH clause containing a data-modifying statement must be at the top level
LINE 2: (WITH t1 AS (DELETE FROM bug6051 RETURNING *) SELECT * FROM ...
              ^

The code throwing that error, in analyzeCTE(), explains

    /*
     * We disallow data-modifying WITH except at the top level of a query,
     * because it's not clear when such a modification should be executed.
     */

That semantic issue doesn't get any less pressing just because the query
was generated by rewrite.  So I now think that what we have to do is
throw an error if we have a modifying CTE and sub_action is different
from rule_action.  Not quite sure how to phrase the error though.

In view of this, maybe the right thing is to disallow modifying CTEs
in rule actions in the first place.  I see we already do that for
views (i.e. ON SELECT rules), but they're not really any safer in
other types of rules.  Given that non-SELECT rules are an undertested
legacy thing, I'm not that excited about moving mountains to make
this case possible.

Anyway, I think I'm going to go revert the patch I crammed in last night.
There's more here than meets the eye, and right before a release is no
time to be fooling with an issue that's been there for years.

            regards, tom lane



Re: Bug in query rewriter - hasModifyingCTE not getting set

От
Tom Lane
Дата:
I wrote:
> That semantic issue doesn't get any less pressing just because the query
> was generated by rewrite.  So I now think that what we have to do is
> throw an error if we have a modifying CTE and sub_action is different
> from rule_action.  Not quite sure how to phrase the error though.

Another idea that'd avoid disallowing functionality is to try to attach
the CTEs to the rule_action not the sub_action.  This'd require adjusting
ctelevelsup in appropriate parts of the parsetree when those are
different, so it seems like it'd be a pain.  I remain unconvinced that
it's worth it.

            regards, tom lane



RE: Bug in query rewriter - hasModifyingCTE not getting set

От
"tsunakawa.takay@fujitsu.com"
Дата:
From: Tom Lane <tgl@sss.pgh.pa.us>
> In view of this, maybe the right thing is to disallow modifying CTEs
> in rule actions in the first place.  I see we already do that for
> views (i.e. ON SELECT rules), but they're not really any safer in
> other types of rules.

You meant by views something like the following, didn't you?

postgres=# create view myview as with t as (delete from b) select * from a;
ERROR:  views must not contain data-modifying statements in WITH

OTOH, the examples Greg-san showed do not contain CTE in the rule action, but in the query that the rule is applied to.
So, I think the solution would be something different. 


>  Given that non-SELECT rules are an undertested
> legacy thing, I'm not that excited about moving mountains to make
> this case possible.

> That semantic issue doesn't get any less pressing just because the query
> was generated by rewrite.  So I now think that what we have to do is
> throw an error if we have a modifying CTE and sub_action is different
> from rule_action.  Not quite sure how to phrase the error though.

So, how about just throwing an error when the original query (not the rule action) has a data-modifying CTE?  The error
messagewould be something like "a query containing a data-modifying CTE cannot be executed because there is some rule
applicableto the relation".  This may be overkill and too many regression tests might fail, so we may have to add some
conditionto determine if we error out. 

Or, I thought Greg-san's patch would suffice.  What problem do you see in it?

I couldn't imagine what "mountains" are.  Could you tell me what's that?


Regards
Takayuki Tsunakawa




RE: Bug in query rewriter - hasModifyingCTE not getting set

От
"houzj.fnst@fujitsu.com"
Дата:
From: Tom Lane <tgl@sss.pgh.pa.us>
> Greg Nancarrow <gregn4422@gmail.com> writes:
> > On Sun, Feb 7, 2021 at 10:03 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >> I think either the bit about rule_action is unnecessary, or most of
> >> the code immediately above this is wrong, because it's only updating
> >> flags in sub_action.  Why do you think it's necessary to change
> >> rule_action in addition to sub_action?
>
> > I believe that the bit about rule_action IS necessary, as it's needed
> > for the case of INSERT...SELECT, so that hasModifyingCTE is set on the
> > rewritten INSERT (see comment above the call to
> > getInsertSelectQuery(), and the "KLUDGE ALERT" comment within that
> > function).
>
> Hm.  So after looking at this more, the problem is that the rewrite is producing
> something equivalent to
>
> INSERT INTO bug6051_2
> (WITH t1 AS (DELETE FROM bug6051 RETURNING *) SELECT * FROM t1);
>
> If you try to do that directly, the parser will give you the raspberry:
>
> ERROR:  WITH clause containing a data-modifying statement must be at the
> top level LINE 2: (WITH t1 AS (DELETE FROM bug6051 RETURNING *) SELECT *
> FROM ...
>               ^
>
> The code throwing that error, in analyzeCTE(), explains
>
>     /*
>      * We disallow data-modifying WITH except at the top level of a query,
>      * because it's not clear when such a modification should be executed.
>      */
>
> That semantic issue doesn't get any less pressing just because the query was
> generated by rewrite.  So I now think that what we have to do is throw an error
> if we have a modifying CTE and sub_action is different from rule_action.  Not
> quite sure how to phrase the error though.

I am +1 for throwing an error if we have a modifying CTE and sub_action is different
from rule_action. As we disallowed data-modifying CTEs which is not at the top level
of a query, it will be safe and consistent to disallow the same case here.

Maybe we can output the message like the following ?
"DO INSTEAD INSERT ... SELECT rules are not supported for INSERT contains data-modifying statements in WITH."

Best regards,
houzj




RE: Bug in query rewriter - hasModifyingCTE not getting set

От
"tsunakawa.takay@fujitsu.com"
Дата:
From: Tom Lane <tgl@sss.pgh.pa.us>
> I think either the bit about rule_action is unnecessary, or most of
> the code immediately above this is wrong, because it's only updating
> flags in sub_action.  Why do you think it's necessary to change
> rule_action in addition to sub_action?

Finally, I think I've understood what you meant.  Yes, the current code seems to be wrong.  rule_action is different
fromsub_action only when the rule action (the query specified in CREATE RULE) is INSERT SELECT.  In that case,
rule_actionpoints to the entire INSERT SELECT, while sub_action points to the SELECT part.  So, we should add the CTE
listand set hasModifyingCTE/hasRecursive flags in rule_action. 


> Hm.  So after looking at this more, the problem is that the rewrite
> is producing something equivalent to
>
> INSERT INTO bug6051_2
> (WITH t1 AS (DELETE FROM bug6051 RETURNING *) SELECT * FROM t1);

Yes.  In this case, the WITH clause must be put before INSERT.

The attached patch is based on your version.  It includes cosmetic changes to use = instead of |= for boolean variable
assignments. make check passed.  Also, Greg-san's original failed test case succeeded.  I confirmed that the
hasModifyingCTEof the top-level rewritten query is set to true now by looking at the output of debug_print_rewritten
anddebug_print_plan. 


Regards
Takayuki Tsunakawa


Вложения

Re: Bug in query rewriter - hasModifyingCTE not getting set

От
Tom Lane
Дата:
"tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com> writes:
> From: Tom Lane <tgl@sss.pgh.pa.us>
>> I think either the bit about rule_action is unnecessary, or most of
>> the code immediately above this is wrong, because it's only updating
>> flags in sub_action.  Why do you think it's necessary to change
>> rule_action in addition to sub_action?

> Finally, I think I've understood what you meant.  Yes, the current code seems to be wrong.

I'm fairly skeptical of this claim, because that code has stood for a
long time.  Can you provide an example (not involving hasModifyingCTE)
in which it's wrong?

            regards, tom lane



RE: Bug in query rewriter - hasModifyingCTE not getting set

От
"tsunakawa.takay@fujitsu.com"
Дата:
From: Tom Lane <tgl@sss.pgh.pa.us>
> "tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com> writes:
> > Finally, I think I've understood what you meant.  Yes, the current code seems
> to be wrong.
>
> I'm fairly skeptical of this claim, because that code has stood for a
> long time.  Can you provide an example (not involving hasModifyingCTE)
> in which it's wrong?

Hmm, I don't think of an example.  I wonder if attaching WITH before INSERT SELECT and putting WITH between INSERT and
SELECTproduce the same results.  Maybe that's why the regression test succeeds with the patch. 

To confirm, the question is that when we have the following rule in place and the client issues the query:

[rule]
CREATE RULE myrule AS
    ON {INSERT | UPDATE | DELETE} TO orig_table
    DO INSTEAD
        INSERT INTO some_table SELECT ...;

[original query]
WITH t AS (
    SELECT and/or NOTIFY
)
{INSERT INTO | UPDATE | DELETE FROM} orig_table ...;

which of the following two queries do we expect?

[generated query 1]
WITH t AS (
    SELECT and/or NOTIFY
)
        INSERT INTO some_table SELECT ...;

[generated query 2]
        INSERT INTO some_table
WITH t AS (
    SELECT and/or NOTIFY
)
SELECT ...;

Although both may produce the same results, I naturally expected query 1, because WITH was originally attached before
thetop-level query, and (2) the top-level query has been replaced with a rule action, so it's natural that the WITH is
attachedbefore the rule action.  Super-abbreviated description is: 

    x -> y  (rule)
    WITH t x  (original query)
    WITH t y  (generated query 1)
    one-part-of-y WITH t another-part-of-y  (generated query 2)

As we said, we agree to fail the query if it's the above generated query 2 and WITH contains a data-modyfing CTE, if we
cannotbe confident to accept the change to the WITH position.  Which do you think we want to choose? 


Regards
Takayuki Tsunakawa




Re: Bug in query rewriter - hasModifyingCTE not getting set

От
Tom Lane
Дата:
"tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com> writes:
> The attached patch is based on your version.  It includes cosmetic
> changes to use = instead of |= for boolean variable assignments.

I think that's less "cosmetic" than "gratuitous breakage".  The point
here is that we are combining two rtables, so the query had better
end up with flags that describe the union of the rtables' properties.
Our regression tests are unfortunately not very thorough in this area,
so it doesn't surprise me that they fail to fall over.

After thinking about it for awhile, I'm okay with the concept of
attaching the source query's CTEs to the parent rule_action so far
as the semantics are concerned.  But this patch fails to implement
that correctly.  If we're going to do it like that, then the
ctelevelsup fields of any CTE RTEs that refer to those CTEs have
to be incremented when rule_action is different from sub_action,
because the CTEs are getting attached one level higher in the
query nest than the referencing RTEs are.  The proposed test case
fails to expose this, because the rule action isn't INSERT/SELECT,
so the case of interest isn't being exercised at all.  However,
it's harder than you might think to demonstrate a problem ---
I first tried

CREATE RULE bug6051_3_ins AS ON INSERT TO bug6051_3 DO INSTEAD
  INSERT INTO bug6051_2 SELECT a FROM bug6051_3;

and that failed to fall over with the patch.  Turns out that's
because the SELECT part is simple enough to be pulled up, and
the pull-up moves the CTE that's been put into it one level
higher, causing it to accidentally have the correct ctelevelsup
anyway.  If you use an INSERT with a non-pull-up-able SELECT
then you can see the problem: this script

CREATE TEMP TABLE bug6051_2 (i int);

CREATE TEMP TABLE bug6051_3 AS
  select a from generate_series(11,13) as a;

CREATE RULE bug6051_3_ins AS ON INSERT TO bug6051_3 DO INSTEAD
  INSERT INTO bug6051_2 SELECT sum(a) FROM bug6051_3;

explain verbose
WITH t1 AS ( DELETE FROM bug6051_3 RETURNING * )
  INSERT INTO bug6051_3 SELECT * FROM t1;

causes the patch to fail with

ERROR:  could not find CTE "t1"

Now, we could potentially make this work if we wrote code to run
through the copied rtable entries (recursively) and increment the
appropriate ctelevelsup fields by one.  That would essentially
have to be a variant of IncrementVarSublevelsUp that *only* acts
on ctelevelsup and not other level-dependent fields.  That's
what I meant when I spoke of moving mountains: the amount of code
that would need to go into this seems out of all proportion to
the value.  I think we should just throw an error, instead.
At least till such time as we see actual field complaints.

            regards, tom lane



Re: Bug in query rewriter - hasModifyingCTE not getting set

От
Greg Nancarrow
Дата:

On Wed, Sep 8, 2021 at 8:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> "tsunakawa.takay@fujitsu.com" <tsunakawa.takay@fujitsu.com> writes:
> > The attached patch is based on your version.  It includes cosmetic
> > changes to use = instead of |= for boolean variable assignments.
>
> Now, we could potentially make this work if we wrote code to run
> through the copied rtable entries (recursively) and increment the
> appropriate ctelevelsup fields by one.  That would essentially
> have to be a variant of IncrementVarSublevelsUp that *only* acts
> on ctelevelsup and not other level-dependent fields.  That's
> what I meant when I spoke of moving mountains: the amount of code
> that would need to go into this seems out of all proportion to
> the value.  I think we should just throw an error, instead.
> At least till such time as we see actual field complaints.
>

[I don't think Tsunakawa-san will be responding to this any time soon]

I proposed a patch for this issue in a separate thread:
https://www.postgresql.org/message-id/CAJcOf-f68DT=26YAMz_i0+Au3TcLO5oiHY5=fL6Sfuits6r+_w@mail.gmail.com

The patch takes your previously-reverted patch for this issue and adds an error condition, so it does throw an error for that test case in your previous post.
It also affects one existing regression test, since that uses an INSERT...SELECT rule action applied to a command with a data-modifying CTE (and we shouldn't really be allowing that anyway).


Regards,
Greg Nancarrow
Fujitsu Australia

Re: Bug in query rewriter - hasModifyingCTE not getting set

От
Tom Lane
Дата:
Greg Nancarrow <gregn4422@gmail.com> writes:
> On Wed, Sep 8, 2021 at 8:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Now, we could potentially make this work if we wrote code to run
>> through the copied rtable entries (recursively) and increment the
>> appropriate ctelevelsup fields by one.  That would essentially
>> have to be a variant of IncrementVarSublevelsUp that *only* acts
>> on ctelevelsup and not other level-dependent fields.  That's
>> what I meant when I spoke of moving mountains: the amount of code
>> that would need to go into this seems out of all proportion to
>> the value.  I think we should just throw an error, instead.
>> At least till such time as we see actual field complaints.

> [I don't think Tsunakawa-san will be responding to this any time soon]

Oh!  I'd not realized that he'd dropped out of the community, but
checking my mail folder, I don't see any messages from him in months
... and his email address is bouncing, too.  Too bad.

> I proposed a patch for this issue in a separate thread:
> https://www.postgresql.org/message-id/CAJcOf-f68DT=26YAMz_i0+Au3TcLO5oiHY5=fL6Sfuits6r+_w@mail.gmail.com

Right, that one looks like an appropriate amount of effort
(at least till someone gets way more excited about the case
than I am).  I will mark this CF item Returned With Feedback
and go see about that one.

            regards, tom lane