Обсуждение: Assert !bms_overlap(joinrel->relids, required_outer)

Поиск
Список
Период
Сортировка

Assert !bms_overlap(joinrel->relids, required_outer)

От
Jaime Casanova
Дата:
Hi,

The attached query makes beta2 crash with attached backtrace.
Interestingly the index on ref_6 is needed to make it crash, without
it the query works fine.

-- 
Jaime Casanova
Director de Servicios Profesionales
SYSTEMGUARDS - Consultores de PostgreSQL

Вложения

Re: Assert !bms_overlap(joinrel->relids, required_outer)

От
Michael Paquier
Дата:
On Mon, Jun 26, 2023 at 11:05:43PM -0500, Jaime Casanova wrote:
> The attached query makes beta2 crash with attached backtrace.
> Interestingly the index on ref_6 is needed to make it crash, without
> it the query works fine.

Issue reproduced here.  I am adding an open item, whose owner should
be Tom?
--
Michael

Вложения

Re: Assert !bms_overlap(joinrel->relids, required_outer)

От
Richard Guo
Дата:

On Tue, Jun 27, 2023 at 1:35 PM Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Jun 26, 2023 at 11:05:43PM -0500, Jaime Casanova wrote:
> The attached query makes beta2 crash with attached backtrace.
> Interestingly the index on ref_6 is needed to make it crash, without
> it the query works fine.

Issue reproduced here.  I am adding an open item, whose owner should
be Tom?

That's right.  This issue has something to do with the
outer-join-aware-Var changes.  I reduced the repro to the query below.

create table t (a int);
create index on t(a);

explain (costs off)
select 1 from t t1
         join lateral
           (select t1.a from (select 1) foo offset 0) s1 on true
         join
            (select 1 from t t2
                inner join t t3
                 left join t t4 left join t t5 on t4.a = 1
                on t4.a = 1 on false
             where t3.a = coalesce(t5.a,1)) as s2
          on true;

When joining s1/t3 to t4, the relid of outer join t3/t4 appears both in
the joinrel's relids and in the joinrel's required outer rels, which
causes the Assert failure.  I think it's reasonable for it to appear in
the joinrel's relids, because we're forming this outer join.  I doubt
that it should appear in the joinrel's required outer rels.  So I'm
wondering if we can fix this issue by manually removing the outer join's
relid from the joinrel's required_outer, something like:

 if (bms_is_member(extra->sjinfo->ojrelid, joinrel->relids))
     required_outer = bms_del_member(required_outer, extra->sjinfo->ojrelid);

This would be needed in try_nestloop_path, try_mergejoin_path and
try_hashjoin_path after the required_outer set is computed for the join
path.  It seems quite hacky though, not sure if this is the right thing
to do.

Thanks
Richard

Re: Assert !bms_overlap(joinrel->relids, required_outer)

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> That's right.  This issue has something to do with the
> outer-join-aware-Var changes.  I reduced the repro to the query below.

Thanks for the simplified test case.

> When joining s1/t3 to t4, the relid of outer join t3/t4 appears both in
> the joinrel's relids and in the joinrel's required outer rels, which
> causes the Assert failure.  I think it's reasonable for it to appear in
> the joinrel's relids, because we're forming this outer join.  I doubt
> that it should appear in the joinrel's required outer rels.

It looks to me like we are trying to join (2 7), that is s1 and t3,
to 8 (t4), which would necessitate forming the outer join with relid 11.
That's fine as far as it goes, but the path we're trying to use for
(2 7) is

   {NESTPATH 
   :jpath.path.pathtype 335 
   :parent_relids (b 2 7)
   :required_outer (b 1 9 10 11)
   :jpath.outerjoinpath 
      {SUBQUERYSCANPATH 
      :path.pathtype 326 
      :parent_relids (b 2)
      :required_outer (b 1)
   :jpath.innerjoinpath 
      {INDEXPATH 
      :path.pathtype 321 
      :parent_relids (b 7)  t3
      :required_outer (b 9 10 11) t5 and both outer joins

That is, the path involves an indexscan on t3 that evidently is using
the "t3.a = coalesce(t5.a,1)" condition, so it needs a post-join value
of t5.a.  So it's completely not legit to use this path as an input
for this join.  (You could quibble about whether the path could be
marked as needing only one of the two outer joins, but that doesn't
really matter here.  It certainly shouldn't be used when we've not
yet formed either OJ.)

So it looks to me like something further up should have rejected this
path as not being usable here.  Not sure what's dropping the ball.

Another way to look at it is we should never have formed this index
path at all, because it's not clear to me that it can have any valid
use.  We clearly cannot form OJ 11 (t3/t4) without having already
scanned t3, so a path for t3 that requires 11 as an input is silly on
its face.  Even if you argue that the required_outer marking for the
path could be reduced to (9 10) on the grounds of identity 3, I still
don't see a valid join order that can use this path.  So ideally the
path wouldn't have been made in the first place, it's just a waste
of planner cycles.  That's a separate issue though.

            regards, tom lane



Re: Assert !bms_overlap(joinrel->relids, required_outer)

От
Tom Lane
Дата:
I wrote:
> So it looks to me like something further up should have rejected this
> path as not being usable here.  Not sure what's dropping the ball.

After further digging, I've concluded that what usually stops us
from generating this bogus path for the t3/t4 join is the
"param_source_rels" heuristic in joinpath.c.  It does stop it in
the simplified query

select 1 from t t2
                inner join (t t3
                 left join (t t4 left join t t5 on t4.a = 1)
                on t4.a = 1) on false
             where t3.a = coalesce(t5.a,1);

However, once we add the lateral reference in s1, that heuristic
uncritically lets the path go through, and then later we have trouble.
Of course, that heuristic is only supposed to be a heuristic that
helps winnow valid paths, not a defense against invalid paths,
so it's not its fault that this goes wrong.  (I think that the old
delay_upper_joins mechanism is what prevented this error before v16.)

For a real fix, I'm inclined to extend the loop that calculates
param_source_rels (in add_paths_to_joinrel) so that it also tracks
a set of incompatible relids that *must not* be present in the
parameterization of a proposed path.  This would basically include
OJ relids of OJs that partially overlap the target joinrel; maybe
we should also include the min RHS of such OJs.  Then we could
check that in try_nestloop_path.  I've not tried to code this yet.

There's also the question of why we generated the bogus indexscan
in the first place; but it seems advisable to fix the join-level
issue before touching that, else we'll have nothing to test with.

            regards, tom lane



Re: Assert !bms_overlap(joinrel->relids, required_outer)

От
Richard Guo
Дата:

On Tue, Jun 27, 2023 at 10:12 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> That's right.  This issue has something to do with the
> outer-join-aware-Var changes.  I reduced the repro to the query below.

Thanks for the simplified test case.

> When joining s1/t3 to t4, the relid of outer join t3/t4 appears both in
> the joinrel's relids and in the joinrel's required outer rels, which
> causes the Assert failure.  I think it's reasonable for it to appear in
> the joinrel's relids, because we're forming this outer join.  I doubt
> that it should appear in the joinrel's required outer rels.

It looks to me like we are trying to join (2 7), that is s1 and t3,
to 8 (t4), which would necessitate forming the outer join with relid 11.
That's fine as far as it goes, but the path we're trying to use for
(2 7) is

   {NESTPATH
   :jpath.path.pathtype 335
   :parent_relids (b 2 7)
   :required_outer (b 1 9 10 11)
   :jpath.outerjoinpath
      {SUBQUERYSCANPATH
      :path.pathtype 326
      :parent_relids (b 2)
      :required_outer (b 1)
   :jpath.innerjoinpath
      {INDEXPATH
      :path.pathtype 321
      :parent_relids (b 7)  t3
      :required_outer (b 9 10 11) t5 and both outer joins

That is, the path involves an indexscan on t3 that evidently is using
the "t3.a = coalesce(t5.a,1)" condition, so it needs a post-join value
of t5.a.  So it's completely not legit to use this path as an input
for this join.  (You could quibble about whether the path could be
marked as needing only one of the two outer joins, but that doesn't
really matter here.  It certainly shouldn't be used when we've not
yet formed either OJ.)

I tried this query on v15 and found that we'd also generate this bogus
path for the t3/t4 join.

   {NESTPATH
   :pathtype 38
   :parent_relids (b 2 7)
   :required_outer (b 1 9)
   :outerjoinpath
      {SUBQUERYSCANPATH
      :pathtype 28
      :parent_relids (b 2)
      :required_outer (b 1)
   :innerjoinpath
      {INDEXPATH
      :pathtype 23
      :parent_relids (b 7)  t3
      :required_outer (b 9)  t5

The Assert failure is not seen on v15 because outer join relids are not
included in joinrel's relids and required_outer sets.

Thanks
Richard

Re: Assert !bms_overlap(joinrel->relids, required_outer)

От
Richard Guo
Дата:

On Wed, Jun 28, 2023 at 6:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
For a real fix, I'm inclined to extend the loop that calculates
param_source_rels (in add_paths_to_joinrel) so that it also tracks
a set of incompatible relids that *must not* be present in the
parameterization of a proposed path.  This would basically include
OJ relids of OJs that partially overlap the target joinrel; maybe
we should also include the min RHS of such OJs.  Then we could
check that in try_nestloop_path.  I've not tried to code this yet.

I went ahead and drafted a patch based on this idea.  A little
differences include

* You mentioned that the incompatible relids might need to also include
the min_righthand of the OJs that are part of the target joinrel.  It
seems to me that we may need to also include the min_lefthand of such
OJs, because the parameterization of any proposed join path for the
target joinrel should not overlap anything in an OJ if the OJ is part of
this joinrel.

* I think we need to check the incompatible relids also in
try_hashjoin_path and try_mergejoin_path besides try_nestloop_path.

Thanks
Richard
Вложения

Re: Assert !bms_overlap(joinrel->relids, required_outer)

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> On Wed, Jun 28, 2023 at 6:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> For a real fix, I'm inclined to extend the loop that calculates
>> param_source_rels (in add_paths_to_joinrel) so that it also tracks
>> a set of incompatible relids that *must not* be present in the
>> parameterization of a proposed path.  This would basically include
>> OJ relids of OJs that partially overlap the target joinrel; maybe
>> we should also include the min RHS of such OJs.  Then we could
>> check that in try_nestloop_path.  I've not tried to code this yet.

> I went ahead and drafted a patch based on this idea.

Hmm.  This patch is the opposite of what I'd been imagining, because
I was thinking we needed to add OJs to param_incompatible_relids if
they were *not* already in the join, rather than if they were.
However, I tried it like that and while it did stop the assertion
failure, it also broke a bunch of other test cases that no longer
found the parameterized-nestloop plans they were supposed to find.
So clearly I just didn't have my head screwed on in the correct
direction yesterday.

However, given that what we need is to exclude parameterization
that depends on the currently-formed OJ, it seems to me we can do
it more simply and without any new JoinPathExtraData field,
as attached.  What do you think?

> * I think we need to check the incompatible relids also in
> try_hashjoin_path and try_mergejoin_path besides try_nestloop_path.

I think this isn't necessary, at least in my formulation.
Those cases will go through calc_non_nestloop_required_outer
which has

    /* neither path can require rels from the other */
    Assert(!bms_overlap(outer_paramrels, inner_path->parent->relids));
    Assert(!bms_overlap(inner_paramrels, outer_path->parent->relids));

In order to have a dependency on an OJ, a path would have to have
a dependency on at least one of the OJ's base relations too, so
I think these assertions show that the case won't arise.  (Of
course, if someone can trip one of these assertions, I'm wrong.)

            regards, tom lane

diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index c2f211a60d..4b6ed6e312 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -698,6 +698,17 @@ try_nestloop_path(PlannerInfo *root,
     Relids        inner_paramrels = PATH_REQ_OUTER(inner_path);
     Relids        outer_paramrels = PATH_REQ_OUTER(outer_path);

+    /*
+     * If we are forming an outer join at this join, it's nonsensical to use
+     * an input path that uses the outer join as part of its parameterization.
+     * (This can happen despite our join order restrictions, since those apply
+     * to what is in an input relation not what its parameters are.)
+     */
+    if (extra->sjinfo && extra->sjinfo->ojrelid != 0 &&
+        (bms_is_member(extra->sjinfo->ojrelid, outer_paramrels) ||
+         bms_is_member(extra->sjinfo->ojrelid, inner_paramrels)))
+        return;
+
     /*
      * Paths are parameterized by top-level parents, so run parameterization
      * tests on the parent relids.
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 6917faec14..12b828fae3 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -5063,6 +5063,37 @@ select 1 from
 ----------
 (0 rows)

+explain (costs off)
+select 1 from tenk1 t1
+         join lateral
+           (select t1.unique1 from (select 1) foo offset 0) s1 on true
+         join
+            (select 1 from tenk1 t2
+                inner join tenk1 t3
+                 left join tenk1 t4 left join tenk1 t5 on t4.unique1 = 1
+                on t4.unique1 = 1 on false
+             where t3.unique1 = coalesce(t5.unique1,1)) as s2
+          on true;
+        QUERY PLAN
+--------------------------
+ Result
+   One-Time Filter: false
+(2 rows)
+
+select 1 from tenk1 t1
+         join lateral
+           (select t1.unique1 from (select 1) foo offset 0) s1 on true
+         join
+            (select 1 from tenk1 t2
+                inner join tenk1 t3
+                 left join tenk1 t4 left join tenk1 t5 on t4.unique1 = 1
+                on t4.unique1 = 1 on false
+             where t3.unique1 = coalesce(t5.unique1,1)) as s2
+          on true;
+ ?column?
+----------
+(0 rows)
+
 --
 -- check a case in which a PlaceHolderVar forces join order
 --
diff --git a/src/test/regress/sql/join.sql b/src/test/regress/sql/join.sql
index 55080bec9a..38899ed3b9 100644
--- a/src/test/regress/sql/join.sql
+++ b/src/test/regress/sql/join.sql
@@ -1751,6 +1751,29 @@ select 1 from
   on false,
   lateral (select i4.f1, ss1.n from int8_tbl as i8 limit 1) as ss3;

+explain (costs off)
+select 1 from tenk1 t1
+         join lateral
+           (select t1.unique1 from (select 1) foo offset 0) s1 on true
+         join
+            (select 1 from tenk1 t2
+                inner join tenk1 t3
+                 left join tenk1 t4 left join tenk1 t5 on t4.unique1 = 1
+                on t4.unique1 = 1 on false
+             where t3.unique1 = coalesce(t5.unique1,1)) as s2
+          on true;
+
+select 1 from tenk1 t1
+         join lateral
+           (select t1.unique1 from (select 1) foo offset 0) s1 on true
+         join
+            (select 1 from tenk1 t2
+                inner join tenk1 t3
+                 left join tenk1 t4 left join tenk1 t5 on t4.unique1 = 1
+                on t4.unique1 = 1 on false
+             where t3.unique1 = coalesce(t5.unique1,1)) as s2
+          on true;
+
 --
 -- check a case in which a PlaceHolderVar forces join order
 --

Re: Assert !bms_overlap(joinrel->relids, required_outer)

От
Richard Guo
Дата:

On Wed, Jun 28, 2023 at 10:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
However, given that what we need is to exclude parameterization
that depends on the currently-formed OJ, it seems to me we can do
it more simply and without any new JoinPathExtraData field,
as attached.  What do you think?

I think it makes sense.  At first I wondered if we should also exclude
parameterization that depends on OJs that have already been formed as
part of this joinrel.  But it seems not possible that the input paths
have parameterization dependency on these OJs.  So it should be
sufficient to only consider the currently-formed OJ.
 
> * I think we need to check the incompatible relids also in
> try_hashjoin_path and try_mergejoin_path besides try_nestloop_path.

I think this isn't necessary, at least in my formulation.
Those cases will go through calc_non_nestloop_required_outer
which has

        /* neither path can require rels from the other */
        Assert(!bms_overlap(outer_paramrels, inner_path->parent->relids));
        Assert(!bms_overlap(inner_paramrels, outer_path->parent->relids));

In order to have a dependency on an OJ, a path would have to have
a dependency on at least one of the OJ's base relations too, so
I think these assertions show that the case won't arise.  (Of
course, if someone can trip one of these assertions, I'm wrong.)

Hmm, while this holds in most cases, it does not if the joins have been
commuted according to identity 3.  If we change the t3/t4 join's qual to
't3.a = t4.a' to make hashjoin possible, we'd see the same Assert
failure through try_hashjoin_path.  I think it's also possible for merge
join.

explain (costs off)
select 1 from t t1
         join lateral
           (select t1.a from (select 1) foo offset 0) s1 on true
         join
            (select 1 from t t2
                inner join t t3
                 left join t t4 left join t t5 on t4.a = 1
                on t3.a = t4.a on false
             where t3.a = coalesce(t5.a,1)) as s2
          on true;
server closed the connection unexpectedly

Thanks
Richard

Re: Assert !bms_overlap(joinrel->relids, required_outer)

От
Richard Guo
Дата:

On Thu, Jun 29, 2023 at 10:39 AM Richard Guo <guofenglinux@gmail.com> wrote:
On Wed, Jun 28, 2023 at 10:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
However, given that what we need is to exclude parameterization
that depends on the currently-formed OJ, it seems to me we can do
it more simply and without any new JoinPathExtraData field,
as attached.  What do you think?

I think it makes sense.  At first I wondered if we should also exclude
parameterization that depends on OJs that have already been formed as
part of this joinrel.  But it seems not possible that the input paths
have parameterization dependency on these OJs.  So it should be
sufficient to only consider the currently-formed OJ.

BTW, it seems that extra->sjinfo would always have a valid value here.
So maybe we do not need to check if it is not NULL explicitly?

Thanks
Richard

Re: Assert !bms_overlap(joinrel->relids, required_outer)

От
Richard Guo
Дата:

On Wed, Jun 28, 2023 at 10:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Those cases will go through calc_non_nestloop_required_outer
which has

        /* neither path can require rels from the other */
        Assert(!bms_overlap(outer_paramrels, inner_path->parent->relids));
        Assert(!bms_overlap(inner_paramrels, outer_path->parent->relids));

Looking at these two assertions it occurred to me that shouldn't we
check against top_parent_relids for an otherrel since paths are
parameterized by top-level parents?  We do that in try_nestloop_path.

    /* neither path can require rels from the other */
-   Assert(!bms_overlap(outer_paramrels, inner_path->parent->relids));
-   Assert(!bms_overlap(inner_paramrels, outer_path->parent->relids));
+   Assert(!bms_overlap(outer_paramrels,
+                       inner_path->parent->top_parent_relids ?
+                       inner_path->parent->top_parent_relids :
+                       inner_path->parent->relids));
+   Assert(!bms_overlap(inner_paramrels,
+                       outer_path->parent->top_parent_relids ?
+                       outer_path->parent->top_parent_relids :
+                       outer_path->parent->relids));

This is not related to the issue being discussed here.  Maybe it should
be a separate issue.

Thanks
Richard

Re: Assert !bms_overlap(joinrel->relids, required_outer)

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> BTW, it seems that extra->sjinfo would always have a valid value here.
> So maybe we do not need to check if it is not NULL explicitly?

Right, I was being conservative but this module expects that to
always be provided.

Pushed with that and defenses added to try_mergejoin_path and
try_hashjoin_path.  It looks like the various try_partial_xxx_path
functions already reject cases that could be problematic.  (They
will not accept inner parameterization that would lead to the
result being parameterized differently from the outer path.
By induction, that should be fine.)

            regards, tom lane



Re: Assert !bms_overlap(joinrel->relids, required_outer)

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> On Wed, Jun 28, 2023 at 10:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Those cases will go through calc_non_nestloop_required_outer
>> which has
>> /* neither path can require rels from the other */
>> Assert(!bms_overlap(outer_paramrels, inner_path->parent->relids));
>> Assert(!bms_overlap(inner_paramrels, outer_path->parent->relids));

> Looking at these two assertions it occurred to me that shouldn't we
> check against top_parent_relids for an otherrel since paths are
> parameterized by top-level parents?  We do that in try_nestloop_path.

Yeah, while looking at this I was wondering why try_mergejoin_path and
try_hashjoin_path don't do the same "Paths are parameterized by
top-level parents, so run parameterization tests on the parent relids"
dance that try_nestloop_path does.  This omission is consistent with
that, but it's not obvious why it'd be okay to skip it for
non-nestloop joins.  I guess we'd have noticed by now if it wasn't
okay, but ...

            regards, tom lane



Re: Assert !bms_overlap(joinrel->relids, required_outer)

От
Richard Guo
Дата:

On Fri, Jun 30, 2023 at 12:16 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Pushed with that and defenses added to try_mergejoin_path and
try_hashjoin_path.  It looks like the various try_partial_xxx_path
functions already reject cases that could be problematic.  (They
will not accept inner parameterization that would lead to the
result being parameterized differently from the outer path.
By induction, that should be fine.)

Thanks for pushing it!

Yeah, I also checked that and there is no problem with partial join
paths.  However I found some opportunities for trivial revises there and
created a new patch for those at [1].

[1] https://www.postgresql.org/message-id/flat/CAMbWs48mKJ6g_GnYNa7dnw04MHaMK-jnAEBrMVhTp2uUg3Ut4A%40mail.gmail.com

Thanks
Richard

Re: Assert !bms_overlap(joinrel->relids, required_outer)

От
Richard Guo
Дата:

On Fri, Jun 30, 2023 at 12:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> On Wed, Jun 28, 2023 at 10:09 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Those cases will go through calc_non_nestloop_required_outer
>> which has
>> /* neither path can require rels from the other */
>> Assert(!bms_overlap(outer_paramrels, inner_path->parent->relids));
>> Assert(!bms_overlap(inner_paramrels, outer_path->parent->relids));

> Looking at these two assertions it occurred to me that shouldn't we
> check against top_parent_relids for an otherrel since paths are
> parameterized by top-level parents?  We do that in try_nestloop_path.

Yeah, while looking at this I was wondering why try_mergejoin_path and
try_hashjoin_path don't do the same "Paths are parameterized by
top-level parents, so run parameterization tests on the parent relids"
dance that try_nestloop_path does.  This omission is consistent with
that, but it's not obvious why it'd be okay to skip it for
non-nestloop joins.  I guess we'd have noticed by now if it wasn't
okay, but ...

I think it just makes these two assertions meaningless to skip it for
non-nestloop joins if the input paths are for otherrels, because paths
would never be parameterized by the member relations.  So these two
assertions would always be true for otherrel paths.  I think this is why
we have not noticed any problem by now.

However, this is not what we want.  What we want is to verify that
neither input path for the joinrel can require rels from the other, even
for otherrel paths.  So I think the current code is not right for that.
We need to check against top_parent_relids for otherrel paths, and that
would make these assertions meaningful.

Thanks
Richard

Re: Assert !bms_overlap(joinrel->relids, required_outer)

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> On Fri, Jun 30, 2023 at 12:20 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yeah, while looking at this I was wondering why try_mergejoin_path and
>> try_hashjoin_path don't do the same "Paths are parameterized by
>> top-level parents, so run parameterization tests on the parent relids"
>> dance that try_nestloop_path does.  This omission is consistent with
>> that, but it's not obvious why it'd be okay to skip it for
>> non-nestloop joins.  I guess we'd have noticed by now if it wasn't
>> okay, but ...

> I think it just makes these two assertions meaningless to skip it for
> non-nestloop joins if the input paths are for otherrels, because paths
> would never be parameterized by the member relations.  So these two
> assertions would always be true for otherrel paths.  I think this is why
> we have not noticed any problem by now.

After studying this some more, I think that maybe it's the "run
parameterization tests on the parent relids" bit that is misguided.
I believe the way it's really working is that all paths arriving
here are parameterized by top parents, because that's the only thing
we generate to start with.  A path can only become parameterized
by an otherrel when we apply reparameterize_path_by_child to it.
But the only place that happens is in try_nestloop_path itself
(or try_partial_nestloop_path), and then we immediately wrap it in
a nestloop join node, which becomes a child of an append that's
forming a partitionwise join.  The partitionwise join as a
whole won't be parameterized by any child rels.  So I think that
a path that's parameterized by a child rel can't exist "in the wild"
in a way that would allow it to get fed to one of the try_xxx_path
functions.  This explains why the seeming oversights in the merge
and hash cases aren't causing a problem.

If this theory is correct, we could simplify try_nestloop_path a
bit.  I doubt the code savings would matter, but maybe it's
worth changing for clarity's sake.

            regards, tom lane



Re: Assert !bms_overlap(joinrel->relids, required_outer)

От
Richard Guo
Дата:

On Fri, Jun 30, 2023 at 11:00 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> I think it just makes these two assertions meaningless to skip it for
> non-nestloop joins if the input paths are for otherrels, because paths
> would never be parameterized by the member relations.  So these two
> assertions would always be true for otherrel paths.  I think this is why
> we have not noticed any problem by now.

After studying this some more, I think that maybe it's the "run
parameterization tests on the parent relids" bit that is misguided.
I believe the way it's really working is that all paths arriving
here are parameterized by top parents, because that's the only thing
we generate to start with.  A path can only become parameterized
by an otherrel when we apply reparameterize_path_by_child to it.
But the only place that happens is in try_nestloop_path itself
(or try_partial_nestloop_path), and then we immediately wrap it in
a nestloop join node, which becomes a child of an append that's
forming a partitionwise join.  The partitionwise join as a
whole won't be parameterized by any child rels.  So I think that
a path that's parameterized by a child rel can't exist "in the wild"
in a way that would allow it to get fed to one of the try_xxx_path
functions.  This explains why the seeming oversights in the merge
and hash cases aren't causing a problem.

If this theory is correct, we could simplify try_nestloop_path a
bit.  I doubt the code savings would matter, but maybe it's
worth changing for clarity's sake.

Yeah, I think this theory is correct that all paths arriving at
try_xxx_path are parameterized by top parents.  But I do not get how to
simplify try_nestloop_path on the basis of that.  Would you please
elaborate on that?

Thanks
Richard