Обсуждение: Oversight in reparameterize_path_by_child leading to executor crash

Поиск
Список
Период
Сортировка

Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:
For paths of type 'T_Path', reparameterize_path_by_child just does the
flat-copy but does not adjust the expressions that have lateral
references.  This would have problems for partitionwise-join.  As an
example, consider

regression=# explain (costs off)
select * from prt1 t1 join lateral
    (select * from prt1 t2 TABLESAMPLE SYSTEM (t1.a)) s
on t1.a = s.a;
               QUERY PLAN
-----------------------------------------
 Append
   ->  Nested Loop
         ->  Seq Scan on prt1_p1 t1_1
         ->  Sample Scan on prt1_p1 t2_1
               Sampling: system (t1.a)
               Filter: (t1_1.a = a)
   ->  Nested Loop
         ->  Seq Scan on prt1_p2 t1_2
         ->  Sample Scan on prt1_p2 t2_2
               Sampling: system (t1.a)
               Filter: (t1_2.a = a)
   ->  Nested Loop
         ->  Seq Scan on prt1_p3 t1_3
         ->  Sample Scan on prt1_p3 t2_3
               Sampling: system (t1.a)
               Filter: (t1_3.a = a)
(16 rows)

Note that the lateral references in the sampling info are not
reparameterized correctly.  They are supposed to reference the child
relations, but as we can see from the plan they are still referencing
the top parent relation.  Running this plan would lead to executor
crash.

regression=# explain (analyze, costs off)
select * from prt1 t1 join lateral
    (select * from prt1 t2 TABLESAMPLE SYSTEM (t1.a)) s
on t1.a = s.a;
server closed the connection unexpectedly

In this case what we need to do is to adjust the TableSampleClause to
refer to the correct child relations.  We can do that with the help of
adjust_appendrel_attrs_multilevel().  One problem is that the
TableSampleClause is stored in RangeTblEntry, and it does not seem like
a good practice to alter RangeTblEntry in this place.  What if we end up
choosing the non-partitionwise-join path as the cheapest one?  In that
case altering the RangeTblEntry here would cause a problem of the
opposite side: the lateral references in TableSampleClause should refer
to the top parent relation but they are referring to the child
relations.

So what I'm thinking is that maybe we can add a new type of path, named
SampleScanPath, to have the TableSampleClause per path.  Then we can
safely reparameterize the TableSampleClause as needed for each
SampleScanPath.  That's what the attached patch does.

There are some other plan types that do not have a separate path type
but may have lateral references in expressions stored in RangeTblEntry,
such as FunctionScan, TableFuncScan and ValuesScan.  But it's not clear
to me if there are such problems with them too.

Thanks
Richard
Вложения

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> In this case what we need to do is to adjust the TableSampleClause to
> refer to the correct child relations.  We can do that with the help of
> adjust_appendrel_attrs_multilevel().  One problem is that the
> TableSampleClause is stored in RangeTblEntry, and it does not seem like
> a good practice to alter RangeTblEntry in this place.

Ugh.  That's why we didn't think to adjust it, obviously.  You are right
that we can't just modify the RTE on the fly, since it's shared with
other non-parameterized paths.

> So what I'm thinking is that maybe we can add a new type of path, named
> SampleScanPath, to have the TableSampleClause per path.  Then we can
> safely reparameterize the TableSampleClause as needed for each
> SampleScanPath.  That's what the attached patch does.

Alternatively, could we postpone the reparameterization until
createplan.c?  Having to build reparameterized expressions we might
not use seems expensive, and it would also contribute to the memory
bloat being complained of in nearby threads.

> There are some other plan types that do not have a separate path type
> but may have lateral references in expressions stored in RangeTblEntry,
> such as FunctionScan, TableFuncScan and ValuesScan.  But it's not clear
> to me if there are such problems with them too.

Hmm, well, not for partition-wise join anyway.

            regards, tom lane



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Tue, Aug 1, 2023 at 9:20 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> So what I'm thinking is that maybe we can add a new type of path, named
> SampleScanPath, to have the TableSampleClause per path.  Then we can
> safely reparameterize the TableSampleClause as needed for each
> SampleScanPath.  That's what the attached patch does.

Alternatively, could we postpone the reparameterization until
createplan.c?  Having to build reparameterized expressions we might
not use seems expensive, and it would also contribute to the memory
bloat being complained of in nearby threads.

I did think about this option but figured out that it seems beyond the
scope of just fixing SampleScan.  But if we want to optimize the
reparameterization mechanism besides fixing this crash, I think this
option is much better.  I drafted a patch as attached.

Thanks
Richard
Вложения

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Ashutosh Bapat
Дата:
On Tue, Aug 1, 2023 at 6:50 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

>
> Alternatively, could we postpone the reparameterization until
> createplan.c?  Having to build reparameterized expressions we might
> not use seems expensive, and it would also contribute to the memory
> bloat being complained of in nearby threads.
>

As long as the translation happens only once, it should be fine. It's
the repeated translations which should be avoided.

--
Best Wishes,
Ashutosh Bapat



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Thu, Aug 3, 2023 at 12:38 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
On Tue, Aug 1, 2023 at 6:50 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alternatively, could we postpone the reparameterization until
> createplan.c?  Having to build reparameterized expressions we might
> not use seems expensive, and it would also contribute to the memory
> bloat being complained of in nearby threads.

As long as the translation happens only once, it should be fine. It's
the repeated translations which should be avoided.

Sometimes it would help to avoid the translation at all if we postpone
the reparameterization until createplan.c, such as in the case that a
non-parameterized path wins at last.  For instance, for the query below

regression=# explain (costs off)
select * from prt1 t1 join prt1 t2 on t1.a = t2.a;
                 QUERY PLAN
--------------------------------------------
 Append
   ->  Hash Join
         Hash Cond: (t1_1.a = t2_1.a)
         ->  Seq Scan on prt1_p1 t1_1
         ->  Hash
               ->  Seq Scan on prt1_p1 t2_1
   ->  Hash Join
         Hash Cond: (t1_2.a = t2_2.a)
         ->  Seq Scan on prt1_p2 t1_2
         ->  Hash
               ->  Seq Scan on prt1_p2 t2_2
   ->  Hash Join
         Hash Cond: (t1_3.a = t2_3.a)
         ->  Seq Scan on prt1_p3 t1_3
         ->  Hash
               ->  Seq Scan on prt1_p3 t2_3
(16 rows)

Our current code would reparameterize each inner paths although at last
we choose the non-parameterized paths.  If we do the reparameterization
in createplan.c, we would be able to avoid all the translations.

Thanks
Richard

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Ashutosh Bapat
Дата:


On Thu, Aug 3, 2023 at 4:14 PM Richard Guo <guofenglinux@gmail.com> wrote:

On Thu, Aug 3, 2023 at 12:38 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
 
Sometimes it would help to avoid the translation at all if we postpone
the reparameterization until createplan.c, such as in the case that a
non-parameterized path wins at last.  For instance, for the query below

regression=# explain (costs off)
select * from prt1 t1 join prt1 t2 on t1.a = t2.a;
                 QUERY PLAN
--------------------------------------------
 Append
   ->  Hash Join
         Hash Cond: (t1_1.a = t2_1.a)
         ->  Seq Scan on prt1_p1 t1_1
         ->  Hash
               ->  Seq Scan on prt1_p1 t2_1
   ->  Hash Join
         Hash Cond: (t1_2.a = t2_2.a)
         ->  Seq Scan on prt1_p2 t1_2
         ->  Hash
               ->  Seq Scan on prt1_p2 t2_2
   ->  Hash Join
         Hash Cond: (t1_3.a = t2_3.a)
         ->  Seq Scan on prt1_p3 t1_3
         ->  Hash
               ->  Seq Scan on prt1_p3 t2_3
(16 rows)

Our current code would reparameterize each inner paths although at last
we choose the non-parameterized paths.  If we do the reparameterization
in createplan.c, we would be able to avoid all the translations.


I agree. The costs do not change because of reparameterization. The process of creating paths is to estimate costs of different plans. So I think it makes sense to delay anything which doesn't contribute to costing till the final plan is decided.

However, if reparameterization can *not* happen at the planning time, we have chosen a plan which can not be realised meeting a dead end. So as long as we can ensure that the reparameterization is possible we can delay actual translations. I think it should be possible to decide whether reparameterization is possible based on the type of path alone. So seems doable.

--
Best Wishes,
Ashutosh Bapat

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Thu, Aug 3, 2023 at 7:20 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
However, if reparameterization can *not* happen at the planning time, we have chosen a plan which can not be realised meeting a dead end. So as long as we can ensure that the reparameterization is possible we can delay actual translations. I think it should be possible to decide whether reparameterization is possible based on the type of path alone. So seems doable.

That has been done in v2 patch.  See the new added function
path_is_reparameterizable_by_child().

Thanks
Richard

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Ashutosh Bapat
Дата:


On Fri, Aug 4, 2023 at 6:08 AM Richard Guo <guofenglinux@gmail.com> wrote:

On Thu, Aug 3, 2023 at 7:20 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
However, if reparameterization can *not* happen at the planning time, we have chosen a plan which can not be realised meeting a dead end. So as long as we can ensure that the reparameterization is possible we can delay actual translations. I think it should be possible to decide whether reparameterization is possible based on the type of path alone. So seems doable.

That has been done in v2 patch.  See the new added function
path_is_reparameterizable_by_child().

Thanks. The patch looks good overall. I like it because of its potential to reduce memory consumption in reparameterization. That's cake on top of cream :)

Some comments here.

+ {
+ FLAT_COPY_PATH(new_path, path, Path);
+
+ if (path->pathtype == T_SampleScan)
+ {
+ Index scan_relid = path->parent->relid;
+ RangeTblEntry *rte;
+
+ /* it should be a base rel with a tablesample clause... */
+ Assert(scan_relid > 0);
+ rte = planner_rt_fetch(scan_relid, root);
+ Assert(rte->rtekind == RTE_RELATION);
+ Assert(rte->tablesample != NULL);
+
+ ADJUST_CHILD_EXPRS(rte->tablesample, TableSampleClause *);
+ }
+ }

This change makes this function usable only after the final plan has been
selected. If some code accidently uses it earlier, it would corrupt
rte->tablesample without getting detected easily. I think we should mention
this in the function prologue and move it somewhere else. Other option is to a.
leave rte->tablesample unchanged here with a comment as to why we aren't
changing it b. reparameterize tablesample expression in
create_samplescan_plan() if the path is parameterized by the parent. We call
reparameterize_path_by_child() in create_nestloop_plan() as this patch does
right now. I like that we are delaying reparameterization. It saves a bunch of
memory. I haven't measured it but from my recent experiments I know it will be
a lot.

  * If the inner path is parameterized, it is parameterized by the
- * topmost parent of the outer rel, not the outer rel itself.  Fix
- * that.
+ * topmost parent of the outer rel, not the outer rel itself.  We will

There's something wrong with the original sentence (which was probably written
by me back then :)). I think it should have read "If the inner path is
parameterized by the topmost parent of the outer rel instead of the outer rel
itself, fix that.". If you agree, the new comment should change it accordingly.

@@ -2430,6 +2430,16 @@ create_nestloop_path(PlannerInfo *root,
 {
  NestPath   *pathnode = makeNode(NestPath);
  Relids inner_req_outer = PATH_REQ_OUTER(inner_path);
+ Relids outerrelids;
+
+ /*
+ * Paths are parameterized by top-level parents, so run parameterization
+ * tests on the parent relids.
+ */
+ if (outer_path->parent->top_parent_relids)
+ outerrelids = outer_path->parent->top_parent_relids;
+ else
+ outerrelids = outer_path->parent->relids;

This looks like an existing bug. Did partitionwise join paths ended up with
extra restrict clauses without this fix? I am not able to see why this change
is needed by rest of the changes in the patch?

Anyway, let's rename outerrelids to something else e.g.  outer_param_relids to reflect
its true meaning.

+bool
+path_is_reparameterizable_by_child(Path *path)
+{
+ switch (nodeTag(path))
+ {
+ case T_Path:
... snip ...
+ case T_GatherPath:
+ return true;
+ default:
+
+ /* We don't know how to reparameterize this path. */
+ return false;
+ }
+
+ return false;
+}

How do we make sure that any changes to reparameterize_path_by_child() are
reflected here? One way is to call this function from
reparameterize_path_by_child() the first thing and return if the path can not
be reparameterized.

I haven't gone through each path's translation to make sure that it's possible
to do that during create_nestloop_plan(). I will do that in my next round of
review if time permits. But AFAIR, each of the translations are possible during
create_nestloop_plan() when it happens in this patch.

-#define ADJUST_CHILD_ATTRS(node) \
+#define ADJUST_CHILD_EXPRS(node, fieldtype) \
  ((node) = \
- (List *) adjust_appendrel_attrs_multilevel(root, (Node *) (node), \
- child_rel, \
- child_rel->top_parent))
+ (fieldtype) adjust_appendrel_attrs_multilevel(root, (Node *) (node), \
+   child_rel, \
+   child_rel->top_parent))
+
+#define ADJUST_CHILD_ATTRS(node) ADJUST_CHILD_EXPRS(node, List *)

IIRC, ATTRS here is taken from the function being called. I think we should
just change the macro definition, not its name. If you would like to have a
separate macro for List nodes, create ADJUST_CHILD_ATTRS_IN_LIST or some such.

--
Best Wishes,
Ashutosh Bapat

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Mon, Aug 7, 2023 at 9:29 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
Thanks. The patch looks good overall. I like it because of its potential to reduce memory consumption in reparameterization. That's cake on top of cream :)

Thanks for reviewing this patch!
 
Some comments here.

+ {
+ FLAT_COPY_PATH(new_path, path, Path);
+
+ if (path->pathtype == T_SampleScan)
+ {
+ Index scan_relid = path->parent->relid;
+ RangeTblEntry *rte;
+
+ /* it should be a base rel with a tablesample clause... */
+ Assert(scan_relid > 0);
+ rte = planner_rt_fetch(scan_relid, root);
+ Assert(rte->rtekind == RTE_RELATION);
+ Assert(rte->tablesample != NULL);
+
+ ADJUST_CHILD_EXPRS(rte->tablesample, TableSampleClause *);
+ }
+ }

This change makes this function usable only after the final plan has been
selected. If some code accidently uses it earlier, it would corrupt
rte->tablesample without getting detected easily. I think we should mention
this in the function prologue and move it somewhere else. Other option is to a.
leave rte->tablesample unchanged here with a comment as to why we aren't
changing it b. reparameterize tablesample expression in
create_samplescan_plan() if the path is parameterized by the parent. We call
reparameterize_path_by_child() in create_nestloop_plan() as this patch does
right now. I like that we are delaying reparameterization. It saves a bunch of
memory. I haven't measured it but from my recent experiments I know it will be
a lot.

I agree that we should mention in the function's comment that
reparameterize_path_by_child can only be used after the best path has
been selected because the RTE might be modified by this function.  I'm
not sure if it's a good idea to move the reparameterization of
tablesample to create_samplescan_plan().  That would make the
reparameterization work separated in different places.  And in the
future we may find more cases where we need modify RTEs or RELs for
reparameterization.  It seems better to me that we keep all the work in
one place.
 
  * If the inner path is parameterized, it is parameterized by the
- * topmost parent of the outer rel, not the outer rel itself.  Fix
- * that.
+ * topmost parent of the outer rel, not the outer rel itself.  We will

There's something wrong with the original sentence (which was probably written
by me back then :)). I think it should have read "If the inner path is
parameterized by the topmost parent of the outer rel instead of the outer rel
itself, fix that.". If you agree, the new comment should change it accordingly.

Right.  Will do that.
 
@@ -2430,6 +2430,16 @@ create_nestloop_path(PlannerInfo *root,
 {
  NestPath   *pathnode = makeNode(NestPath);
  Relids inner_req_outer = PATH_REQ_OUTER(inner_path);
+ Relids outerrelids;
+
+ /*
+ * Paths are parameterized by top-level parents, so run parameterization
+ * tests on the parent relids.
+ */
+ if (outer_path->parent->top_parent_relids)
+ outerrelids = outer_path->parent->top_parent_relids;
+ else
+ outerrelids = outer_path->parent->relids;

This looks like an existing bug. Did partitionwise join paths ended up with
extra restrict clauses without this fix? I am not able to see why this change
is needed by rest of the changes in the patch?

This is not an existing bug.  Our current code (without this patch)
would always call create_nestloop_path() after the reparameterization
work has been done for the inner path.  So we can safely check against
outer rel (not its topmost parent) in create_nestloop_path().  But now
with this patch, the reparameterization is postponed until createplan.c,
so we have to check against the topmost parent of outer rel in
create_nestloop_path(), otherwise we'd have duplicate clauses in the
final plan.  For instance, without this change you'd get a plan like

regression=# explain (costs off)
select * from prt1 t1 left join lateral
    (select t1.a as t1a, t2.* from prt1 t2) s on t1.a = s.a;
                       QUERY PLAN
---------------------------------------------------------
 Append
   ->  Nested Loop Left Join
         Join Filter: (t1_1.a = t2_1.a)
         ->  Seq Scan on prt1_p1 t1_1
         ->  Index Scan using iprt1_p1_a on prt1_p1 t2_1
               Index Cond: (a = t1_1.a)
   ->  Nested Loop Left Join
         Join Filter: (t1_2.a = t2_2.a)
         ->  Seq Scan on prt1_p2 t1_2
         ->  Index Scan using iprt1_p2_a on prt1_p2 t2_2
               Index Cond: (a = t1_2.a)
   ->  Nested Loop Left Join
         Join Filter: (t1_3.a = t2_3.a)
         ->  Seq Scan on prt1_p3 t1_3
         ->  Index Scan using iprt1_p3_a on prt1_p3 t2_3
               Index Cond: (a = t1_3.a)
(16 rows)

Note that the clauses in Join Filter are duplicate.
 
Anyway, let's rename outerrelids to something else e.g.  outer_param_relids to reflect
its true meaning.

Hmm, I'm not sure this is a good idea.  Here the 'outerrelids' is just
the relids of the outer rel or outer rel's topmost parent.  I think it's
better to keep it as-is, and this is consistent with 'outerrelids' in
try_nestloop_path().
 
+bool
+path_is_reparameterizable_by_child(Path *path)
+{
+ switch (nodeTag(path))
+ {
+ case T_Path:
... snip ...
+ case T_GatherPath:
+ return true;
+ default:
+
+ /* We don't know how to reparameterize this path. */
+ return false;
+ }
+
+ return false;
+}

How do we make sure that any changes to reparameterize_path_by_child() are
reflected here? One way is to call this function from
reparameterize_path_by_child() the first thing and return if the path can not
be reparameterized.

Good question.  But I don't think calling this function at the beginning
of reparameterize_path_by_child() can solve this problem.  Even if we do
that, if we find that the path cannot be reparameterized in
reparameterize_path_by_child(), it would be too late for us to take any
measures.  So we need to make sure that such situation cannot happen.  I
think we can emphasize this point in the comments of the two functions,
and meanwhile add an Assert in reparameterize_path_by_child() that the
path must be reparameterizable.
 
-#define ADJUST_CHILD_ATTRS(node) \
+#define ADJUST_CHILD_EXPRS(node, fieldtype) \
  ((node) = \
- (List *) adjust_appendrel_attrs_multilevel(root, (Node *) (node), \
- child_rel, \
- child_rel->top_parent))
+ (fieldtype) adjust_appendrel_attrs_multilevel(root, (Node *) (node), \
+   child_rel, \
+   child_rel->top_parent))
+
+#define ADJUST_CHILD_ATTRS(node) ADJUST_CHILD_EXPRS(node, List *)

IIRC, ATTRS here is taken from the function being called. I think we should
just change the macro definition, not its name. If you would like to have a
separate macro for List nodes, create ADJUST_CHILD_ATTRS_IN_LIST or some such.

Agreed.  I introduced the new macro ADJUST_CHILD_EXPRS just to keep
macro ADJUST_CHILD_ATTRS as-is, so callers to ADJUST_CHILD_ATTRS can
keep unchanged.  It seems better to just change ADJUST_CHILD_ATTRS as
well as its callers.  Will do that.

Attaching v3 patch to address all the reviews above.

Thanks
Richard
Вложения

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Ashutosh Bapat
Дата:
On Tue, Aug 8, 2023 at 12:44 PM Richard Guo <guofenglinux@gmail.com> wrote:
>
>
> I agree that we should mention in the function's comment that
> reparameterize_path_by_child can only be used after the best path has
> been selected because the RTE might be modified by this function.  I'm
> not sure if it's a good idea to move the reparameterization of
> tablesample to create_samplescan_plan().  That would make the
> reparameterization work separated in different places.  And in the
> future we may find more cases where we need modify RTEs or RELs for
> reparameterization.  It seems better to me that we keep all the work in
> one place.

Let's see what a committer thinks. Let's leave it like that for now.

>
>
> This is not an existing bug.  Our current code (without this patch)
> would always call create_nestloop_path() after the reparameterization
> work has been done for the inner path.  So we can safely check against
> outer rel (not its topmost parent) in create_nestloop_path().  But now
> with this patch, the reparameterization is postponed until createplan.c,
> so we have to check against the topmost parent of outer rel in
> create_nestloop_path(), otherwise we'd have duplicate clauses in the
> final plan.

Hmm. I am worried about the impact this will have when the nestloop
path is created for two child relations. Your code will still pick the
top_parent_relids which seems odd. Have you tested that case?

> For instance, without this change you'd get a plan like
>
> regression=# explain (costs off)
> select * from prt1 t1 left join lateral
>     (select t1.a as t1a, t2.* from prt1 t2) s on t1.a = s.a;
>                        QUERY PLAN
... snip ...
>
> Note that the clauses in Join Filter are duplicate.

Thanks for the example.

>
> Good question.  But I don't think calling this function at the beginning
> of reparameterize_path_by_child() can solve this problem.  Even if we do
> that, if we find that the path cannot be reparameterized in
> reparameterize_path_by_child(), it would be too late for us to take any
> measures.  So we need to make sure that such situation cannot happen.  I
> think we can emphasize this point in the comments of the two functions,
> and meanwhile add an Assert in reparameterize_path_by_child() that the
> path must be reparameterizable.

Works for me.

>
> Attaching v3 patch to address all the reviews above.

The patch looks good to me.

Please add this to the next commitfest.

--
Best Wishes,
Ashutosh Bapat



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Tue, Aug 8, 2023 at 7:34 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
On Tue, Aug 8, 2023 at 12:44 PM Richard Guo <guofenglinux@gmail.com> wrote:
> This is not an existing bug.  Our current code (without this patch)
> would always call create_nestloop_path() after the reparameterization
> work has been done for the inner path.  So we can safely check against
> outer rel (not its topmost parent) in create_nestloop_path().  But now
> with this patch, the reparameterization is postponed until createplan.c,
> so we have to check against the topmost parent of outer rel in
> create_nestloop_path(), otherwise we'd have duplicate clauses in the
> final plan.

Hmm. I am worried about the impact this will have when the nestloop
path is created for two child relations. Your code will still pick the
top_parent_relids which seems odd. Have you tested that case?

Well, the way it's working is that for child rels all parameterized
paths arriving at try_nestloop_path (or try_partial_nestloop_path) are
parameterized by top parents at first.  Then our current code (without
this patch) applies reparameterize_path_by_child to the inner path
before calling create_nestloop_path().  So in create_nestloop_path() the
inner path is parameterized by child rel.  That's why we check against
outer rel itself not its top parent there.

With this patch, the reparameterize_path_by_child work is postponed
until createplan time, so in create_nestloop_path() the inner path is
still parameterized by top parent.  So we have to check against the top
parent of outer rel.

I think the query shown upthread is sufficient to verify this theory.

regression=# explain (costs off)
select * from prt1 t1 left join lateral
    (select t1.a as t1a, t2.* from prt1 t2) s on t1.a = s.a;
                       QUERY PLAN
---------------------------------------------------------
 Append
   ->  Nested Loop Left Join
         ->  Seq Scan on prt1_p1 t1_1
         ->  Index Scan using iprt1_p1_a on prt1_p1 t2_1
               Index Cond: (a = t1_1.a)
   ->  Nested Loop Left Join
         ->  Seq Scan on prt1_p2 t1_2
         ->  Index Scan using iprt1_p2_a on prt1_p2 t2_2
               Index Cond: (a = t1_2.a)
   ->  Nested Loop Left Join
         ->  Seq Scan on prt1_p3 t1_3
         ->  Index Scan using iprt1_p3_a on prt1_p3 t2_3
               Index Cond: (a = t1_3.a)
(13 rows)
 
Please add this to the next commitfest.

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Ashutosh Bapat
Дата:
On Wed, Aug 9, 2023 at 8:14 AM Richard Guo <guofenglinux@gmail.com> wrote:
>
> With this patch, the reparameterize_path_by_child work is postponed
> until createplan time, so in create_nestloop_path() the inner path is
> still parameterized by top parent.  So we have to check against the top
> parent of outer rel.
>

Your changes in create_nestloop_path() only affect the check for
parameterized path not the unparameterized paths. So I think we are
good there. My worry was misplaced.

--
Best Wishes,
Ashutosh Bapat



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Tom Lane
Дата:
I looked over the v3 patch.  I think it's going in generally
the right direction, but there is a huge oversight:
path_is_reparameterizable_by_child does not come close to modeling
the success/failure conditions of reparameterize_path_by_child.
In all the cases where reparameterize_path_by_child can recurse,
path_is_reparameterizable_by_child has to do so also, to check
whether the child path is reparameterizable.  (I'm somewhat
disturbed that apparently we have no test cases that caught that
oversight; can we add one cheaply?)

BTW, I also note from the cfbot that 9e9931d2b broke this patch by
adding more ADJUST_CHILD_ATTRS calls that need to be modified.
I wonder if we could get away with having that macro cast to "void *"
to avoid needing to change all its call sites.  I'm not sure whether
pickier compilers might warn about doing it that way.

            regards, tom lane



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Sun, Aug 20, 2023 at 11:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I looked over the v3 patch.  I think it's going in generally
the right direction, but there is a huge oversight:
path_is_reparameterizable_by_child does not come close to modeling
the success/failure conditions of reparameterize_path_by_child.
In all the cases where reparameterize_path_by_child can recurse,
path_is_reparameterizable_by_child has to do so also, to check
whether the child path is reparameterizable.  (I'm somewhat
disturbed that apparently we have no test cases that caught that
oversight; can we add one cheaply?)

Thanks for pointing this out.  It's my oversight.  We have to check the
child path(s) recursively to tell if a path is reparameterizable or not.
I've fixed this in v4 patch, along with a test case.  In the test case
we have a MemoizePath whose subpath is TidPath, which is not
reparameterizable.  This test case would trigger Assert in v3 patch.
 
BTW, I also note from the cfbot that 9e9931d2b broke this patch by
adding more ADJUST_CHILD_ATTRS calls that need to be modified.
I wonder if we could get away with having that macro cast to "void *"
to avoid needing to change all its call sites.  I'm not sure whether
pickier compilers might warn about doing it that way.

Agreed and have done that in v4 patch.

Thanks
Richard
Вложения

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Ashutosh Bapat
Дата:
On Mon, Aug 21, 2023 at 4:09 PM Richard Guo <guofenglinux@gmail.com> wrote:
>
>
> On Sun, Aug 20, 2023 at 11:48 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> I looked over the v3 patch.  I think it's going in generally
>> the right direction, but there is a huge oversight:
>> path_is_reparameterizable_by_child does not come close to modeling
>> the success/failure conditions of reparameterize_path_by_child.
>> In all the cases where reparameterize_path_by_child can recurse,
>> path_is_reparameterizable_by_child has to do so also, to check
>> whether the child path is reparameterizable.  (I'm somewhat
>> disturbed that apparently we have no test cases that caught that
>> oversight; can we add one cheaply?)
>
>
> Thanks for pointing this out.  It's my oversight.  We have to check the
> child path(s) recursively to tell if a path is reparameterizable or not.
> I've fixed this in v4 patch, along with a test case.  In the test case
> we have a MemoizePath whose subpath is TidPath, which is not
> reparameterizable.  This test case would trigger Assert in v3 patch.

This goes back to my question about how do we keep
path_is_reparameterizable_by_child() and
reparameterize_path_by_child() in sync with each other. This makes it
further hard to do so. One idea I have is to use the same function
(reparameterize_path_by_child()) in two modes 1. actual
reparameterization 2. assessing whether reparameterization is
possible. Essentially combing reparameterize_path_by_child() and
path_is_reparameterizable_by_child(). The name of such a function can
be chosen appropriately. The mode will be controlled by a fourth
argument to the function. When assessing a path no translations happen
and no extra memory is allocated.

--
Best Wishes,
Ashutosh Bapat



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Tom Lane
Дата:
I spent some time reviewing the v4 patch.  I noted that
path_is_reparameterizable_by_child still wasn't modeling the pass/fail
behavior of reparameterize_path_by_child very well, because that
function checks this at every recursion level:

    /*
     * If the path is not parameterized by the parent of the given relation,
     * it doesn't need reparameterization.
     */
    if (!path->param_info ||
        !bms_overlap(PATH_REQ_OUTER(path), child_rel->top_parent_relids))
        return path;

So we might have a sub-path (eg a join input rel) that is not one of
the supported kinds, and yet we can succeed because parameterization
appears only in other sub-paths.  The patch as written will not crash
in such a case, but it might refuse to use a path that we previously
would have allowed.  So I think we need to put the same test into
path_is_reparameterizable_by_child, which requires adding child_rel
to its param list but is otherwise simple enough.

I also realized that this test is equivalent to 
PATH_PARAM_BY_PARENT(), which makes it really unnecessary for
createplan.c to test PATH_PARAM_BY_PARENT, so we don't need to
expose those macros globally after all.

(On the same logic, we could skip PATH_PARAM_BY_PARENT at the
call sites of path_is_reparameterizable_by_child.  I didn't
do that in the attached, mainly because it seems to make it
harder to understand/explain what is being tested.)

Another change I made is to put the path_is_reparameterizable_by_child
tests before the initial_cost_nestloop/add_path_precheck steps, on
the grounds that they're now cheap enough that we might as well do
them first.  The existing ordering of these steps was sensible when
we were doing the expensive reparameterization, but it seems a bit
unnatural IMO.

Lastly, although I'd asked for a test case demonstrating detection of
an unparameterizable sub-path, I ended up not using that, because
it seemed pretty fragile.  If somebody decides that
reparameterize_path_by_child ought to cover TidPaths, the test won't
prove anything any more.

So that led me to the attached v5, which seemed committable to me so I
set about back-patching it ... and it fell over immediately in v15, as
shown in the attached regression diffs from v15.  It looks to me like
we are now failing to recognize that reparameterized quals are
redundant with not-reparameterized ones, so this patch is somehow
dependent on restructuring that happened during the v16 cycle.
I don't have time to dig deeper than that, and I'm not sure that that
is an area we'd want to mess with in a back-patched bug fix.

What I'm thinking we ought to do instead for the back branches
is just refuse to generate a reparameterized path for tablesample
scans.  A minimal fix like that could be as little as

        case T_Path:
+           if (path->pathtype == T_SampleScan)
+               return NULL;
            FLAT_COPY_PATH(new_path, path, Path);
            break;

This rejects more than it absolutely has to, because the
parameterization (that we know exists) might be in the path's
regular quals or tlist rather than in the tablesample.
So we could add something to see if the tablesample is
parameter-free, but I'm quite unsure that it's worth the
trouble.  There must be just about nobody using such cases,
or we'd have heard of this bug long ago.

(BTW, I did look at Ashutosh's idea of merging the
reparameterize_path_by_child and path_is_reparameterizable_by_child
functions, but I didn't think that would be an improvement,
because we'd have to clutter reparameterize_path_by_child with
a lot of should-I-skip-this-step tests.  Some of that could be
hidden in the macros, but a lot would not be.  Another issue
is that I do not think we can change reparameterize_path_by_child's
API contract in the back branches, because we advertise it as
something that FDWs and custom scan providers can use.)

Thoughts?

            regards, tom lane

diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 821d282497..e2ecf5b14b 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -30,8 +30,9 @@
 set_join_pathlist_hook_type set_join_pathlist_hook = NULL;

 /*
- * Paths parameterized by the parent can be considered to be parameterized by
- * any of its child.
+ * Paths parameterized by a parent rel can be considered to be parameterized
+ * by any of its children, when we are performing partitionwise joins.  These
+ * macros simplify checking for such cases.  Beware multiple eval of args.
  */
 #define PATH_PARAM_BY_PARENT(path, rel)    \
     ((path)->param_info && bms_overlap(PATH_REQ_OUTER(path),    \
@@ -762,6 +763,20 @@ try_nestloop_path(PlannerInfo *root,
     /* If we got past that, we shouldn't have any unsafe outer-join refs */
     Assert(!have_unsafe_outer_join_ref(root, outerrelids, inner_paramrels));

+    /*
+     * If the inner path is parameterized, it is parameterized by the topmost
+     * parent of the outer rel, not the outer rel itself.  We will need to
+     * translate the parameterization, if this path is chosen, during
+     * create_plan().  Here we just check whether we will be able to perform
+     * the translation, and if not avoid creating a nestloop path.
+     */
+    if (PATH_PARAM_BY_PARENT(inner_path, outer_path->parent) &&
+        !path_is_reparameterizable_by_child(inner_path, outer_path->parent))
+    {
+        bms_free(required_outer);
+        return;
+    }
+
     /*
      * Do a precheck to quickly eliminate obviously-inferior paths.  We
      * calculate a cheap lower bound on the path's cost and then use
@@ -778,27 +793,6 @@ try_nestloop_path(PlannerInfo *root,
                           workspace.startup_cost, workspace.total_cost,
                           pathkeys, required_outer))
     {
-        /*
-         * If the inner path is parameterized, it is parameterized by the
-         * topmost parent of the outer rel, not the outer rel itself.  Fix
-         * that.
-         */
-        if (PATH_PARAM_BY_PARENT(inner_path, outer_path->parent))
-        {
-            inner_path = reparameterize_path_by_child(root, inner_path,
-                                                      outer_path->parent);
-
-            /*
-             * If we could not translate the path, we can't create nest loop
-             * path.
-             */
-            if (!inner_path)
-            {
-                bms_free(required_outer);
-                return;
-            }
-        }
-
         add_path(joinrel, (Path *)
                  create_nestloop_path(root,
                                       joinrel,
@@ -861,6 +855,17 @@ try_partial_nestloop_path(PlannerInfo *root,
             return;
     }

+    /*
+     * If the inner path is parameterized, it is parameterized by the topmost
+     * parent of the outer rel, not the outer rel itself.  We will need to
+     * translate the parameterization, if this path is chosen, during
+     * create_plan().  Here we just check whether we will be able to perform
+     * the translation, and if not avoid creating a nestloop path.
+     */
+    if (PATH_PARAM_BY_PARENT(inner_path, outer_path->parent) &&
+        !path_is_reparameterizable_by_child(inner_path, outer_path->parent))
+        return;
+
     /*
      * Before creating a path, get a quick lower bound on what it is likely to
      * cost.  Bail out right away if it looks terrible.
@@ -870,22 +875,6 @@ try_partial_nestloop_path(PlannerInfo *root,
     if (!add_partial_path_precheck(joinrel, workspace.total_cost, pathkeys))
         return;

-    /*
-     * If the inner path is parameterized, it is parameterized by the topmost
-     * parent of the outer rel, not the outer rel itself.  Fix that.
-     */
-    if (PATH_PARAM_BY_PARENT(inner_path, outer_path->parent))
-    {
-        inner_path = reparameterize_path_by_child(root, inner_path,
-                                                  outer_path->parent);
-
-        /*
-         * If we could not translate the path, we can't create nest loop path.
-         */
-        if (!inner_path)
-            return;
-    }
-
     /* Might be good enough to be worth trying, so let's try it. */
     add_partial_path(joinrel, (Path *)
                      create_nestloop_path(root,
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 34ca6d4ac2..38bd179f4f 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -29,6 +29,7 @@
 #include "optimizer/cost.h"
 #include "optimizer/optimizer.h"
 #include "optimizer/paramassign.h"
+#include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
@@ -4346,6 +4347,22 @@ create_nestloop_plan(PlannerInfo *root,
     List       *nestParams;
     Relids        saveOuterRels = root->curOuterRels;

+    /*
+     * If the inner path is parameterized by the topmost parent of the outer
+     * rel rather than the outer rel itself, fix that.  (Nothing happens here
+     * if it is not so parameterized.)
+     */
+    best_path->jpath.innerjoinpath =
+        reparameterize_path_by_child(root,
+                                     best_path->jpath.innerjoinpath,
+                                     best_path->jpath.outerjoinpath->parent);
+
+    /*
+     * Failure here probably means that reparameterize_path_by_child() is not
+     * in sync with path_is_reparameterizable_by_child().
+     */
+    Assert(best_path->jpath.innerjoinpath != NULL);
+
     /* NestLoop can project, so no need to be picky about child tlists */
     outer_plan = create_plan_recurse(root, best_path->jpath.outerjoinpath, 0);

diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 211ba65389..21d002243c 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -56,6 +56,8 @@ static int    append_startup_cost_compare(const ListCell *a, const ListCell *b);
 static List *reparameterize_pathlist_by_child(PlannerInfo *root,
                                               List *pathlist,
                                               RelOptInfo *child_rel);
+static bool pathlist_is_reparameterizable_by_child(List *pathlist,
+                                                   RelOptInfo *child_rel);


 /*****************************************************************************
@@ -2436,6 +2438,16 @@ create_nestloop_path(PlannerInfo *root,
 {
     NestPath   *pathnode = makeNode(NestPath);
     Relids        inner_req_outer = PATH_REQ_OUTER(inner_path);
+    Relids        outerrelids;
+
+    /*
+     * Paths are parameterized by top-level parents, so run parameterization
+     * tests on the parent relids.
+     */
+    if (outer_path->parent->top_parent_relids)
+        outerrelids = outer_path->parent->top_parent_relids;
+    else
+        outerrelids = outer_path->parent->relids;

     /*
      * If the inner path is parameterized by the outer, we must drop any
@@ -2445,7 +2457,7 @@ create_nestloop_path(PlannerInfo *root,
      * estimates for this path.  We detect such clauses by checking for serial
      * number match to clauses already enforced in the inner path.
      */
-    if (bms_overlap(inner_req_outer, outer_path->parent->relids))
+    if (bms_overlap(inner_req_outer, outerrelids))
     {
         Bitmapset  *enforced_serials = get_param_path_clause_serials(inner_path);
         List       *jclauses = NIL;
@@ -4042,6 +4054,12 @@ reparameterize_path(PlannerInfo *root, Path *path,
  *
  * Currently, only a few path types are supported here, though more could be
  * added at need.  We return NULL if we can't reparameterize the given path.
+ *
+ * Note that this function can change referenced RTEs as well as the Path
+ * structures.  Therefore, it's only safe to call during create_plan(),
+ * when we have made a final choice of which Path to use for each RTE.
+ *
+ * Keep this code in sync with path_is_reparameterizable_by_child()!
  */
 Path *
 reparameterize_path_by_child(PlannerInfo *root, Path *path,
@@ -4054,7 +4072,7 @@ reparameterize_path_by_child(PlannerInfo *root, Path *path,

 #define ADJUST_CHILD_ATTRS(node) \
     ((node) = \
-     (List *) adjust_appendrel_attrs_multilevel(root, (Node *) (node), \
+     (void *) adjust_appendrel_attrs_multilevel(root, (Node *) (node), \
                                                 child_rel, \
                                                 child_rel->top_parent))

@@ -4082,8 +4100,8 @@ do { \
     Relids        required_outer;

     /*
-     * If the path is not parameterized by parent of the given relation, it
-     * doesn't need reparameterization.
+     * If the path is not parameterized by the parent of the given relation,
+     * it doesn't need reparameterization.
      */
     if (!path->param_info ||
         !bms_overlap(PATH_REQ_OUTER(path), child_rel->top_parent_relids))
@@ -4105,6 +4123,19 @@ do { \
     {
         case T_Path:
             FLAT_COPY_PATH(new_path, path, Path);
+            if (path->pathtype == T_SampleScan)
+            {
+                Index        scan_relid = path->parent->relid;
+                RangeTblEntry *rte;
+
+                /* it should be a base rel with a tablesample clause... */
+                Assert(scan_relid > 0);
+                rte = planner_rt_fetch(scan_relid, root);
+                Assert(rte->rtekind == RTE_RELATION);
+                Assert(rte->tablesample != NULL);
+
+                ADJUST_CHILD_ATTRS(rte->tablesample);
+            }
             break;

         case T_IndexPath:
@@ -4335,9 +4366,145 @@ do { \
     return new_path;
 }

+/*
+ * path_is_reparameterizable_by_child
+ *         Given a path parameterized by the parent of the given child relation,
+ *         see if it can be translated to be parameterized by the child relation.
+ *
+ * This must return true if and only if reparameterize_path_by_child()
+ * would succeed on this path.  Currently it's sufficient to verify that
+ * the path and all of its subpaths (if any) are of the types handled by
+ * that function.  However, sub-paths that are not parameterized can be
+ * disregarded since they won't require translation.
+ */
+bool
+path_is_reparameterizable_by_child(Path *path, RelOptInfo *child_rel)
+{
+
+#define CHILD_PATH_IS_REPARAMETERIZABLE(path) \
+do { \
+    if (!path_is_reparameterizable_by_child(path, child_rel)) \
+        return false; \
+} while(0)
+
+#define CHILD_PATH_LIST_IS_REPARAMETERIZABLE(pathlist) \
+do { \
+    if (!pathlist_is_reparameterizable_by_child(pathlist, child_rel)) \
+        return false; \
+} while(0)
+
+    /*
+     * If the path is not parameterized by the parent of the given relation,
+     * it doesn't need reparameterization.
+     */
+    if (!path->param_info ||
+        !bms_overlap(PATH_REQ_OUTER(path), child_rel->top_parent_relids))
+        return true;
+
+    switch (nodeTag(path))
+    {
+        case T_Path:
+        case T_IndexPath:
+            break;
+
+        case T_BitmapHeapPath:
+            {
+                BitmapHeapPath *bhpath = (BitmapHeapPath *) path;
+
+                CHILD_PATH_IS_REPARAMETERIZABLE(bhpath->bitmapqual);
+            }
+            break;
+
+        case T_BitmapAndPath:
+            {
+                BitmapAndPath *bapath = (BitmapAndPath *) path;
+
+                CHILD_PATH_LIST_IS_REPARAMETERIZABLE(bapath->bitmapquals);
+            }
+            break;
+
+        case T_BitmapOrPath:
+            {
+                BitmapOrPath *bopath = (BitmapOrPath *) path;
+
+                CHILD_PATH_LIST_IS_REPARAMETERIZABLE(bopath->bitmapquals);
+            }
+            break;
+
+        case T_ForeignPath:
+            {
+                ForeignPath *fpath = (ForeignPath *) path;
+
+                if (fpath->fdw_outerpath)
+                    CHILD_PATH_IS_REPARAMETERIZABLE(fpath->fdw_outerpath);
+            }
+            break;
+
+        case T_CustomPath:
+            {
+                CustomPath *cpath = (CustomPath *) path;
+
+                CHILD_PATH_LIST_IS_REPARAMETERIZABLE(cpath->custom_paths);
+            }
+            break;
+
+        case T_NestPath:
+        case T_MergePath:
+        case T_HashPath:
+            {
+                JoinPath   *jpath = (JoinPath *) path;
+
+                CHILD_PATH_IS_REPARAMETERIZABLE(jpath->outerjoinpath);
+                CHILD_PATH_IS_REPARAMETERIZABLE(jpath->innerjoinpath);
+            }
+            break;
+
+        case T_AppendPath:
+            {
+                AppendPath *apath = (AppendPath *) path;
+
+                CHILD_PATH_LIST_IS_REPARAMETERIZABLE(apath->subpaths);
+            }
+            break;
+
+        case T_MaterialPath:
+            {
+                MaterialPath *mpath = (MaterialPath *) path;
+
+                CHILD_PATH_IS_REPARAMETERIZABLE(mpath->subpath);
+            }
+            break;
+
+        case T_MemoizePath:
+            {
+                MemoizePath *mpath = (MemoizePath *) path;
+
+                CHILD_PATH_IS_REPARAMETERIZABLE(mpath->subpath);
+            }
+            break;
+
+        case T_GatherPath:
+            {
+                GatherPath *gpath = (GatherPath *) path;
+
+                CHILD_PATH_IS_REPARAMETERIZABLE(gpath->subpath);
+            }
+            break;
+
+        default:
+
+            /* We don't know how to reparameterize this path. */
+            return false;
+    }
+
+    return true;
+}
+
 /*
  * reparameterize_pathlist_by_child
  *         Helper function to reparameterize a list of paths by given child rel.
+ *
+ * Returns NIL to indicate failure, so pathlist had better not be NIL.
  */
 static List *
 reparameterize_pathlist_by_child(PlannerInfo *root,
@@ -4363,3 +4530,23 @@ reparameterize_pathlist_by_child(PlannerInfo *root,

     return result;
 }
+
+/*
+ * pathlist_is_reparameterizable_by_child
+ *        Helper function to check if a list of paths can be reparameterized.
+ */
+static bool
+pathlist_is_reparameterizable_by_child(List *pathlist, RelOptInfo *child_rel)
+{
+    ListCell   *lc;
+
+    foreach(lc, pathlist)
+    {
+        Path       *path = (Path *) lfirst(lc);
+
+        if (!path_is_reparameterizable_by_child(path, child_rel))
+            return false;
+    }
+
+    return true;
+}
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 6e557bebc4..f5f8cbcfb4 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -298,6 +298,8 @@ extern Path *reparameterize_path(PlannerInfo *root, Path *path,
                                  double loop_count);
 extern Path *reparameterize_path_by_child(PlannerInfo *root, Path *path,
                                           RelOptInfo *child_rel);
+extern bool path_is_reparameterizable_by_child(Path *path,
+                                               RelOptInfo *child_rel);

 /*
  * prototypes for relnode.c
diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index 6560fe2416..a11f738411 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -505,6 +505,31 @@ SELECT t1.a, ss.t2a, ss.t2c FROM prt1 t1 LEFT JOIN LATERAL
  550 |     |
 (12 rows)

+-- lateral reference in sample scan
+EXPLAIN (COSTS OFF)
+SELECT * FROM prt1 t1 JOIN LATERAL
+              (SELECT * FROM prt1 t2 TABLESAMPLE SYSTEM (t1.a) REPEATABLE(t1.b)) s
+              ON t1.a = s.a;
+                         QUERY PLAN
+-------------------------------------------------------------
+ Append
+   ->  Nested Loop
+         ->  Seq Scan on prt1_p1 t1_1
+         ->  Sample Scan on prt1_p1 t2_1
+               Sampling: system (t1_1.a) REPEATABLE (t1_1.b)
+               Filter: (t1_1.a = a)
+   ->  Nested Loop
+         ->  Seq Scan on prt1_p2 t1_2
+         ->  Sample Scan on prt1_p2 t2_2
+               Sampling: system (t1_2.a) REPEATABLE (t1_2.b)
+               Filter: (t1_2.a = a)
+   ->  Nested Loop
+         ->  Seq Scan on prt1_p3 t1_3
+         ->  Sample Scan on prt1_p3 t2_3
+               Sampling: system (t1_3.a) REPEATABLE (t1_3.b)
+               Filter: (t1_3.a = a)
+(16 rows)
+
 -- bug with inadequate sort key representation
 SET enable_partitionwise_aggregate TO true;
 SET enable_hashjoin TO false;
@@ -1944,6 +1969,41 @@ SELECT * FROM prt1_l t1 LEFT JOIN LATERAL
  550 | 0 | 0002 |     |      |     |     |
 (12 rows)

+-- partitionwise join with lateral reference in sample scan
+EXPLAIN (COSTS OFF)
+SELECT * FROM prt1_l t1 JOIN LATERAL
+              (SELECT * FROM prt1_l t2 TABLESAMPLE SYSTEM (t1.a) REPEATABLE(t1.b)) s ON
+              t1.a = s.a AND t1.b = s.b AND t1.c = s.c;
+                                       QUERY PLAN
+----------------------------------------------------------------------------------------
+ Append
+   ->  Nested Loop
+         ->  Seq Scan on prt1_l_p1 t1_1
+         ->  Sample Scan on prt1_l_p1 t2_1
+               Sampling: system (t1_1.a) REPEATABLE (t1_1.b)
+               Filter: ((t1_1.a = a) AND (t1_1.b = b) AND ((t1_1.c)::text = (c)::text))
+   ->  Nested Loop
+         ->  Seq Scan on prt1_l_p2_p1 t1_2
+         ->  Sample Scan on prt1_l_p2_p1 t2_2
+               Sampling: system (t1_2.a) REPEATABLE (t1_2.b)
+               Filter: ((t1_2.a = a) AND (t1_2.b = b) AND ((t1_2.c)::text = (c)::text))
+   ->  Nested Loop
+         ->  Seq Scan on prt1_l_p2_p2 t1_3
+         ->  Sample Scan on prt1_l_p2_p2 t2_3
+               Sampling: system (t1_3.a) REPEATABLE (t1_3.b)
+               Filter: ((t1_3.a = a) AND (t1_3.b = b) AND ((t1_3.c)::text = (c)::text))
+   ->  Nested Loop
+         ->  Seq Scan on prt1_l_p3_p1 t1_4
+         ->  Sample Scan on prt1_l_p3_p1 t2_4
+               Sampling: system (t1_4.a) REPEATABLE (t1_4.b)
+               Filter: ((t1_4.a = a) AND (t1_4.b = b) AND ((t1_4.c)::text = (c)::text))
+   ->  Nested Loop
+         ->  Seq Scan on prt1_l_p3_p2 t1_5
+         ->  Sample Scan on prt1_l_p3_p2 t2_5
+               Sampling: system (t1_5.a) REPEATABLE (t1_5.b)
+               Filter: ((t1_5.a = a) AND (t1_5.b = b) AND ((t1_5.c)::text = (c)::text))
+(26 rows)
+
 -- join with one side empty
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM (SELECT * FROM prt1_l WHERE a = 1 AND a = 2) t1 RIGHT JOIN prt2_l t2 ON t1.a = t2.b
ANDt1.b = t2.a AND t1.c = t2.c; 
diff --git a/src/test/regress/sql/partition_join.sql b/src/test/regress/sql/partition_join.sql
index 48daf3aee3..e2daab03fb 100644
--- a/src/test/regress/sql/partition_join.sql
+++ b/src/test/regress/sql/partition_join.sql
@@ -100,6 +100,12 @@ SELECT t1.a, ss.t2a, ss.t2c FROM prt1 t1 LEFT JOIN LATERAL
               (SELECT t2.a AS t2a, t3.a AS t3a, t2.b t2b, t2.c t2c, least(t1.a,t2.a,t3.a) FROM prt1 t2 JOIN prt2 t3 ON
(t2.a= t3.b)) ss 
               ON t1.c = ss.t2c WHERE (t1.b + coalesce(ss.t2b, 0)) = 0 ORDER BY t1.a;

+-- lateral reference in sample scan
+EXPLAIN (COSTS OFF)
+SELECT * FROM prt1 t1 JOIN LATERAL
+              (SELECT * FROM prt1 t2 TABLESAMPLE SYSTEM (t1.a) REPEATABLE(t1.b)) s
+              ON t1.a = s.a;
+
 -- bug with inadequate sort key representation
 SET enable_partitionwise_aggregate TO true;
 SET enable_hashjoin TO false;
@@ -387,6 +393,12 @@ SELECT * FROM prt1_l t1 LEFT JOIN LATERAL
               (SELECT t2.a AS t2a, t2.c AS t2c, t2.b AS t2b, t3.b AS t3b, least(t1.a,t2.a,t3.b) FROM prt1_l t2 JOIN
prt2_lt3 ON (t2.a = t3.b AND t2.c = t3.c)) ss 
               ON t1.a = ss.t2a AND t1.c = ss.t2c WHERE t1.b = 0 ORDER BY t1.a;

+-- partitionwise join with lateral reference in sample scan
+EXPLAIN (COSTS OFF)
+SELECT * FROM prt1_l t1 JOIN LATERAL
+              (SELECT * FROM prt1_l t2 TABLESAMPLE SYSTEM (t1.a) REPEATABLE(t1.b)) s ON
+              t1.a = s.a AND t1.b = s.b AND t1.c = s.c;
+
 -- join with one side empty
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM (SELECT * FROM prt1_l WHERE a = 1 AND a = 2) t1 RIGHT JOIN prt2_l t2 ON t1.a = t2.b
ANDt1.b = t2.a AND t1.c = t2.c; 
diff -U3 /home/postgres/REL_15/pgsql/src/test/regress/expected/partition_join.out
/home/postgres/REL_15/pgsql/src/test/regress/results/partition_join.out
--- /home/postgres/REL_15/pgsql/src/test/regress/expected/partition_join.out    2023-08-21 15:00:46.578980354 -0400
+++ /home/postgres/REL_15/pgsql/src/test/regress/results/partition_join.out    2023-08-21 15:12:41.109495743 -0400
@@ -123,11 +123,12 @@
                      ->  Seq Scan on prt2_p2 t2_2
                            Filter: (a = 0)
          ->  Nested Loop Left Join
+               Join Filter: (t1_3.a = t2_3.b)
                ->  Seq Scan on prt2_p3 t2_3
                      Filter: (a = 0)
                ->  Index Scan using iprt1_p3_a on prt1_p3 t1_3
                      Index Cond: (a = t2_3.b)
-(20 rows)
+(21 rows)

 SELECT t1.a, t1.c, t2.b, t2.c FROM prt1 t1 RIGHT JOIN prt2 t2 ON t1.a = t2.b WHERE t2.a = 0 ORDER BY t1.a, t2.b;
   a  |  c   |  b  |  c
@@ -362,30 +363,36 @@
    Sort Key: t1.a
    ->  Append
          ->  Nested Loop Left Join
+               Join Filter: (t1_1.a = t2_1.a)
                ->  Seq Scan on prt1_p1 t1_1
                      Filter: (b = 0)
                ->  Nested Loop
+                     Join Filter: (t2_1.a = t3_1.b)
                      ->  Index Only Scan using iprt1_p1_a on prt1_p1 t2_1
                            Index Cond: (a = t1_1.a)
                      ->  Index Scan using iprt2_p1_b on prt2_p1 t3_1
                            Index Cond: (b = t2_1.a)
          ->  Nested Loop Left Join
+               Join Filter: (t1_2.a = t2_2.a)
                ->  Seq Scan on prt1_p2 t1_2
                      Filter: (b = 0)
                ->  Nested Loop
+                     Join Filter: (t2_2.a = t3_2.b)
                      ->  Index Only Scan using iprt1_p2_a on prt1_p2 t2_2
                            Index Cond: (a = t1_2.a)
                      ->  Index Scan using iprt2_p2_b on prt2_p2 t3_2
                            Index Cond: (b = t2_2.a)
          ->  Nested Loop Left Join
+               Join Filter: (t1_3.a = t2_3.a)
                ->  Seq Scan on prt1_p3 t1_3
                      Filter: (b = 0)
                ->  Nested Loop
+                     Join Filter: (t2_3.a = t3_3.b)
                      ->  Index Only Scan using iprt1_p3_a on prt1_p3 t2_3
                            Index Cond: (a = t1_3.a)
                      ->  Index Scan using iprt2_p3_b on prt2_p3 t3_3
                            Index Cond: (b = t2_3.a)
-(27 rows)
+(33 rows)

 SELECT * FROM prt1 t1 LEFT JOIN LATERAL
               (SELECT t2.a AS t2a, t3.a AS t3a, least(t1.a,t2.a,t3.b) FROM prt1 t2 JOIN prt2 t3 ON (t2.a = t3.b)) ss
@@ -468,21 +475,24 @@
 -------------------------------------------------------------
  Append
    ->  Nested Loop
+         Join Filter: (t1_1.a = t2_1.a)
          ->  Seq Scan on prt1_p1 t1_1
          ->  Sample Scan on prt1_p1 t2_1
                Sampling: system (t1_1.a) REPEATABLE (t1_1.b)
                Filter: (t1_1.a = a)
    ->  Nested Loop
+         Join Filter: (t1_2.a = t2_2.a)
          ->  Seq Scan on prt1_p2 t1_2
          ->  Sample Scan on prt1_p2 t2_2
                Sampling: system (t1_2.a) REPEATABLE (t1_2.b)
                Filter: (t1_2.a = a)
    ->  Nested Loop
+         Join Filter: (t1_3.a = t2_3.a)
          ->  Seq Scan on prt1_p3 t1_3
          ->  Sample Scan on prt1_p3 t2_3
                Sampling: system (t1_3.a) REPEATABLE (t1_3.b)
                Filter: (t1_3.a = a)
-(16 rows)
+(19 rows)

 -- bug with inadequate sort key representation
 SET enable_partitionwise_aggregate TO true;
@@ -730,6 +740,7 @@
    Sort Key: t1.a, t2.b, ((t3.a + t3.b))
    ->  Append
          ->  Nested Loop Left Join
+               Join Filter: (t1_1.a = t2_1.b)
                ->  Hash Right Join
                      Hash Cond: (t1_1.a = ((t3_1.a + t3_1.b) / 2))
                      ->  Seq Scan on prt1_p1 t1_1
@@ -739,6 +750,7 @@
                ->  Index Scan using iprt2_p1_b on prt2_p1 t2_1
                      Index Cond: (b = t1_1.a)
          ->  Nested Loop Left Join
+               Join Filter: (t1_2.a = t2_2.b)
                ->  Hash Right Join
                      Hash Cond: (t1_2.a = ((t3_2.a + t3_2.b) / 2))
                      ->  Seq Scan on prt1_p2 t1_2
@@ -748,6 +760,7 @@
                ->  Index Scan using iprt2_p2_b on prt2_p2 t2_2
                      Index Cond: (b = t1_2.a)
          ->  Nested Loop Left Join
+               Join Filter: (t1_3.a = t2_3.b)
                ->  Hash Right Join
                      Hash Cond: (t1_3.a = ((t3_3.a + t3_3.b) / 2))
                      ->  Seq Scan on prt1_p3 t1_3
@@ -756,7 +769,7 @@
                                  Filter: (c = 0)
                ->  Index Scan using iprt2_p3_b on prt2_p3 t2_3
                      Index Cond: (b = t1_3.a)
-(30 rows)
+(33 rows)

 SELECT t1.a, t1.c, t2.b, t2.c, t3.a + t3.b, t3.c FROM (prt1 t1 LEFT JOIN prt2 t2 ON t1.a = t2.b) RIGHT JOIN prt1_e t3
ON(t1.a = (t3.a + t3.b)/2) WHERE t3.c = 0 ORDER BY t1.a, t2.b, t3.a + t3.b; 
   a  |  c   |  b  |  c   | ?column? | c
@@ -981,6 +994,7 @@
                ->  HashAggregate
                      Group Key: t1_7.b
                      ->  Nested Loop
+                           Join Filter: (t1_7.b = ((t2_3.a + t2_3.b) / 2))
                            ->  Seq Scan on prt2_p3 t1_7
                                  Filter: (a = 0)
                            ->  Index Scan using iprt1_e_p3_ab2 on prt1_e_p3 t2_3
@@ -988,7 +1002,7 @@
                ->  Index Scan using iprt1_p3_a on prt1_p3 t1_4
                      Index Cond: (a = ((t2_3.a + t2_3.b) / 2))
                      Filter: (b = 0)
-(41 rows)
+(42 rows)

 SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1, prt1_e t2 WHERE t1.a = 0 AND t1.b = (t2.a + t2.b)/2)
ANDt1.b = 0 ORDER BY t1.a; 
   a  | b |  c
@@ -1007,6 +1021,7 @@
    Sort Key: t1.a
    ->  Append
          ->  Nested Loop
+               Join Filter: (t1_3.a = t1_6.b)
                ->  HashAggregate
                      Group Key: t1_6.b
                      ->  Hash Semi Join
@@ -1019,6 +1034,7 @@
                      Index Cond: (a = t1_6.b)
                      Filter: (b = 0)
          ->  Nested Loop
+               Join Filter: (t1_4.a = t1_7.b)
                ->  HashAggregate
                      Group Key: t1_7.b
                      ->  Hash Semi Join
@@ -1031,6 +1047,7 @@
                      Index Cond: (a = t1_7.b)
                      Filter: (b = 0)
          ->  Nested Loop
+               Join Filter: (t1_5.a = t1_8.b)
                ->  HashAggregate
                      Group Key: t1_8.b
                      ->  Hash Semi Join
@@ -1042,7 +1059,7 @@
                ->  Index Scan using iprt1_p3_a on prt1_p3 t1_5
                      Index Cond: (a = t1_8.b)
                      Filter: (b = 0)
-(39 rows)
+(42 rows)

 SELECT t1.* FROM prt1 t1 WHERE t1.a IN (SELECT t1.b FROM prt2 t1 WHERE t1.b IN (SELECT (t1.a + t1.b)/2 FROM prt1_e t1
WHEREt1.c = 0)) AND t1.b = 0 ORDER BY t1.a; 
   a  | b |  c
@@ -1861,6 +1878,7 @@
    Sort Key: t1.a
    ->  Append
          ->  Nested Loop Left Join
+               Join Filter: ((t1_1.a = t2_1.a) AND ((t1_1.c)::text = (t2_1.c)::text))
                ->  Seq Scan on prt1_l_p1 t1_1
                      Filter: (b = 0)
                ->  Hash Join
@@ -1870,6 +1888,7 @@
                            ->  Seq Scan on prt1_l_p1 t2_1
                                  Filter: ((t1_1.a = a) AND ((t1_1.c)::text = (c)::text))
          ->  Nested Loop Left Join
+               Join Filter: ((t1_2.a = t2_2.a) AND ((t1_2.c)::text = (t2_2.c)::text))
                ->  Seq Scan on prt1_l_p2_p1 t1_2
                      Filter: (b = 0)
                ->  Hash Join
@@ -1879,6 +1898,7 @@
                            ->  Seq Scan on prt1_l_p2_p1 t2_2
                                  Filter: ((t1_2.a = a) AND ((t1_2.c)::text = (c)::text))
          ->  Nested Loop Left Join
+               Join Filter: ((t1_3.a = t2_3.a) AND ((t1_3.c)::text = (t2_3.c)::text))
                ->  Seq Scan on prt1_l_p2_p2 t1_3
                      Filter: (b = 0)
                ->  Hash Join
@@ -1888,6 +1908,7 @@
                            ->  Seq Scan on prt1_l_p2_p2 t2_3
                                  Filter: ((t1_3.a = a) AND ((t1_3.c)::text = (c)::text))
          ->  Nested Loop Left Join
+               Join Filter: ((t1_4.a = t2_5.a) AND ((t1_4.c)::text = (t2_5.c)::text))
                ->  Seq Scan on prt1_l_p3_p1 t1_4
                      Filter: (b = 0)
                ->  Hash Join
@@ -1901,7 +1922,7 @@
                                        Filter: ((t1_4.a = a) AND ((t1_4.c)::text = (c)::text))
                                  ->  Seq Scan on prt1_l_p3_p2 t2_6
                                        Filter: ((t1_4.a = a) AND ((t1_4.c)::text = (c)::text))
-(44 rows)
+(48 rows)

 SELECT * FROM prt1_l t1 LEFT JOIN LATERAL
               (SELECT t2.a AS t2a, t2.c AS t2c, t2.b AS t2b, t3.b AS t3b, least(t1.a,t2.a,t3.b) FROM prt1_l t2 JOIN
prt2_lt3 ON (t2.a = t3.b AND t2.c = t3.c)) ss 
@@ -1927,35 +1948,40 @@
 SELECT * FROM prt1_l t1 JOIN LATERAL
               (SELECT * FROM prt1_l t2 TABLESAMPLE SYSTEM (t1.a) REPEATABLE(t1.b)) s ON
               t1.a = s.a AND t1.b = s.b AND t1.c = s.c;
-                                       QUERY PLAN
-----------------------------------------------------------------------------------------
+                                              QUERY PLAN
+------------------------------------------------------------------------------------------------------
  Append
    ->  Nested Loop
+         Join Filter: ((t1_1.a = t2_1.a) AND (t1_1.b = t2_1.b) AND ((t1_1.c)::text = (t2_1.c)::text))
          ->  Seq Scan on prt1_l_p1 t1_1
          ->  Sample Scan on prt1_l_p1 t2_1
                Sampling: system (t1_1.a) REPEATABLE (t1_1.b)
                Filter: ((t1_1.a = a) AND (t1_1.b = b) AND ((t1_1.c)::text = (c)::text))
    ->  Nested Loop
+         Join Filter: ((t1_2.a = t2_2.a) AND (t1_2.b = t2_2.b) AND ((t1_2.c)::text = (t2_2.c)::text))
          ->  Seq Scan on prt1_l_p2_p1 t1_2
          ->  Sample Scan on prt1_l_p2_p1 t2_2
                Sampling: system (t1_2.a) REPEATABLE (t1_2.b)
                Filter: ((t1_2.a = a) AND (t1_2.b = b) AND ((t1_2.c)::text = (c)::text))
    ->  Nested Loop
+         Join Filter: ((t1_3.a = t2_3.a) AND (t1_3.b = t2_3.b) AND ((t1_3.c)::text = (t2_3.c)::text))
          ->  Seq Scan on prt1_l_p2_p2 t1_3
          ->  Sample Scan on prt1_l_p2_p2 t2_3
                Sampling: system (t1_3.a) REPEATABLE (t1_3.b)
                Filter: ((t1_3.a = a) AND (t1_3.b = b) AND ((t1_3.c)::text = (c)::text))
    ->  Nested Loop
+         Join Filter: ((t1_4.a = t2_4.a) AND (t1_4.b = t2_4.b) AND ((t1_4.c)::text = (t2_4.c)::text))
          ->  Seq Scan on prt1_l_p3_p1 t1_4
          ->  Sample Scan on prt1_l_p3_p1 t2_4
                Sampling: system (t1_4.a) REPEATABLE (t1_4.b)
                Filter: ((t1_4.a = a) AND (t1_4.b = b) AND ((t1_4.c)::text = (c)::text))
    ->  Nested Loop
+         Join Filter: ((t1_5.a = t2_5.a) AND (t1_5.b = t2_5.b) AND ((t1_5.c)::text = (t2_5.c)::text))
          ->  Seq Scan on prt1_l_p3_p2 t1_5
          ->  Sample Scan on prt1_l_p3_p2 t2_5
                Sampling: system (t1_5.a) REPEATABLE (t1_5.b)
                Filter: ((t1_5.a = a) AND (t1_5.b = b) AND ((t1_5.c)::text = (c)::text))
-(26 rows)
+(31 rows)

 -- join with one side empty
 EXPLAIN (COSTS OFF)
@@ -2242,10 +2268,11 @@
 where not exists (select 1 from prtx2
                   where prtx2.a=prtx1.a and prtx2.b=prtx1.b and prtx2.c=123)
   and a<20 and c=120;
-                         QUERY PLAN
--------------------------------------------------------------
+                                 QUERY PLAN
+----------------------------------------------------------------------------
  Append
    ->  Nested Loop Anti Join
+         Join Filter: ((prtx2_1.a = prtx1_1.a) AND (prtx2_1.b = prtx1_1.b))
          ->  Seq Scan on prtx1_1
                Filter: ((a < 20) AND (c = 120))
          ->  Bitmap Heap Scan on prtx2_1
@@ -2257,6 +2284,7 @@
                      ->  Bitmap Index Scan on prtx2_1_c_idx
                            Index Cond: (c = 123)
    ->  Nested Loop Anti Join
+         Join Filter: ((prtx2_2.a = prtx1_2.a) AND (prtx2_2.b = prtx1_2.b))
          ->  Seq Scan on prtx1_2
                Filter: ((a < 20) AND (c = 120))
          ->  Bitmap Heap Scan on prtx2_2
@@ -2267,7 +2295,7 @@
                            Index Cond: (b = prtx1_2.b)
                      ->  Bitmap Index Scan on prtx2_2_c_idx
                            Index Cond: (c = 123)
-(23 rows)
+(25 rows)

 select * from prtx1
 where not exists (select 1 from prtx2
@@ -2283,10 +2311,11 @@
 where not exists (select 1 from prtx2
                   where prtx2.a=prtx1.a and (prtx2.b=prtx1.b+1 or prtx2.c=99))
   and a<20 and c=91;
-                           QUERY PLAN
------------------------------------------------------------------
+                                               QUERY PLAN
+--------------------------------------------------------------------------------------------------------
  Append
    ->  Nested Loop Anti Join
+         Join Filter: ((prtx2_1.a = prtx1_1.a) AND ((prtx2_1.b = (prtx1_1.b + 1)) OR (prtx2_1.c = 99)))
          ->  Seq Scan on prtx1_1
                Filter: ((a < 20) AND (c = 91))
          ->  Bitmap Heap Scan on prtx2_1
@@ -2298,6 +2327,7 @@
                      ->  Bitmap Index Scan on prtx2_1_c_idx
                            Index Cond: (c = 99)
    ->  Nested Loop Anti Join
+         Join Filter: ((prtx2_2.a = prtx1_2.a) AND ((prtx2_2.b = (prtx1_2.b + 1)) OR (prtx2_2.c = 99)))
          ->  Seq Scan on prtx1_2
                Filter: ((a < 20) AND (c = 91))
          ->  Bitmap Heap Scan on prtx2_2
@@ -2308,7 +2338,7 @@
                            Index Cond: (b = (prtx1_2.b + 1))
                      ->  Bitmap Index Scan on prtx2_2_c_idx
                            Index Cond: (c = 99)
-(23 rows)
+(25 rows)

 select * from prtx1
 where not exists (select 1 from prtx2
@@ -2803,7 +2833,9 @@
    Sort Key: t1.b, t2.a
    ->  Append
          ->  Nested Loop Left Join
+               Join Filter: (t1_1.b = t2_1.a)
                ->  Nested Loop
+                     Join Filter: (t1_1.b = t3_1.a)
                      ->  Seq Scan on prt2_adv_p1 t1_1
                            Filter: (a = 0)
                      ->  Index Scan using prt1_adv_p1_a_idx on prt1_adv_p1 t3_1
@@ -2830,7 +2862,7 @@
                            ->  Hash
                                  ->  Seq Scan on prt2_adv_p3 t1_3
                                        Filter: (a = 0)
-(31 rows)
+(33 rows)

 SELECT t1.b, t1.c, t2.a, t2.c, t3.a, t3.c FROM prt2_adv t1 LEFT JOIN prt1_adv t2 ON (t1.b = t2.a) INNER JOIN prt1_adv
t3ON (t1.b = t3.a) WHERE t1.a = 0 ORDER BY t1.b, t2.a, t3.a; 
   b  |  c   |  a  |  c   |  a  |  c
@@ -4957,14 +4989,16 @@
    ->  Merge Append
          Sort Key: x.id DESC
          ->  Nested Loop Left Join
+               Join Filter: (x_1.id = y_1.id)
                ->  Index Only Scan Backward using fract_t0_pkey on fract_t0 x_1
                ->  Index Only Scan using fract_t0_pkey on fract_t0 y_1
                      Index Cond: (id = x_1.id)
          ->  Nested Loop Left Join
+               Join Filter: (x_2.id = y_2.id)
                ->  Index Only Scan Backward using fract_t1_pkey on fract_t1 x_2
                ->  Index Only Scan using fract_t1_pkey on fract_t1 y_2
                      Index Cond: (id = x_2.id)
-(11 rows)
+(13 rows)

 -- cleanup
 DROP TABLE fract_t;
diff -U3 /home/postgres/REL_15/pgsql/src/test/regress/expected/memoize.out
/home/postgres/REL_15/pgsql/src/test/regress/results/memoize.out
--- /home/postgres/REL_15/pgsql/src/test/regress/expected/memoize.out    2023-01-24 11:53:26.646912951 -0500
+++ /home/postgres/REL_15/pgsql/src/test/regress/results/memoize.out    2023-08-21 15:12:40.813495545 -0400
@@ -215,6 +215,7 @@
 ------------------------------------------------------------------------------------------
  Append (actual rows=32 loops=N)
    ->  Nested Loop (actual rows=16 loops=N)
+         Join Filter: (t1_1.a = t2_1.a)
          ->  Index Only Scan using iprt_p1_a on prt_p1 t1_1 (actual rows=4 loops=N)
                Heap Fetches: N
          ->  Memoize (actual rows=4 loops=N)
@@ -225,6 +226,7 @@
                      Index Cond: (a = t1_1.a)
                      Heap Fetches: N
    ->  Nested Loop (actual rows=16 loops=N)
+         Join Filter: (t1_2.a = t2_2.a)
          ->  Index Only Scan using iprt_p2_a on prt_p2 t1_2 (actual rows=4 loops=N)
                Heap Fetches: N
          ->  Memoize (actual rows=4 loops=N)
@@ -234,7 +236,7 @@
                ->  Index Only Scan using iprt_p2_a on prt_p2 t2_2 (actual rows=4 loops=N)
                      Index Cond: (a = t1_2.a)
                      Heap Fetches: N
-(21 rows)
+(23 rows)

 DROP TABLE prt;
 RESET enable_partitionwise_join;

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Tue, Aug 22, 2023 at 4:48 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I spent some time reviewing the v4 patch.  I noted that
path_is_reparameterizable_by_child still wasn't modeling the pass/fail
behavior of reparameterize_path_by_child very well, because that
function checks this at every recursion level:

    /*
     * If the path is not parameterized by the parent of the given relation,
     * it doesn't need reparameterization.
     */
    if (!path->param_info ||
        !bms_overlap(PATH_REQ_OUTER(path), child_rel->top_parent_relids))
        return path;

So we might have a sub-path (eg a join input rel) that is not one of
the supported kinds, and yet we can succeed because parameterization
appears only in other sub-paths.  The patch as written will not crash
in such a case, but it might refuse to use a path that we previously
would have allowed.  So I think we need to put the same test into
path_is_reparameterizable_by_child, which requires adding child_rel
to its param list but is otherwise simple enough.

sigh.. I overlooked this check.  You're right.  We have to have this
same test in path_is_reparameterizable_by_child.
 
So that led me to the attached v5, which seemed committable to me so I
set about back-patching it ... and it fell over immediately in v15, as
shown in the attached regression diffs from v15.  It looks to me like
we are now failing to recognize that reparameterized quals are
redundant with not-reparameterized ones, so this patch is somehow
dependent on restructuring that happened during the v16 cycle.
I don't have time to dig deeper than that, and I'm not sure that that
is an area we'd want to mess with in a back-patched bug fix.

I looked at this and I think the culprit is that in create_nestloop_path
we are failing to recognize those restrict_clauses that have been moved
into the inner path.  In v16, we have the new serial number stuff to
help detect such clauses and that works very well.  In v15, we are still
using join_clause_is_movable_into() for that purpose.  It does not work
well with the patch because now in create_nestloop_path the inner path
is still parameterized by top-level parents, while the restrict_clauses
already have been adjusted to refer to the child rels.  So the check
performed by join_clause_is_movable_into() would always fail.

I'm wondering if we can instead adjust the 'inner_req_outer' in
create_nestloop_path before we perform the check to work around this
issue for the back branches, something like

@@ -2418,6 +2419,15 @@ create_nestloop_path(PlannerInfo *root,
    NestPath   *pathnode = makeNode(NestPath);
    Relids      inner_req_outer = PATH_REQ_OUTER(inner_path);

+   /*
+    * Adjust the parameterization information, which refers to the topmost
+    * parent.
+    */
+   inner_req_outer =
+       adjust_child_relids_multilevel(root, inner_req_outer,
+                                      outer_path->parent->relids,
+                                      outer_path->parent->top_parent_relids);
+

And with it we do not need the changes as in the patch for
create_nestloop_path any more.

Thanks
Richard

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> On Tue, Aug 22, 2023 at 4:48 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> So that led me to the attached v5, which seemed committable to me so I
>> set about back-patching it ... and it fell over immediately in v15, as
>> shown in the attached regression diffs from v15.  It looks to me like
>> we are now failing to recognize that reparameterized quals are
>> redundant with not-reparameterized ones, so this patch is somehow
>> dependent on restructuring that happened during the v16 cycle.
>> I don't have time to dig deeper than that, and I'm not sure that that
>> is an area we'd want to mess with in a back-patched bug fix.

> I looked at this and I think the culprit is that in create_nestloop_path
> we are failing to recognize those restrict_clauses that have been moved
> into the inner path.  In v16, we have the new serial number stuff to
> help detect such clauses and that works very well.  In v15, we are still
> using join_clause_is_movable_into() for that purpose.  It does not work
> well with the patch because now in create_nestloop_path the inner path
> is still parameterized by top-level parents, while the restrict_clauses
> already have been adjusted to refer to the child rels.  So the check
> performed by join_clause_is_movable_into() would always fail.

Ah.

> I'm wondering if we can instead adjust the 'inner_req_outer' in
> create_nestloop_path before we perform the check to work around this
> issue for the back branches, something like
> +   /*
> +    * Adjust the parameterization information, which refers to the topmost
> +    * parent.
> +    */
> +   inner_req_outer =
> +       adjust_child_relids_multilevel(root, inner_req_outer,
> +                                      outer_path->parent->relids,
> +                                      outer_path->parent->top_parent_relids);

Mmm ... at the very least you'd need to not do that when top_parent_relids
is unset.  But I think the risk/reward ratio for messing with this in the
back branches is unattractive in any case: to fix a corner case that
apparently nobody uses in the field, we risk breaking any number of
mainstream parameterized-path cases.  I'm content to commit the v5 patch
(or a successor) into HEAD, but at this point I'm not sure I even want
to risk it in v16, let alone perform delicate surgery to get it to work
in older branches.  I think we ought to go with the "tablesample scans
can't be reparameterized" approach in v16 and before.

            regards, tom lane



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Tom Lane
Дата:
I wrote:
> ... I think the risk/reward ratio for messing with this in the
> back branches is unattractive in any case: to fix a corner case that
> apparently nobody uses in the field, we risk breaking any number of
> mainstream parameterized-path cases.  I'm content to commit the v5 patch
> (or a successor) into HEAD, but at this point I'm not sure I even want
> to risk it in v16, let alone perform delicate surgery to get it to work
> in older branches.  I think we ought to go with the "tablesample scans
> can't be reparameterized" approach in v16 and before.

Concretely, about like this for v16, and similarly in older branches.

            regards, tom lane

diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 5f5596841c..c4ec6ed5e6 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -4087,6 +4087,28 @@ do { \
     switch (nodeTag(path))
     {
         case T_Path:
+
+            /*
+             * If it's a SampleScan with tablesample parameters referencing
+             * the other relation, we can't reparameterize, because we must
+             * not change the RTE's contents here.  (Doing so would break
+             * things if we end up using a non-partitionwise join.)
+             */
+            if (path->pathtype == T_SampleScan)
+            {
+                Index        scan_relid = path->parent->relid;
+                RangeTblEntry *rte;
+
+                /* it should be a base rel with a tablesample clause... */
+                Assert(scan_relid > 0);
+                rte = planner_rt_fetch(scan_relid, root);
+                Assert(rte->rtekind == RTE_RELATION);
+                Assert(rte->tablesample != NULL);
+
+                if (bms_overlap(pull_varnos(root, (Node *) rte->tablesample),
+                                child_rel->top_parent_relids))
+                    return NULL;
+            }
             FLAT_COPY_PATH(new_path, path, Path);
             break;

diff --git a/src/test/regress/expected/partition_join.out b/src/test/regress/expected/partition_join.out
index 6560fe2416..a69a8a70f3 100644
--- a/src/test/regress/expected/partition_join.out
+++ b/src/test/regress/expected/partition_join.out
@@ -505,6 +505,32 @@ SELECT t1.a, ss.t2a, ss.t2c FROM prt1 t1 LEFT JOIN LATERAL
  550 |     |
 (12 rows)

+-- lateral reference in sample scan
+SET max_parallel_workers_per_gather = 0;
+EXPLAIN (COSTS OFF)
+SELECT * FROM prt1 t1 JOIN LATERAL
+              (SELECT * FROM prt1 t2 TABLESAMPLE SYSTEM (t1.a) REPEATABLE(t1.b)) s
+              ON t1.a = s.a;
+                       QUERY PLAN
+---------------------------------------------------------
+ Nested Loop
+   ->  Append
+         ->  Seq Scan on prt1_p1 t1_1
+         ->  Seq Scan on prt1_p2 t1_2
+         ->  Seq Scan on prt1_p3 t1_3
+   ->  Append
+         ->  Sample Scan on prt1_p1 t2_1
+               Sampling: system (t1.a) REPEATABLE (t1.b)
+               Filter: (t1.a = a)
+         ->  Sample Scan on prt1_p2 t2_2
+               Sampling: system (t1.a) REPEATABLE (t1.b)
+               Filter: (t1.a = a)
+         ->  Sample Scan on prt1_p3 t2_3
+               Sampling: system (t1.a) REPEATABLE (t1.b)
+               Filter: (t1.a = a)
+(15 rows)
+
+RESET max_parallel_workers_per_gather;
 -- bug with inadequate sort key representation
 SET enable_partitionwise_aggregate TO true;
 SET enable_hashjoin TO false;
@@ -1944,6 +1970,40 @@ SELECT * FROM prt1_l t1 LEFT JOIN LATERAL
  550 | 0 | 0002 |     |      |     |     |
 (12 rows)

+-- partitionwise join with lateral reference in sample scan
+SET max_parallel_workers_per_gather = 0;
+EXPLAIN (COSTS OFF)
+SELECT * FROM prt1_l t1 JOIN LATERAL
+              (SELECT * FROM prt1_l t2 TABLESAMPLE SYSTEM (t1.a) REPEATABLE(t1.b)) s ON
+              t1.a = s.a AND t1.b = s.b AND t1.c = s.c;
+                                    QUERY PLAN
+----------------------------------------------------------------------------------
+ Nested Loop
+   ->  Append
+         ->  Seq Scan on prt1_l_p1 t1_1
+         ->  Seq Scan on prt1_l_p2_p1 t1_2
+         ->  Seq Scan on prt1_l_p2_p2 t1_3
+         ->  Seq Scan on prt1_l_p3_p1 t1_4
+         ->  Seq Scan on prt1_l_p3_p2 t1_5
+   ->  Append
+         ->  Sample Scan on prt1_l_p1 t2_1
+               Sampling: system (t1.a) REPEATABLE (t1.b)
+               Filter: ((t1.a = a) AND (t1.b = b) AND ((t1.c)::text = (c)::text))
+         ->  Sample Scan on prt1_l_p2_p1 t2_2
+               Sampling: system (t1.a) REPEATABLE (t1.b)
+               Filter: ((t1.a = a) AND (t1.b = b) AND ((t1.c)::text = (c)::text))
+         ->  Sample Scan on prt1_l_p2_p2 t2_3
+               Sampling: system (t1.a) REPEATABLE (t1.b)
+               Filter: ((t1.a = a) AND (t1.b = b) AND ((t1.c)::text = (c)::text))
+         ->  Sample Scan on prt1_l_p3_p1 t2_4
+               Sampling: system (t1.a) REPEATABLE (t1.b)
+               Filter: ((t1.a = a) AND (t1.b = b) AND ((t1.c)::text = (c)::text))
+         ->  Sample Scan on prt1_l_p3_p2 t2_5
+               Sampling: system (t1.a) REPEATABLE (t1.b)
+               Filter: ((t1.a = a) AND (t1.b = b) AND ((t1.c)::text = (c)::text))
+(23 rows)
+
+RESET max_parallel_workers_per_gather;
 -- join with one side empty
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM (SELECT * FROM prt1_l WHERE a = 1 AND a = 2) t1 RIGHT JOIN prt2_l t2 ON t1.a = t2.b
ANDt1.b = t2.a AND t1.c = t2.c; 
diff --git a/src/test/regress/sql/partition_join.sql b/src/test/regress/sql/partition_join.sql
index 48daf3aee3..d28248b42d 100644
--- a/src/test/regress/sql/partition_join.sql
+++ b/src/test/regress/sql/partition_join.sql
@@ -100,6 +100,14 @@ SELECT t1.a, ss.t2a, ss.t2c FROM prt1 t1 LEFT JOIN LATERAL
               (SELECT t2.a AS t2a, t3.a AS t3a, t2.b t2b, t2.c t2c, least(t1.a,t2.a,t3.a) FROM prt1 t2 JOIN prt2 t3 ON
(t2.a= t3.b)) ss 
               ON t1.c = ss.t2c WHERE (t1.b + coalesce(ss.t2b, 0)) = 0 ORDER BY t1.a;

+-- lateral reference in sample scan
+SET max_parallel_workers_per_gather = 0;
+EXPLAIN (COSTS OFF)
+SELECT * FROM prt1 t1 JOIN LATERAL
+              (SELECT * FROM prt1 t2 TABLESAMPLE SYSTEM (t1.a) REPEATABLE(t1.b)) s
+              ON t1.a = s.a;
+RESET max_parallel_workers_per_gather;
+
 -- bug with inadequate sort key representation
 SET enable_partitionwise_aggregate TO true;
 SET enable_hashjoin TO false;
@@ -387,6 +395,14 @@ SELECT * FROM prt1_l t1 LEFT JOIN LATERAL
               (SELECT t2.a AS t2a, t2.c AS t2c, t2.b AS t2b, t3.b AS t3b, least(t1.a,t2.a,t3.b) FROM prt1_l t2 JOIN
prt2_lt3 ON (t2.a = t3.b AND t2.c = t3.c)) ss 
               ON t1.a = ss.t2a AND t1.c = ss.t2c WHERE t1.b = 0 ORDER BY t1.a;

+-- partitionwise join with lateral reference in sample scan
+SET max_parallel_workers_per_gather = 0;
+EXPLAIN (COSTS OFF)
+SELECT * FROM prt1_l t1 JOIN LATERAL
+              (SELECT * FROM prt1_l t2 TABLESAMPLE SYSTEM (t1.a) REPEATABLE(t1.b)) s ON
+              t1.a = s.a AND t1.b = s.b AND t1.c = s.c;
+RESET max_parallel_workers_per_gather;
+
 -- join with one side empty
 EXPLAIN (COSTS OFF)
 SELECT t1.a, t1.c, t2.b, t2.c FROM (SELECT * FROM prt1_l WHERE a = 1 AND a = 2) t1 RIGHT JOIN prt2_l t2 ON t1.a = t2.b
ANDt1.b = t2.a AND t1.c = t2.c; 

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Ashutosh Bapat
Дата:
Hi Tom,

On Tue, Aug 22, 2023 at 2:18 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

>
> (BTW, I did look at Ashutosh's idea of merging the
> reparameterize_path_by_child and path_is_reparameterizable_by_child
> functions, but I didn't think that would be an improvement,
> because we'd have to clutter reparameterize_path_by_child with
> a lot of should-I-skip-this-step tests.  Some of that could be
> hidden in the macros, but a lot would not be.  Another issue
> is that I do not think we can change reparameterize_path_by_child's
> API contract in the back branches, because we advertise it as
> something that FDWs and custom scan providers can use.)

Here are two patches
0001 same as your v5 patch
0002 implements what I had in mind - combining the assessment and
reparameterization in a single function.

I don't know whether you had something similar to 0002 when you
considered that approach. Hence implemented it quickly. The names of
the functions and arguments can be changed. But I think this version
will help us keep assessment and actual reparameterization in sync,
very tightly. Most of the conditionals have been pushed into macros,
except for two which need to be outside the macros and actually look
better to me. Let me know what you think of this.

FWIW I ran make check and all the tests pass.

I agree that we can not consider this for backbranches but we are
anyway thinking of disabling reparameterization for tablesample scans
anyway.

--
Best Wishes,
Ashutosh Bapat

Вложения

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Tue, Aug 22, 2023 at 10:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> I'm wondering if we can instead adjust the 'inner_req_outer' in
> create_nestloop_path before we perform the check to work around this
> issue for the back branches, something like
> +   /*
> +    * Adjust the parameterization information, which refers to the topmost
> +    * parent.
> +    */
> +   inner_req_outer =
> +       adjust_child_relids_multilevel(root, inner_req_outer,
> +                                      outer_path->parent->relids,
> +                                      outer_path->parent->top_parent_relids);

Mmm ... at the very least you'd need to not do that when top_parent_relids
is unset.

Hmm, adjust_child_relids_multilevel would just return the given relids
unchanged when top_parent_relids is unset, so I think it should be safe
to call it even for non-other rel.
 
...  But I think the risk/reward ratio for messing with this in the
back branches is unattractive in any case: to fix a corner case that
apparently nobody uses in the field, we risk breaking any number of
mainstream parameterized-path cases.  I'm content to commit the v5 patch
(or a successor) into HEAD, but at this point I'm not sure I even want
to risk it in v16, let alone perform delicate surgery to get it to work
in older branches.  I think we ought to go with the "tablesample scans
can't be reparameterized" approach in v16 and before.

If we go with the "tablesample scans can't be reparameterized" approach
in the back branches, I'm a little concerned that what if we find more
cases in the futrue where we need modify RTEs for reparameterization.
So I spent some time seeking and have managed to find one: there might
be lateral references in a scan path's restriction clauses, and
currently reparameterize_path_by_child fails to adjust them.

regression=# explain (verbose, costs off)
select count(*) from prt1 t1 left join lateral
    (select t1.b as t1b, t2.* from prt2 t2) s
    on t1.a = s.b where s.t1b = s.a;
                              QUERY PLAN
----------------------------------------------------------------------
 Aggregate
   Output: count(*)
   ->  Append
         ->  Nested Loop
               ->  Seq Scan on public.prt1_p1 t1_1
                     Output: t1_1.a, t1_1.b
               ->  Index Scan using iprt2_p1_b on public.prt2_p1 t2_1
                     Output: t2_1.b
                     Index Cond: (t2_1.b = t1_1.a)
                     Filter: (t1.b = t2_1.a)
         ->  Nested Loop
               ->  Seq Scan on public.prt1_p2 t1_2
                     Output: t1_2.a, t1_2.b
               ->  Index Scan using iprt2_p2_b on public.prt2_p2 t2_2
                     Output: t2_2.b
                     Index Cond: (t2_2.b = t1_2.a)
                     Filter: (t1.b = t2_2.a)
         ->  Nested Loop
               ->  Seq Scan on public.prt1_p3 t1_3
                     Output: t1_3.a, t1_3.b
               ->  Index Scan using iprt2_p3_b on public.prt2_p3 t2_3
                     Output: t2_3.b
                     Index Cond: (t2_3.b = t1_3.a)
                     Filter: (t1.b = t2_3.a)
(24 rows)

The clauses in 'Filter' are not adjusted correctly.  Var 't1.b' still
refers to the top parent.

For this query it would just give wrong results.

regression=# set enable_partitionwise_join to off;
SET
regression=# select count(*) from prt1 t1 left join lateral
    (select t1.b as t1b, t2.* from prt2 t2) s
    on t1.a = s.b where s.t1b = s.a;
 count
-------
   100
(1 row)

regression=# set enable_partitionwise_join to on;
SET
regression=# select count(*) from prt1 t1 left join lateral
    (select t1.b as t1b, t2.* from prt2 t2) s
    on t1.a = s.b where s.t1b = s.a;
 count
-------
     5
(1 row)


For another query it would error out during planning.

regression=# explain (verbose, costs off)
select count(*) from prt1 t1 left join lateral
    (select t1.b as t1b, t2.* from prt2 t2) s
    on t1.a = s.b where s.t1b = s.b;
ERROR:  variable not found in subplan target list

To fix it we may need to modify RelOptInfos for Path, BitmapHeapPath,
ForeignPath and CustomPath, and modify IndexOptInfos for IndexPath.  It
seems that that is not easily done without postponing reparameterization
of paths until createplan.c.

Attached is a patch which is v5 + fix for this new issue.

Thanks
Richard
Вложения

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Ashutosh Bapat
Дата:
On Wed, Aug 23, 2023 at 11:07 AM Richard Guo <guofenglinux@gmail.com> wrote:
>
>
> On Tue, Aug 22, 2023 at 10:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Richard Guo <guofenglinux@gmail.com> writes:
>> > I'm wondering if we can instead adjust the 'inner_req_outer' in
>> > create_nestloop_path before we perform the check to work around this
>> > issue for the back branches, something like
>> > +   /*
>> > +    * Adjust the parameterization information, which refers to the topmost
>> > +    * parent.
>> > +    */
>> > +   inner_req_outer =
>> > +       adjust_child_relids_multilevel(root, inner_req_outer,
>> > +                                      outer_path->parent->relids,
>> > +                                      outer_path->parent->top_parent_relids);
>>
>> Mmm ... at the very least you'd need to not do that when top_parent_relids
>> is unset.
>
>
> Hmm, adjust_child_relids_multilevel would just return the given relids
> unchanged when top_parent_relids is unset, so I think it should be safe
> to call it even for non-other rel.
>
>>
>> ...  But I think the risk/reward ratio for messing with this in the
>> back branches is unattractive in any case: to fix a corner case that
>> apparently nobody uses in the field, we risk breaking any number of
>> mainstream parameterized-path cases.  I'm content to commit the v5 patch
>> (or a successor) into HEAD, but at this point I'm not sure I even want
>> to risk it in v16, let alone perform delicate surgery to get it to work
>> in older branches.  I think we ought to go with the "tablesample scans
>> can't be reparameterized" approach in v16 and before.
>
>
> If we go with the "tablesample scans can't be reparameterized" approach
> in the back branches, I'm a little concerned that what if we find more
> cases in the futrue where we need modify RTEs for reparameterization.
> So I spent some time seeking and have managed to find one: there might
> be lateral references in a scan path's restriction clauses, and
> currently reparameterize_path_by_child fails to adjust them.
>
> regression=# explain (verbose, costs off)
> select count(*) from prt1 t1 left join lateral
>     (select t1.b as t1b, t2.* from prt2 t2) s
>     on t1.a = s.b where s.t1b = s.a;
>                               QUERY PLAN
> ----------------------------------------------------------------------
>  Aggregate
>    Output: count(*)
>    ->  Append
>          ->  Nested Loop
>                ->  Seq Scan on public.prt1_p1 t1_1
>                      Output: t1_1.a, t1_1.b
>                ->  Index Scan using iprt2_p1_b on public.prt2_p1 t2_1
>                      Output: t2_1.b
>                      Index Cond: (t2_1.b = t1_1.a)
>                      Filter: (t1.b = t2_1.a)
>          ->  Nested Loop
>                ->  Seq Scan on public.prt1_p2 t1_2
>                      Output: t1_2.a, t1_2.b
>                ->  Index Scan using iprt2_p2_b on public.prt2_p2 t2_2
>                      Output: t2_2.b
>                      Index Cond: (t2_2.b = t1_2.a)
>                      Filter: (t1.b = t2_2.a)
>          ->  Nested Loop
>                ->  Seq Scan on public.prt1_p3 t1_3
>                      Output: t1_3.a, t1_3.b
>                ->  Index Scan using iprt2_p3_b on public.prt2_p3 t2_3
>                      Output: t2_3.b
>                      Index Cond: (t2_3.b = t1_3.a)
>                      Filter: (t1.b = t2_3.a)
> (24 rows)
>
> The clauses in 'Filter' are not adjusted correctly.  Var 't1.b' still
> refers to the top parent.
>
> For this query it would just give wrong results.
>
> regression=# set enable_partitionwise_join to off;
> SET
> regression=# select count(*) from prt1 t1 left join lateral
>     (select t1.b as t1b, t2.* from prt2 t2) s
>     on t1.a = s.b where s.t1b = s.a;
>  count
> -------
>    100
> (1 row)
>
> regression=# set enable_partitionwise_join to on;
> SET
> regression=# select count(*) from prt1 t1 left join lateral
>     (select t1.b as t1b, t2.* from prt2 t2) s
>     on t1.a = s.b where s.t1b = s.a;
>  count
> -------
>      5
> (1 row)
>
>
> For another query it would error out during planning.
>
> regression=# explain (verbose, costs off)
> select count(*) from prt1 t1 left join lateral
>     (select t1.b as t1b, t2.* from prt2 t2) s
>     on t1.a = s.b where s.t1b = s.b;
> ERROR:  variable not found in subplan target list
>
> To fix it we may need to modify RelOptInfos for Path, BitmapHeapPath,
> ForeignPath and CustomPath, and modify IndexOptInfos for IndexPath.  It
> seems that that is not easily done without postponing reparameterization
> of paths until createplan.c.

Maybe I am missing something here but why aren't we copying these
restrictinfos in the path somewhere?  I have a similar question for
tablesample clause as well.

--
Best Wishes,
Ashutosh Bapat



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> If we go with the "tablesample scans can't be reparameterized" approach
> in the back branches, I'm a little concerned that what if we find more
> cases in the futrue where we need modify RTEs for reparameterization.
> So I spent some time seeking and have managed to find one: there might
> be lateral references in a scan path's restriction clauses, and
> currently reparameterize_path_by_child fails to adjust them.

Hmm, this seems completely wrong to me.  By definition, such clauses
ought to be join clauses not restriction clauses, so how are we getting
into this state?  IOW, I agree this is clearly buggy but I think the
bug is someplace else.

            regards, tom lane



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Thu, Aug 24, 2023 at 1:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> If we go with the "tablesample scans can't be reparameterized" approach
> in the back branches, I'm a little concerned that what if we find more
> cases in the futrue where we need modify RTEs for reparameterization.
> So I spent some time seeking and have managed to find one: there might
> be lateral references in a scan path's restriction clauses, and
> currently reparameterize_path_by_child fails to adjust them.

Hmm, this seems completely wrong to me.  By definition, such clauses
ought to be join clauses not restriction clauses, so how are we getting
into this state?  IOW, I agree this is clearly buggy but I think the
bug is someplace else.

If the clause contains PHVs that syntactically belong to a rel and
meanwhile have lateral references to other rels, then it may become a
restriction clause with lateral references.  Take the query shown
upthread as an example,

select count(*) from prt1 t1 left join lateral
    (select t1.b as t1b, t2.* from prt2 t2) s
    on t1.a = s.b where s.t1b = s.a;

The clause 's.t1b = s.a' would become 'PHV(t1.b) = t2.a' after we have
pulled up the subquery.  The PHV in it syntactically belongs to 't2' and
laterally refers to 't1'.  So this clause is actually a restriction
clause for rel 't2', and will be put into the baserestrictinfo of t2
rel.  But it also has lateral reference to rel 't1', which we need to
adjust in reparameterize_path_by_child for partitionwise join.

Thanks
Richard

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Ashutosh Bapat
Дата:
On Thu, Aug 24, 2023 at 8:17 AM Richard Guo <guofenglinux@gmail.com> wrote:
>
>
> On Thu, Aug 24, 2023 at 1:44 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>
>> Richard Guo <guofenglinux@gmail.com> writes:
>> > If we go with the "tablesample scans can't be reparameterized" approach
>> > in the back branches, I'm a little concerned that what if we find more
>> > cases in the futrue where we need modify RTEs for reparameterization.
>> > So I spent some time seeking and have managed to find one: there might
>> > be lateral references in a scan path's restriction clauses, and
>> > currently reparameterize_path_by_child fails to adjust them.
>>
>> Hmm, this seems completely wrong to me.  By definition, such clauses
>> ought to be join clauses not restriction clauses, so how are we getting
>> into this state?  IOW, I agree this is clearly buggy but I think the
>> bug is someplace else.
>
>
> If the clause contains PHVs that syntactically belong to a rel and
> meanwhile have lateral references to other rels, then it may become a
> restriction clause with lateral references.  Take the query shown
> upthread as an example,
>
> select count(*) from prt1 t1 left join lateral
>     (select t1.b as t1b, t2.* from prt2 t2) s
>     on t1.a = s.b where s.t1b = s.a;
>
> The clause 's.t1b = s.a' would become 'PHV(t1.b) = t2.a' after we have
> pulled up the subquery.  The PHV in it syntactically belongs to 't2' and
> laterally refers to 't1'.  So this clause is actually a restriction
> clause for rel 't2', and will be put into the baserestrictinfo of t2
> rel.  But it also has lateral reference to rel 't1', which we need to
> adjust in reparameterize_path_by_child for partitionwise join.

When the clause s.t1b = s.a is presented to distribute_qual_to_rels()
it has form PHV(t1.b) = t2.b. The PHV's ph_eval_at is 4, which is what
is returned as varno to pull_varnos(). The other Var in the caluse has
varno = 4 already so  pull_varnos() returns a SINGLETON relids (b 4).
The clause is an equality clause, so it is used to create an
Equivalence class.
generate_base_implied_equalities_no_const() then constructs the same
RestrictInfo again and adds to baserestrictinfo of Rel with relid = 4
i.e. t2's baserestrictinfo. I don't know whether that's the right
thing to do. After the subquery has been pulled up, t1 and t2 can be
joined at the same level and thus the clause makes more sense as  a
joininfo in both t1 as well as t2. So I tend to agree with Tom. That's
how it will move up the join tree and be evaluated at appropriate
level. But then why the query returns the right results is a mystery.

I am not sure where we are taking the original bug fix with this
investigation. Is it required to fix this problem in order to fix the
original problem OR we should commit the fix for the original problem
and then investigate this further?

--
Best Wishes,
Ashutosh Bapat



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Fri, Sep 8, 2023 at 3:04 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
When the clause s.t1b = s.a is presented to distribute_qual_to_rels()
it has form PHV(t1.b) = t2.b. The PHV's ph_eval_at is 4, which is what
is returned as varno to pull_varnos(). The other Var in the caluse has
varno = 4 already so  pull_varnos() returns a SINGLETON relids (b 4).
The clause is an equality clause, so it is used to create an
Equivalence class.
generate_base_implied_equalities_no_const() then constructs the same
RestrictInfo again and adds to baserestrictinfo of Rel with relid = 4
i.e. t2's baserestrictinfo. I don't know whether that's the right
thing to do. 

Well, I think that's what PHVs are supposed to do.  Quoting the README:

... Note that even with this restriction, pullup of a LATERAL
subquery can result in creating PlaceHolderVars that contain lateral
references to relations outside their syntactic scope.  We still evaluate
such PHVs at their syntactic location or lower, but the presence of such a
PHV in the quals or targetlist of a plan node requires that node to appear
on the inside of a nestloop join relative to the rel(s) supplying the
lateral reference.
 
I am not sure where we are taking the original bug fix with this
investigation. Is it required to fix this problem in order to fix the
original problem OR we should commit the fix for the original problem
and then investigate this further?

Fair point.  This seems a separate problem from the original, so I'm
okay we fix them separately.

Thanks
Richard

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Andrey Lepikhov
Дата:
On 23/8/2023 12:37, Richard Guo wrote:
> If we go with the "tablesample scans can't be reparameterized" approach
> in the back branches, I'm a little concerned that what if we find more
> cases in the futrue where we need modify RTEs for reparameterization.
> So I spent some time seeking and have managed to find one: there might
> be lateral references in a scan path's restriction clauses, and
> currently reparameterize_path_by_child fails to adjust them.
It may help you somehow: in [1], we designed a feature where the 
partitionwise join technique can be applied to a JOIN of partitioned and 
non-partitioned tables. Unfortunately, it is out of community 
discussions, but we still support it for sharding usage - it is helpful 
for the implementation of 'global' tables in a distributed 
configuration. And there we were stuck into the same problem with 
lateral relids adjustment. So you can build a more general view of the 
problem with this patch.

[1] Asymmetric partition-wise JOIN
https://www.postgresql.org/message-id/flat/CAOP8fzaVL_2SCJayLL9kj5pCA46PJOXXjuei6-3aFUV45j4LJQ%40mail.gmail.com

-- 
regards,
Andrey Lepikhov
Postgres Professional




Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Andrei Lepikhov
Дата:
On 23/8/2023 12:37, Richard Guo wrote:
> To fix it we may need to modify RelOptInfos for Path, BitmapHeapPath,
> ForeignPath and CustomPath, and modify IndexOptInfos for IndexPath.  It
> seems that that is not easily done without postponing reparameterization
> of paths until createplan.c.
> 
> Attached is a patch which is v5 + fix for this new issue.

Having looked into the patch for a while, I couldn't answer to myself 
for some stupid questions:
1. If we postpone parameterization until the plan creation, why do we 
still copy the path node in the FLAT_COPY_PATH macros? Do we really need it?
2. I see big switches on path nodes. May it be time to create a 
path_walker function? I recall some thread where such a suggestion was 
declined, but I don't remember why.
3. Clause changing is postponed. Why does it not influence the 
calc_joinrel_size_estimate? We will use statistics on the parent table 
here. Am I wrong?

-- 
regards,
Andrey Lepikhov
Postgres Professional




Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Fri, Oct 13, 2023 at 6:18 PM Andrei Lepikhov <a.lepikhov@postgrespro.ru> wrote:
On 23/8/2023 12:37, Richard Guo wrote:
> To fix it we may need to modify RelOptInfos for Path, BitmapHeapPath,
> ForeignPath and CustomPath, and modify IndexOptInfos for IndexPath.  It
> seems that that is not easily done without postponing reparameterization
> of paths until createplan.c.
>
> Attached is a patch which is v5 + fix for this new issue.

Having looked into the patch for a while, I couldn't answer to myself
for some stupid questions:

Thanks for reviewing this patch!  I think these are great questions.
 
1. If we postpone parameterization until the plan creation, why do we
still copy the path node in the FLAT_COPY_PATH macros? Do we really need it?

Good point.  The NestPath's origin inner path should not be referenced
any more after the reparameterization, so it seems safe to adjust the
path itself, without the need of a flat-copy.  I've done that in v8
patch.
 
2. I see big switches on path nodes. May it be time to create a
path_walker function? I recall some thread where such a suggestion was
declined, but I don't remember why.

I'm not sure.  But this seems a separate topic, so maybe it's better to
discuss it separately?
 
3. Clause changing is postponed. Why does it not influence the
calc_joinrel_size_estimate? We will use statistics on the parent table
here. Am I wrong?

Hmm, I think the reparameterization does not change the estimated
size/costs.  Quoting the comment

* The cost, number of rows, width and parallel path properties depend upon
* path->parent, which does not change during the translation.


Hi Tom, I'm wondering if you've had a chance to follow this issue.  What
do you think about the proposed patch?

Thanks
Richard
Вложения

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Andrei Lepikhov
Дата:
On 18/10/2023 13:39, Richard Guo wrote:
> 
> On Fri, Oct 13, 2023 at 6:18 PM Andrei Lepikhov 
> <a.lepikhov@postgrespro.ru <mailto:a.lepikhov@postgrespro.ru>> wrote:
> 
>     On 23/8/2023 12:37, Richard Guo wrote:
>      > To fix it we may need to modify RelOptInfos for Path, BitmapHeapPath,
>      > ForeignPath and CustomPath, and modify IndexOptInfos for
>     IndexPath.  It
>      > seems that that is not easily done without postponing
>     reparameterization
>      > of paths until createplan.c.
>      >
>      > Attached is a patch which is v5 + fix for this new issue.
> 
>     Having looked into the patch for a while, I couldn't answer to myself
>     for some stupid questions:
> 
> 
> Thanks for reviewing this patch!  I think these are great questions.
> 
>     1. If we postpone parameterization until the plan creation, why do we
>     still copy the path node in the FLAT_COPY_PATH macros? Do we really
>     need it?
> 
> 
> Good point.  The NestPath's origin inner path should not be referenced
> any more after the reparameterization, so it seems safe to adjust the
> path itself, without the need of a flat-copy.  I've done that in v8
> patch.
> 
>     2. I see big switches on path nodes. May it be time to create a
>     path_walker function? I recall some thread where such a suggestion was
>     declined, but I don't remember why.
> 
> 
> I'm not sure.  But this seems a separate topic, so maybe it's better to
> discuss it separately?

Agree.

>     3. Clause changing is postponed. Why does it not influence the
>     calc_joinrel_size_estimate? We will use statistics on the parent table
>     here. Am I wrong?
> 
> 
> Hmm, I think the reparameterization does not change the estimated
> size/costs.  Quoting the comment

Agree. I have looked at the code and figured it out - you're right. But 
it seems strange: maybe I don't understand something. Why not estimate 
selectivity for parameterized clauses based on leaf partition statistic, 
not the parent one?

-- 
regards,
Andrey Lepikhov
Postgres Professional




Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Alena Rybakina
Дата:

Hi!

Thank you for your work on the subject.


During review your patch I didn't understand why are you checking that the variable is path and not new_path of type T_SamplePath (I highlighted it)?
Path *
reparameterize_path_by_child(PlannerInfo *root, Path *path,
                             RelOptInfo *child_rel)

...
switch (nodeTag(path))
    {
        case T_Path:
            new_path = path;
            ADJUST_CHILD_ATTRS(new_path->parent->baserestrictinfo);
            if (path->pathtype == T_SampleScan)
            {

Is it a typo and should be new_path?


Besides, it may not be important, but reviewing your code all the time stumbled on the statement of the comments while reading it (I highlighted it too). This:

* path_is_reparameterizable_by_child
 *         Given a path parameterized by the parent of the given child relation,
 *         see if it can be translated to be parameterized by the child relation.
 *
 * This must return true if and only if reparameterize_path_by_child()
 * would succeed on this path.

Maybe is it better to rewrite it simplier:

 * This must return true only if reparameterize_path_by_child()
 * would succeed on this path.


And can we add assert in reparameterize_pathlist_by_child function that pathlist is not a NIL, because according to the comment it needs to be added there:Returns NIL to indicate failure, so pathlist had better not be NIL.

-- 
Regards,
Alena Rybakina

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Fri, Oct 20, 2023 at 2:52 AM Alena Rybakina <lena.ribackina@yandex.ru> wrote:

Thank you for your work on the subject.

Thanks for taking an interest in this patch. 
 

During review your patch I didn't understand why are you checking that the variable is path and not new_path of type T_SamplePath (I highlighted it)?

Is it a typo and should be new_path?

I don't think there is any difference: path and new_path are the same
pointer in this case.
 

Besides, it may not be important, but reviewing your code all the time stumbled on the statement of the comments while reading it (I highlighted it too). This:

* path_is_reparameterizable_by_child
 *         Given a path parameterized by the parent of the given child relation,
 *         see if it can be translated to be parameterized by the child relation.
 *
 * This must return true if and only if reparameterize_path_by_child()
 * would succeed on this path.

Maybe is it better to rewrite it simplier:

 * This must return true only if reparameterize_path_by_child()
 * would succeed on this path.

I don't think so.  "if and only if" is more accurate to me.
 

And can we add assert in reparameterize_pathlist_by_child function that pathlist is not a NIL, because according to the comment it needs to be added there:

Hmm, I'm not sure, as in REPARAMETERIZE_CHILD_PATH_LIST we have already
explicitly checked that the pathlist is not NIL.

Thanks
Richard 

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:
I've self-reviewed this patch again and I think it's now in a
committable state.  I'm wondering if we can mark it as 'Ready for
Committer' now, or we need more review comments/feedbacks.

To recap, this patch postpones reparameterization of paths until
createplan.c, which would help avoid building the reparameterized
expressions we might not use.  More importantly, it makes it possible to
modify the expressions in RTEs (e.g. tablesample) and in RelOptInfos
(e.g. baserestrictinfo) for reparameterization.  Failing to do that can
lead to executor crashes, wrong results, or planning errors, as we have
already seen.

Thanks
Richard

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Alena Rybakina
Дата:
On 06.12.2023 10:30, Richard Guo wrote:
> I've self-reviewed this patch again and I think it's now in a
> committable state.  I'm wondering if we can mark it as 'Ready for
> Committer' now, or we need more review comments/feedbacks.
>
> To recap, this patch postpones reparameterization of paths until
> createplan.c, which would help avoid building the reparameterized
> expressions we might not use.  More importantly, it makes it possible to
> modify the expressions in RTEs (e.g. tablesample) and in RelOptInfos
> (e.g. baserestrictinfo) for reparameterization.  Failing to do that can
> lead to executor crashes, wrong results, or planning errors, as we have
> already seen.
I marked it as 'Ready for Committer'. I think it is ready.

-- 
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Fri, Dec 8, 2023 at 5:39 PM Alena Rybakina <lena.ribackina@yandex.ru> wrote:
On 06.12.2023 10:30, Richard Guo wrote:
> I've self-reviewed this patch again and I think it's now in a
> committable state.  I'm wondering if we can mark it as 'Ready for
> Committer' now, or we need more review comments/feedbacks.
>
> To recap, this patch postpones reparameterization of paths until
> createplan.c, which would help avoid building the reparameterized
> expressions we might not use.  More importantly, it makes it possible to
> modify the expressions in RTEs (e.g. tablesample) and in RelOptInfos
> (e.g. baserestrictinfo) for reparameterization.  Failing to do that can
> lead to executor crashes, wrong results, or planning errors, as we have
> already seen. 
I marked it as 'Ready for Committer'. I think it is ready.

Thank you.  Appreciate that.

Thanks
Richard

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Andrei Lepikhov
Дата:
On 6/12/2023 14:30, Richard Guo wrote:
> I've self-reviewed this patch again and I think it's now in a
> committable state.  I'm wondering if we can mark it as 'Ready for
> Committer' now, or we need more review comments/feedbacks.
> 
> To recap, this patch postpones reparameterization of paths until
> createplan.c, which would help avoid building the reparameterized
> expressions we might not use.  More importantly, it makes it possible to
> modify the expressions in RTEs (e.g. tablesample) and in RelOptInfos
> (e.g. baserestrictinfo) for reparameterization.  Failing to do that can
> lead to executor crashes, wrong results, or planning errors, as we have
> already seen.

As I see, this patch contains the following modifications:
1. Changes in the create_nestloop_path: here, it arranges all tests to 
the value of top_parent_relids, if any. It is ok for me.
2. Changes in reparameterize_path_by_child and coupled new function 
path_is_reparameterizable_by_child. I compared them, and it looks good.
One thing here. You use the assertion trap in the case of inconsistency 
in the behaviour of these two functions. How disastrous would it be if 
someone found such inconsistency in production? Maybe better to use 
elog(PANIC, ...) report?
3. ADJUST_CHILD_ATTRS() macros. Understanding the necessity of this 
change, I want to say it looks a bit ugly.
4. SampleScan reparameterization part. It looks ok. It can help us in 
the future with asymmetric join and something else.
5. Tests. All of them are related to partitioning and the SampleScan 
issue. Maybe it is enough, but remember, this change now impacts the 
parameterization feature in general.
6. I can't judge how this switch of overall approach could impact 
something in the planner. I am only wary about what, from the first 
reparameterization up to the plan creation, we have some inconsistency 
in expressions (partitions refer to expressions with the parent relid). 
What if an extension in the middle changes the parent expression?

By and large, this patch is in a good state and may be committed.

-- 
regards,
Andrei Lepikhov
Postgres Professional




Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Robert Haas
Дата:
On Tue, Dec 12, 2023 at 9:55 PM Andrei Lepikhov
<a.lepikhov@postgrespro.ru> wrote:
> By and large, this patch is in a good state and may be committed.

OK, so a few people like the current form of this patch but we haven't
heard from Tom since August. Tom, any thoughts on the current
incarnation?

Richard, I think it could be useful to put a better commit message
into the patch file, describing both what problem is being fixed and
what the design of the fix is. I gather that the problem is that we
crash if the query contains a partioningwise join and also $SOMETHING,
and the solution is to move reparameterization to happen at
createplan() time but with a precheck that runs during path
generation. Presumably, that means this is more than a minimal bug
fix, because the bug could be fixed without splitting
can-it-be-reparameterized to reparameterize-it in this way. Probably
that's a performance optimization, so maybe it's worth clarifying
whether that's just an independently good idea or whether it's a part
of making the bug fix not regress performance.

I think the macro names in path_is_reparameterizable_by_child could be
better chosen. CHILD_PATH_IS_REPARAMETERIZABLE doesn't convey that the
macro will return from the calling function if not -- it looks like it
just returns a Boolean. Maybe REJECT_IF_PATH_NOT_REPARAMETERIZABLE and
REJECT_IF_PATH_LIST_NOT_REPARAMETERIZABLE or some such.

Another question here is whether we really want to back-patch all of
this or whether it might be better to, as Tom proposed previously,
back-patch a more minimal fix and leave the more invasive stuff for
master.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> OK, so a few people like the current form of this patch but we haven't
> heard from Tom since August. Tom, any thoughts on the current
> incarnation?

Not yet, but it is on my to-look-at list.  In the meantime I concur
with your comments here.

            regards, tom lane



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:
Thanks for the review!

On Sat, Jan 6, 2024 at 2:36 AM Robert Haas <robertmhaas@gmail.com> wrote:
Richard, I think it could be useful to put a better commit message
into the patch file, describing both what problem is being fixed and
what the design of the fix is. I gather that the problem is that we
crash if the query contains a partioningwise join and also $SOMETHING,
and the solution is to move reparameterization to happen at
createplan() time but with a precheck that runs during path
generation. Presumably, that means this is more than a minimal bug
fix, because the bug could be fixed without splitting
can-it-be-reparameterized to reparameterize-it in this way. Probably
that's a performance optimization, so maybe it's worth clarifying
whether that's just an independently good idea or whether it's a part
of making the bug fix not regress performance.

Thanks for the suggestion.  Attached is an updated patch which is added
with a commit message that tries to explain the problem and the fix.

I think the macro names in path_is_reparameterizable_by_child could be
better chosen. CHILD_PATH_IS_REPARAMETERIZABLE doesn't convey that the
macro will return from the calling function if not -- it looks like it
just returns a Boolean. Maybe REJECT_IF_PATH_NOT_REPARAMETERIZABLE and
REJECT_IF_PATH_LIST_NOT_REPARAMETERIZABLE or some such.

Agreed.
 
Another question here is whether we really want to back-patch all of
this or whether it might be better to, as Tom proposed previously,
back-patch a more minimal fix and leave the more invasive stuff for
master.

Fair point.  I think we can back-patch a more minimal fix, as Tom
proposed in [1], which disallows the reparameterization if the path
contains sampling info that references the outer rel.  But I think we
need also to disallow the reparameterization of a path if it contains
restriction clauses that reference the outer rel, as such paths have
been found to cause incorrect results, or planning errors as in [2].

[1] https://www.postgresql.org/message-id/3163033.1692719009%40sss.pgh.pa.us
[2] https://www.postgresql.org/message-id/CAMbWs4-CSR4VnZCDep3ReSoHGTA7E%2B3tnjF_LmHcX7yiGrkVfQ%40mail.gmail.com

Thanks
Richard
Вложения

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Robert Haas
Дата:
On Mon, Jan 8, 2024 at 3:32 AM Richard Guo <guofenglinux@gmail.com> wrote:
> Thanks for the suggestion.  Attached is an updated patch which is added
> with a commit message that tries to explain the problem and the fix.

This is great. The references to "the sampling infos" are a little bit
confusing to me. I'm not sure if this is referring to
TableSampleClause objects or what.

> Fair point.  I think we can back-patch a more minimal fix, as Tom
> proposed in [1], which disallows the reparameterization if the path
> contains sampling info that references the outer rel.  But I think we
> need also to disallow the reparameterization of a path if it contains
> restriction clauses that reference the outer rel, as such paths have
> been found to cause incorrect results, or planning errors as in [2].

Do you want to produce a patch for that, to go along with this patch for master?

I know this is on Tom's to-do list which makes me a bit reluctant to
get too involved here, because certainly he knows this code better
than I do, maybe better than anyone does, but on the other hand, we
shouldn't leave server crashes unfixed for too long, so maybe I can do
something to help at least with that part of it.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> I know this is on Tom's to-do list which makes me a bit reluctant to
> get too involved here, because certainly he knows this code better
> than I do, maybe better than anyone does, but on the other hand, we
> shouldn't leave server crashes unfixed for too long, so maybe I can do
> something to help at least with that part of it.

This is indeed on my to-do list, and I have every intention of
getting to it before the end of the month.  But if you want to
help push things along, feel free.

            regards, tom lane



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Tue, Jan 16, 2024 at 2:30 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jan 8, 2024 at 3:32 AM Richard Guo <guofenglinux@gmail.com> wrote:
> Thanks for the suggestion.  Attached is an updated patch which is added
> with a commit message that tries to explain the problem and the fix.

This is great. The references to "the sampling infos" are a little bit
confusing to me. I'm not sure if this is referring to
TableSampleClause objects or what.

Yeah, it's referring to TableSampleClause objects.  I've updated the
commit message to clarify that.  In passing I also updated the test
cases a bit.  Please see
v10-0001-Postpone-reparameterization-of-paths-until-when-creating-plans.patch
 
> Fair point.  I think we can back-patch a more minimal fix, as Tom
> proposed in [1], which disallows the reparameterization if the path
> contains sampling info that references the outer rel.  But I think we
> need also to disallow the reparameterization of a path if it contains
> restriction clauses that reference the outer rel, as such paths have
> been found to cause incorrect results, or planning errors as in [2].

Do you want to produce a patch for that, to go along with this patch for master?

Sure, here it is:
v10-0001-Avoid-reparameterizing-Paths-when-it-s-not-suitable.patch

Thanks
Richard
Вложения

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Wed, Jan 17, 2024 at 5:01 PM Richard Guo <guofenglinux@gmail.com> wrote:
On Tue, Jan 16, 2024 at 2:30 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Jan 8, 2024 at 3:32 AM Richard Guo <guofenglinux@gmail.com> wrote:
> Fair point.  I think we can back-patch a more minimal fix, as Tom
> proposed in [1], which disallows the reparameterization if the path
> contains sampling info that references the outer rel.  But I think we
> need also to disallow the reparameterization of a path if it contains
> restriction clauses that reference the outer rel, as such paths have
> been found to cause incorrect results, or planning errors as in [2].

Do you want to produce a patch for that, to go along with this patch for master?

Sure, here it is:
v10-0001-Avoid-reparameterizing-Paths-when-it-s-not-suitable.patch

I forgot to mention that this patch applies on v16 not master.

Thanks
Richard

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> On Wed, Jan 17, 2024 at 5:01 PM Richard Guo <guofenglinux@gmail.com> wrote:
>> Sure, here it is:
>> v10-0001-Avoid-reparameterizing-Paths-when-it-s-not-suitable.patch

> I forgot to mention that this patch applies on v16 not master.

I spent some time looking at this patch (which seems more urgent than
the patch for master, given that we have back-branch releases coming
up).  There are two things I'm not persuaded about:

* Why is it okay to just use pull_varnos on a tablesample expression,
when contain_references_to() does something different?

* Is contain_references_to() doing the right thing by specifying
PVC_RECURSE_PLACEHOLDERS?  That causes it to totally ignore a
PlaceHolderVar's ph_eval_at, and I'm not convinced that's correct.

Ideally it seems to me that we want to reject a PlaceHolderVar
if either its ph_eval_at or ph_lateral overlap the other join
relation; if it was coded that way then we'd not need to recurse
into the PHV's contents.   pull_varnos isn't directly amenable
to this, but I think we could use pull_var_clause with
PVC_INCLUDE_PLACEHOLDERS and then iterate through the resulting
list manually.  (If this patch were meant for HEAD, I'd think
about extending the var.c code to support this usage more directly.
But as things stand, this is a one-off so I think we should just do
what we must in reparameterize_path_by_child.)

BTW, it shouldn't be necessary to write either PVC_RECURSE_AGGREGATES
or PVC_RECURSE_WINDOWFUNCS, because neither kind of node should ever
appear in a scan-level expression.  I'd leave those out so that we
get an error if something unexpected happens.

            regards, tom lane



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Wed, Jan 31, 2024 at 5:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> On Wed, Jan 17, 2024 at 5:01 PM Richard Guo <guofenglinux@gmail.com> wrote:
>> Sure, here it is:
>> v10-0001-Avoid-reparameterizing-Paths-when-it-s-not-suitable.patch

> I forgot to mention that this patch applies on v16 not master.

I spent some time looking at this patch (which seems more urgent than
the patch for master, given that we have back-branch releases coming
up). 

Thanks for looking at this patch!
 
There are two things I'm not persuaded about:

* Why is it okay to just use pull_varnos on a tablesample expression,
when contain_references_to() does something different?

Hmm, the main reason is that the expression_tree_walker() function does
not handle T_RestrictInfo nodes.  So we cannot just use pull_varnos or
pull_var_clause on a restrictinfo list.  So contain_references_to()
calls extract_actual_clauses() first before it calls pull_var_clause().
 
* Is contain_references_to() doing the right thing by specifying
PVC_RECURSE_PLACEHOLDERS?  That causes it to totally ignore a
PlaceHolderVar's ph_eval_at, and I'm not convinced that's correct.

I was thinking that we should recurse into the PHV's contents to see if
there are any lateral references to the other join relation.  But now I
realize that checking phinfo->ph_lateral, as you suggested, might be a
better way to do that.

But I'm not sure about checking phinfo->ph_eval_at.  It seems to me that
the ph_eval_at could not overlap the other join relation; otherwise the
clause would not be a restriction clause but rather a join clause.
 
Ideally it seems to me that we want to reject a PlaceHolderVar
if either its ph_eval_at or ph_lateral overlap the other join
relation; if it was coded that way then we'd not need to recurse
into the PHV's contents.   pull_varnos isn't directly amenable
to this, but I think we could use pull_var_clause with
PVC_INCLUDE_PLACEHOLDERS and then iterate through the resulting
list manually.  (If this patch were meant for HEAD, I'd think
about extending the var.c code to support this usage more directly.
But as things stand, this is a one-off so I think we should just do
what we must in reparameterize_path_by_child.)

Thanks for the suggestion.  I've coded it this way in the v11 patch, and
left a XXX comment about checking phinfo->ph_eval_at.
 
BTW, it shouldn't be necessary to write either PVC_RECURSE_AGGREGATES
or PVC_RECURSE_WINDOWFUNCS, because neither kind of node should ever
appear in a scan-level expression.  I'd leave those out so that we
get an error if something unexpected happens.

Good point.

Thanks
Richard
Вложения

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> On Wed, Jan 31, 2024 at 5:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> * Why is it okay to just use pull_varnos on a tablesample expression,
>> when contain_references_to() does something different?

> Hmm, the main reason is that the expression_tree_walker() function does
> not handle T_RestrictInfo nodes.  So we cannot just use pull_varnos or
> pull_var_clause on a restrictinfo list.

Right, the extract_actual_clauses step is not wanted for the
tablesample expression.  But my point is that surely the handling of
Vars and PlaceHolderVars should be the same, which it's not, and your
v11 makes it even less so.  How can it be OK to ignore Vars in the
restrictinfo case?

I think the code structure we need to end up with is a routine that
takes a RestrictInfo-free node tree (and is called directly in the
tablesample case) with a wrapper routine that strips the RestrictInfos
(for use on restriction lists).

> But I'm not sure about checking phinfo->ph_eval_at.  It seems to me that
> the ph_eval_at could not overlap the other join relation; otherwise the
> clause would not be a restriction clause but rather a join clause.

At least in the tablesample case, plain Vars as well as PHVs belonging
to the other relation are definitely possible.

            regards, tom lane



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Wed, Jan 31, 2024 at 11:21 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> On Wed, Jan 31, 2024 at 5:12 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> * Why is it okay to just use pull_varnos on a tablesample expression,
>> when contain_references_to() does something different?

> Hmm, the main reason is that the expression_tree_walker() function does
> not handle T_RestrictInfo nodes.  So we cannot just use pull_varnos or
> pull_var_clause on a restrictinfo list.

Right, the extract_actual_clauses step is not wanted for the
tablesample expression.  But my point is that surely the handling of
Vars and PlaceHolderVars should be the same, which it's not, and your
v11 makes it even less so.  How can it be OK to ignore Vars in the
restrictinfo case?

Hmm, in this specific scenario it seems that it's not possible to have
Vars in the restrictinfo list that laterally reference the outer join
relation; otherwise the clause containing such Vars would not be a
restriction clause but rather a join clause.  So in v11 I did not check
Vars in contain_references_to().

But I agree that we'd better to handle Vars and PlaceHolderVars in the
same way.
 
I think the code structure we need to end up with is a routine that
takes a RestrictInfo-free node tree (and is called directly in the
tablesample case) with a wrapper routine that strips the RestrictInfos
(for use on restriction lists).

How about the attached v12 patch?  But I'm a little concerned about
omitting PVC_RECURSE_AGGREGATES and PVC_RECURSE_WINDOWFUNCS in this
case.  Is it possible that aggregates or window functions appear in the
tablesample expression?

Thanks
Richard
Вложения

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> How about the attached v12 patch?  But I'm a little concerned about
> omitting PVC_RECURSE_AGGREGATES and PVC_RECURSE_WINDOWFUNCS in this
> case.  Is it possible that aggregates or window functions appear in the
> tablesample expression?

No, they can't appear there; it'd make no sense to allow such things
at the relation scan level, and we don't.

regression=# select * from float8_tbl tablesample system(count(*));
ERROR:  aggregate functions are not allowed in functions in FROM
LINE 1: select * from float8_tbl tablesample system(count(*));
                                                    ^
regression=# select * from float8_tbl tablesample system(count(*) over ());
ERROR:  window functions are not allowed in functions in FROM
LINE 1: select * from float8_tbl tablesample system(count(*) over ()...
                                                    ^

I pushed v12 (with some cosmetic adjustments) into the back branches,
since we're getting close to the February release freeze.  We still
need to deal with the larger fix for HEAD.  Please re-post that
one so that the cfbot knows which is the patch-of-record.

            regards, tom lane



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Fri, Feb 2, 2024 at 1:45 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
I pushed v12 (with some cosmetic adjustments) into the back branches,
since we're getting close to the February release freeze.  We still
need to deal with the larger fix for HEAD.  Please re-post that
one so that the cfbot knows which is the patch-of-record.

Thanks for the adjustments and pushing!

Here is the patch for HEAD.  I simply re-posted v10.  Nothing has
changed.

Thanks
Richard
Вложения

Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Tom Lane
Дата:
Richard Guo <guofenglinux@gmail.com> writes:
> Here is the patch for HEAD.  I simply re-posted v10.  Nothing has
> changed.

I got back to this finally, and pushed it with some minor cosmetic
adjustments.

            regards, tom lane



Re: Oversight in reparameterize_path_by_child leading to executor crash

От
Richard Guo
Дата:

On Wed, Mar 20, 2024 at 2:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> Here is the patch for HEAD.  I simply re-posted v10.  Nothing has
> changed.

I got back to this finally, and pushed it with some minor cosmetic
adjustments.

Thanks for pushing!

Thanks
Richard