Обсуждение: path->param_info only set for lateral?

Поиск
Список
Период
Сортировка

path->param_info only set for lateral?

От
James Coleman
Дата:
Hello,

Over in "Parallelize correlated subqueries that execute within each
worker" [1} Richard Guo found a bug in the current version of my patch
in that thread. While debugging that issue I've been wondering why
Path's param_info field seems to be NULL unless there is a LATERAL
reference even though there may be non-lateral outer params
referenced.

Consider the query:
select * from pg_description t1 where objoid in
    (select objoid from pg_description t2 where t2.description =
t1.description);

The subquery's rel has a baserestrictinfo containing an OpExpr
comparing a Var (t2.description) to a Param of type PARAM_EXEC
(t1.description). But the generated SeqScan path doesn't have its
param_info field set, which means PATH_REQ_OUTER returns NULL also
despite there being an obvious param referencing a required outer
relid. Looking at create_seqscan_path we see that param_info is
initialized with:

get_baserel_parampathinfo(root, rel, required_outer)

where required_outer is passed in from set_plain_rel_pathlist as
rel->lateral_relids. And get_baserel_parampathinfo always returns NULL
if required_outer is empty, so obviously with this query (no lateral
reference) we're not going to get any ParamPathInfo added to the path
or the rel.

Is there a reason why we don't track the required relids providing the
PARAM_EXEC params in this case?

Thanks,
James Coleman

1: https://www.postgresql.org/message-id/CAMbWs4_evjcMzN8Gw78bHfhfo2FKJThqhEjRJRmoMZx%3DNXcJ7w%40mail.gmail.com



Re: path->param_info only set for lateral?

От
Tom Lane
Дата:
James Coleman <jtc331@gmail.com> writes:
> Over in "Parallelize correlated subqueries that execute within each
> worker" [1} Richard Guo found a bug in the current version of my patch
> in that thread. While debugging that issue I've been wondering why
> Path's param_info field seems to be NULL unless there is a LATERAL
> reference even though there may be non-lateral outer params
> referenced.

Per pathnodes.h:

 * "param_info", if not NULL, links to a ParamPathInfo that identifies outer
 * relation(s) that provide parameter values to each scan of this path.
 * That means this path can only be joined to those rels by means of nestloop
 * joins with this path on the inside.  ...

We're only interested in this for params that are coming from other
relations of the same query level, so that they affect join order and
join algorithm choices.  Params coming down from outer query levels
are much like EXTERN params to the planner: they are pseudoconstants
for any one execution of the current query level.

This isn't just LATERAL stuff; it's also intentionally-generated
nestloop-with-inner-indexscan-cases.  But it's not outer-level Params.
Even though those are also PARAM_EXEC Params, they are fundamentally
different animals for the planner's purposes.

            regards, tom lane



Re: path->param_info only set for lateral?

От
James Coleman
Дата:
On Sun, Jun 18, 2023 at 10:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> James Coleman <jtc331@gmail.com> writes:
> > Over in "Parallelize correlated subqueries that execute within each
> > worker" [1} Richard Guo found a bug in the current version of my patch
> > in that thread. While debugging that issue I've been wondering why
> > Path's param_info field seems to be NULL unless there is a LATERAL
> > reference even though there may be non-lateral outer params
> > referenced.
>
> Per pathnodes.h:
>
>  * "param_info", if not NULL, links to a ParamPathInfo that identifies outer
>  * relation(s) that provide parameter values to each scan of this path.
>  * That means this path can only be joined to those rels by means of nestloop
>  * joins with this path on the inside.  ...
>
> We're only interested in this for params that are coming from other
> relations of the same query level, so that they affect join order and
> join algorithm choices.  Params coming down from outer query levels
> are much like EXTERN params to the planner: they are pseudoconstants
> for any one execution of the current query level.
>
> This isn't just LATERAL stuff; it's also intentionally-generated
> nestloop-with-inner-indexscan-cases.  But it's not outer-level Params.
> Even though those are also PARAM_EXEC Params, they are fundamentally
> different animals for the planner's purposes.

Thanks for the explanation.

I wonder if it'd be worth clarifying the comment slightly to hint in
that direction (like the attached)?

Thanks,
James Coleman

Вложения