Обсуждение: path->param_info only set for lateral?
Hello, Over in "Parallelize correlated subqueries that execute within each worker" [1} Richard Guo found a bug in the current version of my patch in that thread. While debugging that issue I've been wondering why Path's param_info field seems to be NULL unless there is a LATERAL reference even though there may be non-lateral outer params referenced. Consider the query: select * from pg_description t1 where objoid in (select objoid from pg_description t2 where t2.description = t1.description); The subquery's rel has a baserestrictinfo containing an OpExpr comparing a Var (t2.description) to a Param of type PARAM_EXEC (t1.description). But the generated SeqScan path doesn't have its param_info field set, which means PATH_REQ_OUTER returns NULL also despite there being an obvious param referencing a required outer relid. Looking at create_seqscan_path we see that param_info is initialized with: get_baserel_parampathinfo(root, rel, required_outer) where required_outer is passed in from set_plain_rel_pathlist as rel->lateral_relids. And get_baserel_parampathinfo always returns NULL if required_outer is empty, so obviously with this query (no lateral reference) we're not going to get any ParamPathInfo added to the path or the rel. Is there a reason why we don't track the required relids providing the PARAM_EXEC params in this case? Thanks, James Coleman 1: https://www.postgresql.org/message-id/CAMbWs4_evjcMzN8Gw78bHfhfo2FKJThqhEjRJRmoMZx%3DNXcJ7w%40mail.gmail.com
James Coleman <jtc331@gmail.com> writes: > Over in "Parallelize correlated subqueries that execute within each > worker" [1} Richard Guo found a bug in the current version of my patch > in that thread. While debugging that issue I've been wondering why > Path's param_info field seems to be NULL unless there is a LATERAL > reference even though there may be non-lateral outer params > referenced. Per pathnodes.h: * "param_info", if not NULL, links to a ParamPathInfo that identifies outer * relation(s) that provide parameter values to each scan of this path. * That means this path can only be joined to those rels by means of nestloop * joins with this path on the inside. ... We're only interested in this for params that are coming from other relations of the same query level, so that they affect join order and join algorithm choices. Params coming down from outer query levels are much like EXTERN params to the planner: they are pseudoconstants for any one execution of the current query level. This isn't just LATERAL stuff; it's also intentionally-generated nestloop-with-inner-indexscan-cases. But it's not outer-level Params. Even though those are also PARAM_EXEC Params, they are fundamentally different animals for the planner's purposes. regards, tom lane
On Sun, Jun 18, 2023 at 10:57 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > James Coleman <jtc331@gmail.com> writes: > > Over in "Parallelize correlated subqueries that execute within each > > worker" [1} Richard Guo found a bug in the current version of my patch > > in that thread. While debugging that issue I've been wondering why > > Path's param_info field seems to be NULL unless there is a LATERAL > > reference even though there may be non-lateral outer params > > referenced. > > Per pathnodes.h: > > * "param_info", if not NULL, links to a ParamPathInfo that identifies outer > * relation(s) that provide parameter values to each scan of this path. > * That means this path can only be joined to those rels by means of nestloop > * joins with this path on the inside. ... > > We're only interested in this for params that are coming from other > relations of the same query level, so that they affect join order and > join algorithm choices. Params coming down from outer query levels > are much like EXTERN params to the planner: they are pseudoconstants > for any one execution of the current query level. > > This isn't just LATERAL stuff; it's also intentionally-generated > nestloop-with-inner-indexscan-cases. But it's not outer-level Params. > Even though those are also PARAM_EXEC Params, they are fundamentally > different animals for the planner's purposes. Thanks for the explanation. I wonder if it'd be worth clarifying the comment slightly to hint in that direction (like the attached)? Thanks, James Coleman