Re: Oversight in reparameterize_path_by_child leading to executor crash

Поиск
Список
Период
Сортировка
От Richard Guo
Тема Re: Oversight in reparameterize_path_by_child leading to executor crash
Дата
Msg-id CAMbWs4-CSR4VnZCDep3ReSoHGTA7E+3tnjF_LmHcX7yiGrkVfQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Oversight in reparameterize_path_by_child leading to executor crash  (Tom Lane <tgl@sss.pgh.pa.us>)
Ответы Re: Oversight in reparameterize_path_by_child leading to executor crash  (Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>)
Re: Oversight in reparameterize_path_by_child leading to executor crash  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Oversight in reparameterize_path_by_child leading to executor crash  (Andrey Lepikhov <a.lepikhov@postgrespro.ru>)
Re: Oversight in reparameterize_path_by_child leading to executor crash  (Andrei Lepikhov <a.lepikhov@postgrespro.ru>)
Список pgsql-hackers

On Tue, Aug 22, 2023 at 10:39 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Guo <guofenglinux@gmail.com> writes:
> I'm wondering if we can instead adjust the 'inner_req_outer' in
> create_nestloop_path before we perform the check to work around this
> issue for the back branches, something like
> +   /*
> +    * Adjust the parameterization information, which refers to the topmost
> +    * parent.
> +    */
> +   inner_req_outer =
> +       adjust_child_relids_multilevel(root, inner_req_outer,
> +                                      outer_path->parent->relids,
> +                                      outer_path->parent->top_parent_relids);

Mmm ... at the very least you'd need to not do that when top_parent_relids
is unset.

Hmm, adjust_child_relids_multilevel would just return the given relids
unchanged when top_parent_relids is unset, so I think it should be safe
to call it even for non-other rel.
 
...  But I think the risk/reward ratio for messing with this in the
back branches is unattractive in any case: to fix a corner case that
apparently nobody uses in the field, we risk breaking any number of
mainstream parameterized-path cases.  I'm content to commit the v5 patch
(or a successor) into HEAD, but at this point I'm not sure I even want
to risk it in v16, let alone perform delicate surgery to get it to work
in older branches.  I think we ought to go with the "tablesample scans
can't be reparameterized" approach in v16 and before.

If we go with the "tablesample scans can't be reparameterized" approach
in the back branches, I'm a little concerned that what if we find more
cases in the futrue where we need modify RTEs for reparameterization.
So I spent some time seeking and have managed to find one: there might
be lateral references in a scan path's restriction clauses, and
currently reparameterize_path_by_child fails to adjust them.

regression=# explain (verbose, costs off)
select count(*) from prt1 t1 left join lateral
    (select t1.b as t1b, t2.* from prt2 t2) s
    on t1.a = s.b where s.t1b = s.a;
                              QUERY PLAN
----------------------------------------------------------------------
 Aggregate
   Output: count(*)
   ->  Append
         ->  Nested Loop
               ->  Seq Scan on public.prt1_p1 t1_1
                     Output: t1_1.a, t1_1.b
               ->  Index Scan using iprt2_p1_b on public.prt2_p1 t2_1
                     Output: t2_1.b
                     Index Cond: (t2_1.b = t1_1.a)
                     Filter: (t1.b = t2_1.a)
         ->  Nested Loop
               ->  Seq Scan on public.prt1_p2 t1_2
                     Output: t1_2.a, t1_2.b
               ->  Index Scan using iprt2_p2_b on public.prt2_p2 t2_2
                     Output: t2_2.b
                     Index Cond: (t2_2.b = t1_2.a)
                     Filter: (t1.b = t2_2.a)
         ->  Nested Loop
               ->  Seq Scan on public.prt1_p3 t1_3
                     Output: t1_3.a, t1_3.b
               ->  Index Scan using iprt2_p3_b on public.prt2_p3 t2_3
                     Output: t2_3.b
                     Index Cond: (t2_3.b = t1_3.a)
                     Filter: (t1.b = t2_3.a)
(24 rows)

The clauses in 'Filter' are not adjusted correctly.  Var 't1.b' still
refers to the top parent.

For this query it would just give wrong results.

regression=# set enable_partitionwise_join to off;
SET
regression=# select count(*) from prt1 t1 left join lateral
    (select t1.b as t1b, t2.* from prt2 t2) s
    on t1.a = s.b where s.t1b = s.a;
 count
-------
   100
(1 row)

regression=# set enable_partitionwise_join to on;
SET
regression=# select count(*) from prt1 t1 left join lateral
    (select t1.b as t1b, t2.* from prt2 t2) s
    on t1.a = s.b where s.t1b = s.a;
 count
-------
     5
(1 row)


For another query it would error out during planning.

regression=# explain (verbose, costs off)
select count(*) from prt1 t1 left join lateral
    (select t1.b as t1b, t2.* from prt2 t2) s
    on t1.a = s.b where s.t1b = s.b;
ERROR:  variable not found in subplan target list

To fix it we may need to modify RelOptInfos for Path, BitmapHeapPath,
ForeignPath and CustomPath, and modify IndexOptInfos for IndexPath.  It
seems that that is not easily done without postponing reparameterization
of paths until createplan.c.

Attached is a patch which is v5 + fix for this new issue.

Thanks
Richard
Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: vignesh C
Дата:
Сообщение: Re: persist logical slots to disk during shutdown checkpoint
Следующее
От: shveta malik
Дата:
Сообщение: Re: Synchronizing slots from primary to standby