Обсуждение: BUG #17769: Assert triggered in indxpath.c
The following bug has been logged on the website: Bug reference: 17769 Logged by: Robins Tharakan Email address: tharakan@gmail.com PostgreSQL version: 15.1 Operating system: Ubuntu 20.04 Description: This assert() is reproducible. It is similar to a recent bug-report [1], but reporting separately since the TRAP signature is different. TRAP: failed Assert("outer_rel->rows > 0"), File: "indxpath.c", Line: 1909, PID: 3364016 Git: 117d2604c2@master SQL / backtrace / backtrace full excerpt below. Backtrace ========= Core was generated by `postgres: 117d2604c2@master@sqith: ubuntu postgres 127.0.0.1(55644) SELECT '. Program terminated with signal SIGABRT, Aborted. #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #0 __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50 #1 0x00007f8691c5c859 in __GI_abort () at abort.c:79 #2 0x0000558e00499dfe in ExceptionalCondition (conditionName=0x558e0064c590 "outer_rel->rows > 0", fileName=0x558e0064c4cc "indxpath.c", lineNumber=1909) at assert.c:66 #3 0x0000558e0015194f in get_loop_count (root=0x558e01c056e0, cur_relid=3, outer_relids=0x7f8676a230e8) at indxpath.c:1909 #4 0x0000558e0014fe33 in build_index_paths (root=0x558e01c056e0, rel=0x558e01c201a0, index=0x558e01c2ba98, clauses=0x7ffed1004890, useful_predicate=false, scantype=ST_ANYSCAN, skip_nonnative_saop=0x7ffed10047a6, skip_lower_saop=0x7ffed10047a7) at indxpath.c:957 #5 0x0000558e0014f9b4 in get_index_paths (root=0x558e01c056e0, rel=0x558e01c201a0, index=0x558e01c2ba98, clauses=0x7ffed1004890, bitindexpaths=0x7ffed1004b20) at indxpath.c:728 #6 0x0000558e0014f83b in get_join_index_paths (root=0x558e01c056e0, rel=0x558e01c201a0, index=0x558e01c2ba98, rclauseset=0x7ffed1004c30, jclauseset=0x7ffed1004d40, eclauseset=0x7ffed1004e50, bitindexpaths=0x7ffed1004b20, relids=0x7f8676a230c8, considered_relids=0x7ffed1004ac0) at indxpath.c:665 #7 0x0000558e0014f3a5 in consider_index_join_outer_rels (root=0x558e01c056e0, rel=0x558e01c201a0, index=0x558e01c2ba98, rclauseset=0x7ffed1004c30, jclauseset=0x7ffed1004d40, eclauseset=0x7ffed1004e50, bitindexpaths=0x7ffed1004b20, indexjoinclauses=0x7f8676a23368, considered_clauses=1, considered_relids=0x7ffed1004ac0) at indxpath.c:579 #8 0x0000558e0014f1cf in consider_index_join_clauses (root=0x558e01c056e0, rel=0x558e01c201a0, index=0x558e01c2ba98, rclauseset=0x7ffed1004c30, jclauseset=0x7ffed1004d40, eclauseset=0x7ffed1004e50, bitindexpaths=0x7ffed1004b20) at indxpath.c:476 #9 0x0000558e0014eb20 in create_index_paths (root=0x558e01c056e0, rel=0x558e01c201a0) at indxpath.c:306 #10 0x0000558e0013625e in set_plain_rel_pathlist (root=0x558e01c056e0, rel=0x558e01c201a0, rte=0x558e01c05050) at allpaths.c:768 #11 0x0000558e00135dcb in set_rel_pathlist (root=0x558e01c056e0, rel=0x558e01c201a0, rti=3, rte=0x558e01c05050) at allpaths.c:484 #12 0x0000558e00135a11 in set_base_rel_pathlists (root=0x558e01c056e0) at allpaths.c:336 #13 0x0000558e00135745 in make_one_rel (root=0x558e01c056e0, joinlist=0x7f8676a224c8) at allpaths.c:206 #14 0x0000558e00175ae4 in query_planner (root=0x558e01c056e0, qp_callback=0x558e0017c30f <standard_qp_callback>, qp_extra=0x7ffed1005210) at planmain.c:278 #15 0x0000558e00178556 in grouping_planner (root=0x558e01c056e0, tuple_fraction=0) at planner.c:1496 #16 0x0000558e00177c05 in subquery_planner (glob=0x558e01c04e30, parse=0x558e01aa9238, parent_root=0x0, hasRecursion=false, tuple_fraction=0) at planner.c:1065 #17 0x0000558e001761bd in standard_planner (parse=0x558e01aa9238, query_string=0x558e01aa7f38 "SELECT\nFROM pg_catalog.pg_statio_all_tables AS ref_0,\n LATERAL (SELECT\n", ' ' <repeats 14 times>, "WHERE ref_0.schemaname = ref_0.relname) AS subq_0\n ;", cursorOptions=2048, boundParams=0x0) at planner.c:411 Backtrace full excerpt ====================== #2 0x0000558e00499dfe in ExceptionalCondition (conditionName=0x558e0064c590 "outer_rel->rows > 0", fileName=0x558e0064c4cc "indxpath.c", lineNumber=1909) at assert.c:66 No locals. #3 0x0000558e0015194f in get_loop_count (root=0x558e01c056e0, cur_relid=3, outer_relids=0x7f8676a230e8) at indxpath.c:1909 outer_rel = 0x558e01c2cf58 rowcount = 9.4860604001519336e-322 result = 0 outer_relid = 6 #4 0x0000558e0014fe33 in build_index_paths (root=0x558e01c056e0, rel=0x558e01c201a0, index=0x558e01c2ba98, clauses=0x7ffed1004890, useful_predicate=false, scantype=ST_ANYSCAN, skip_nonnative_saop=0x7ffed10047a6, skip_lower_saop=0x7ffed10047a7) at indxpath.c:957 result = 0x0 ipath = 0x558e004dff49 <palloc+278> index_clauses = 0x7f8676a233c8 outer_relids = 0x7f8676a230e8 loop_count = 4.6475954467446791e-310 orderbyclauses = 0x0 orderbyclausecols = 0x0 index_pathkeys = 0x7ffed1004790 useful_pathkeys = 0x6491e3bad6b60900 found_lower_saop_clause = false pathkeys_possibly_useful = false index_is_ordered = false index_only_scan = false indexcol = 2 #5 0x0000558e0014f9b4 in get_index_paths (root=0x558e01c056e0, rel=0x558e01c201a0, index=0x558e01c2ba98, clauses=0x7ffed1004890, bitindexpaths=0x7ffed1004b20) at indxpath.c:728 indexpaths = 0x100000001 skip_nonnative_saop = false skip_lower_saop = false lc = 0x7ffed10047e0 #6 0x0000558e0014f83b in get_join_index_paths (root=0x558e01c056e0, rel=0x558e01c201a0, index=0x558e01c2ba98, rclauseset=0x7ffed1004c30, jclauseset=0x7ffed1004d40, eclauseset=0x7ffed1004e50, bitindexpaths=0x7ffed1004b20, relids=0x7f8676a230c8, considered_relids=0x7ffed1004ac0) at indxpath.c:665 clauseset = {nonempty = true, indexclauses = {0x7f8676a23398, 0x0 <repeats 31 times>}} indexcol = 2 #7 0x0000558e0014f3a5 in consider_index_join_outer_rels (root=0x558e01c056e0, rel=0x558e01c201a0, index=0x558e01c2ba98, rclauseset=0x7ffed1004c30, jclauseset=0x7ffed1004d40, eclauseset=0x7ffed1004e50, bitindexpaths=0x7ffed1004b20, indexjoinclauses=0x7f8676a23368, considered_clauses=1, considered_relids=0x7ffed1004ac0) at indxpath.c:579 iclause = 0x7f8676a23108 clause_relids = 0x7f8676a230c8 parent_ec = 0x7f8676a209a8 num_considered_relids = 0 lc__state = {l = 0x7f8676a23368, i = 0} lc = 0x7f8676a23380 SQL === SELECT FROM pg_catalog.pg_statio_all_tables AS ref_0, LATERAL (SELECT WHERE ref_0.schemaname = ref_0.relname) AS subq_0; Thanks to SQLSmith / SQLReduce for the find. Reference: 1. https://www.postgresql.org/message-id/flat/CAMbWs490BcsZqXF%2BfoXiQ0zdw-FffZJA6%2BvBNYtJF0vkDyJ98g%40mail.gmail.com#1a3b983d61aeb5b76ac7f4012da5f7ea - Robins Tharakan Amazon Web Services
On Thu, Feb 2, 2023 at 8:13 PM PG Bug reporting form <noreply@postgresql.org> wrote:
SQL
===
SELECT
FROM pg_catalog.pg_statio_all_tables AS ref_0,
LATERAL (SELECT
WHERE ref_0.schemaname = ref_0.relname) AS subq_0;
Thanks for the report! I can reproduce this issue with a simplified
query.
SELECT FROM
(SELECT n.nspname AS schemaname, c.relname
FROM pg_class c LEFT JOIN pg_namespace n
ON n.oid = c.relnamespace) AS ref_0,
LATERAL (SELECT WHERE ref_0.schemaname = ref_0.relname) AS subq_0;
I believe what happens here is that after we pull up the LATERAL
subquery we would have a FromExpr with only one child. Then
remove_useless_result_rtes elides it and merges its qual, which is
'n.nspname = c.relname', up to its parent. This qual is then handed to
EC machinery.
When it comes to remove_useless_joins, we try to remove the LEFT join.
We are supposed to remove any joinquals referencing rel 'n'. But here
remove_rel_from_query only checks rel->joininfo, and neglects to notice
that there is another joinqual referencing rel 'n' existing in EC. I
think this is where the problem arises.
Another finding is that in remove_useless_result_rtes after we elide the
single-child FromExpr and merge its qual up to its parent, the LEFT join
is actually can be reduced to inner join. But we fail to perform that
since all this happens after we've done outer join reduction.
Thanks
Richard
query.
SELECT FROM
(SELECT n.nspname AS schemaname, c.relname
FROM pg_class c LEFT JOIN pg_namespace n
ON n.oid = c.relnamespace) AS ref_0,
LATERAL (SELECT WHERE ref_0.schemaname = ref_0.relname) AS subq_0;
I believe what happens here is that after we pull up the LATERAL
subquery we would have a FromExpr with only one child. Then
remove_useless_result_rtes elides it and merges its qual, which is
'n.nspname = c.relname', up to its parent. This qual is then handed to
EC machinery.
When it comes to remove_useless_joins, we try to remove the LEFT join.
We are supposed to remove any joinquals referencing rel 'n'. But here
remove_rel_from_query only checks rel->joininfo, and neglects to notice
that there is another joinqual referencing rel 'n' existing in EC. I
think this is where the problem arises.
Another finding is that in remove_useless_result_rtes after we elide the
single-child FromExpr and merge its qual up to its parent, the LEFT join
is actually can be reduced to inner join. But we fail to perform that
since all this happens after we've done outer join reduction.
Thanks
Richard
Richard Guo <guofenglinux@gmail.com> writes: > I believe what happens here is that after we pull up the LATERAL > subquery we would have a FromExpr with only one child. Then > remove_useless_result_rtes elides it and merges its qual, which is > 'n.nspname = c.relname', up to its parent. This qual is then handed to > EC machinery. > When it comes to remove_useless_joins, we try to remove the LEFT join. > We are supposed to remove any joinquals referencing rel 'n'. But here > remove_rel_from_query only checks rel->joininfo, and neglects to notice > that there is another joinqual referencing rel 'n' existing in EC. I > think this is where the problem arises. Actually, the problem is that we *shouldn't* try to remove the left join, because the reference to ref_0.schemaname is logically "above" that join. I messed up the test for that in join_is_removable(). Apparently, we had no test cases that would expose such a condition, probably because such a reference would typically cause strength-reducing the left join. > Another finding is that in remove_useless_result_rtes after we elide the > single-child FromExpr and merge its qual up to its parent, the LEFT join > is actually can be reduced to inner join. But we fail to perform that > since all this happens after we've done outer join reduction. Yeah. If we *did* do that then this bug would have escaped detection awhile longer :-(. I'm not terribly concerned about the lack of such an optimization, but maybe we can look at it someday. Fix pushed, thanks for investigating! regards, tom lane