Обсуждение: Re: Our trial to TPC-DS but optimizer made unreasonable plan
> On Mon, Aug 17, 2015 at 6:40 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote: > > Here is one other thing I could learn from TPC-DS benchmark. > > > > The attached query is Q4 of TPC-DS, and its result was towards SF=100. > > It took long time to compete (about 30min), please see the attached > > EXPLAIN ANALYZE output. > > > Look at this: > > -> CTE Scan on year_total t_s_firstyear (cost=0.00..13120715.27 > rows=3976 width=52) (actual time=0.020..5425.980 rows=1816438 loops=1) > Filter: ((year_total > '0'::numeric) AND > (sale_type = 's'::text) AND (dyear = 2001)) > Rows Removed by Filter: 19879897 > -> CTE Scan on year_total t_s_secyear (cost=0.00..11927922.98 > rows=11928 width=164) (actual time=0.007..45.249 rows=46636 loops=1) > Filter: ((sale_type = 's'::text) AND (dyear = 2002)) > Rows Removed by Filter: 185596 > > CTE expansion shall help here as we can push the filer down. I did a > quick patch to demonstrate the idea, following Tom's proposal > (38448.1430519406@sss.pgh.pa.us). I see obvious performance boost: > > Turn off NLJ: > original: Planning time: 4.391 ms > Execution time: 77113.721 ms > patched: Planning time: 8.429 ms > Execution time: 18572.663 ms > > + work_mem to 1G > original: Planning time: 4.487 ms > Execution time: 29249.466 ms > patched: Planning time: 11.148 ms > Execution time: 7309.586 ms > > Attached please find the WIP patch and also the ANALYZE results. > Notes: the patch may not directly apply to head as some network issue > here so my Linux box can't talk to git server. > Thanks for your patch. Let me test and report the result in my environment. BTW, did you register the patch on the upcoming commit-fest? I think it may be a helpful feature, if we can add alternative subquery-path towards cte-scan on set_cte_pathlist() and choose them according to the cost estimation. Best regards, -- NEC Business Creation Division / PG-Strom Project KaiGai Kohei <kaigai@ak.jp.nec.com>
On Tue, Aug 18, 2015 at 5:59 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote: > BTW, did you register the patch on the upcoming commit-fest? > Not yet, it is in WIP status. > I think it may be a helpful feature, if we can add alternative > subquery-path towards cte-scan on set_cte_pathlist() and choose > them according to the cost estimation. > Are you suggesting that we keep both subquery-path (whenever possible) and cte-path so that optimizer can choose among them? I could imagine that if we could support "materialize cte once and use multiple times" execution, then we shall be able to benefit from keeping cte-path. But seems we still don't support this execution mode. Regards, Qingqing
On Wed, Aug 19, 2015 at 10:32 AM, Qingqing Zhou
<zhouqq.postgres@gmail.com> wrote:
> On Tue, Aug 18, 2015 at 5:59 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
>> BTW, did you register the patch on the upcoming commit-fest?
>>
> Not yet, it is in WIP status.
>
While I am working on the patch, I found some issues and resort help
here. Patch attached.
Here is an example:
postgres=# explain WITH q AS (
WITH p AS (SELECT * from a) SELECT p.* FROM p JOIN p p1 on p.i>=p1.i)
SELECT * FROM q WHERE i <= 5;
QUERY PLAN
----------------------------------------------------------------------------------
Nested Loop (cost=0.58..5980.16 rows=133333 width=8)
-> Index Scan using ai on a (cost=0.29..8.36 rows=4 width=8)
Index Cond: (i <= 5)
-> Index Only Scan using ai on a a_1 (cost=0.29..1159.62
rows=33333 width=4)
Index Cond: (i <= a.i)
(5 rows)
So far so good. But if we add other references of the CTE q (m1->m,
m->q), we still have some extra CTE scans:
postgres=# explain WITH q AS (
WITH p AS (SELECT * from a) SELECT p.* FROM p JOIN p p1 on
p.i>=p1.i), m as (select * from q), m1 as (select * from m)
SELECT * FROM m1 WHERE i <= 5;
QUERY PLAN
-----------------------------------------------------------------------------------------
CTE Scan on m (cost=158365985.66..233365985.65 rows=1111111111 width=8)
Filter: (i <= 5)
CTE q
-> Nested Loop (cost=0.29..91699319.00 rows=3333333333 width=8)
-> Seq Scan on a (cost=0.00..1443.00 rows=100000 width=8)
-> Index Only Scan using ai on a a_1 (cost=0.29..583.65
rows=33333 width=4)
Index Cond: (i <= a.i)
CTE m
-> CTE Scan on q (cost=0.00..66666666.66 rows=3333333333 width=8)
(9 rows)
Above two queries essentially the same, but the second one is a
non-optimal plan. The reason is that how my patch works: it put a
substitution in front of SS_process_ctes():
/*
* If there is a WITH list, process each WITH query and build an initplan
! * SubPlan structure for it. Before we process ctes, try to subsitute with
! * subqueries to benefits from global optimization.
*/
if (parse->cteList)
+ {
+ substitute_ctes_with_subqueries(root);
SS_process_ctes(root);
+ }
AFAICS, the substitution only handles cteList within a query block, so
it does not go across the subquery boundary. I can see this is an
issue but can't see a nice way to fix it. Anybody has some recipe?
Regards,
Qingqing
Вложения
Qingqing Zhou <zhouqq.postgres@gmail.com> writes:
> Above two queries essentially the same, but the second one is a
> non-optimal plan. The reason is that how my patch works: it put a
> substitution in front of SS_process_ctes():
> /*
> * If there is a WITH list, process each WITH query and build an initplan
> ! * SubPlan structure for it. Before we process ctes, try to subsitute with
> ! * subqueries to benefits from global optimization.
> */
> if (parse->cteList)
> + {
> + substitute_ctes_with_subqueries(root);
> SS_process_ctes(root);
> + }
> AFAICS, the substitution only handles cteList within a query block, so
> it does not go across the subquery boundary. I can see this is an
> issue but can't see a nice way to fix it. Anybody has some recipe?
It seems like you're doing this in fundamentally the wrong place.
What I had in mind in <38448.1430519406@sss.pgh.pa.us> was to convert CTEs
into plain subqueries during the prepjointree phase, either just before
or as part of the pull_up_subqueries pass (since you'd want the converted
subquery to be flattened if possible). If you do it later than that,
then you'll have to reinvent a whole bunch of wheels to provide behavior
similar to regular subquery optimization.
regards, tom lane
I wrote:
> What I had in mind in <38448.1430519406@sss.pgh.pa.us> was to convert CTEs
> into plain subqueries during the prepjointree phase, either just before
> or as part of the pull_up_subqueries pass (since you'd want the converted
> subquery to be flattened if possible).
After looking at the code a bit, IMO the most reasonable thing to do is to
include this transformation in inline_set_returning_functions(), perhaps
renaming it to something like inline_srfs_and_ctes(). You could invent
a separate function instead, but that would require an extra pass over
the rangetable, for no particular benefit that I can see; the separate
function would have to be called from *exactly* the same set of places
as inline_set_returning_functions(), anyway, or it would not work right
for multiple levels of query optimization.
regards, tom lane
> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: Thursday, August 27, 2015 9:03 AM
> To: Qingqing Zhou
> Cc: Kaigai Kouhei(海外 浩平); Greg Stark; PostgreSQL-development
> Subject: Re: [HACKERS] Our trial to TPC-DS but optimizer made unreasonable plan
>
> Qingqing Zhou <zhouqq.postgres@gmail.com> writes:
> > Above two queries essentially the same, but the second one is a
> > non-optimal plan. The reason is that how my patch works: it put a
> > substitution in front of SS_process_ctes():
>
> > /*
> > * If there is a WITH list, process each WITH query and build an initplan
> > ! * SubPlan structure for it. Before we process ctes, try to subsitute with
> > ! * subqueries to benefits from global optimization.
> > */
> > if (parse->cteList)
> > + {
> > + substitute_ctes_with_subqueries(root);
> > SS_process_ctes(root);
> > + }
>
> > AFAICS, the substitution only handles cteList within a query block, so
> > it does not go across the subquery boundary. I can see this is an
> > issue but can't see a nice way to fix it. Anybody has some recipe?
>
> It seems like you're doing this in fundamentally the wrong place.
>
> What I had in mind in <38448.1430519406@sss.pgh.pa.us> was to convert CTEs
> into plain subqueries during the prepjointree phase, either just before
> or as part of the pull_up_subqueries pass (since you'd want the converted
> subquery to be flattened if possible). If you do it later than that,
> then you'll have to reinvent a whole bunch of wheels to provide behavior
> similar to regular subquery optimization.
>
Hmm... My suggestion might not be reasonable. Sorry.
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>
On Wed, Aug 26, 2015 at 5:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> After looking at the code a bit, IMO the most reasonable thing to do is to
> include this transformation in inline_set_returning_functions(), perhaps
> renaming it to something like inline_srfs_and_ctes().
>
This is essentially the same as my current implementation (revised
patch attached):
1. There are two call sites of inline_set_returning_functions(), and
one place is guarded with Assert(subquery->cteList == NIL). This means
transformation in subquery_planner() is effective.
2. A problem with revised patch is that we can't get rid of non-used
CTEs show up in EXPLAIN.
IMHO, here the problem is not "multiple levels" but "multiple
references". "levels" is handled well by recursion but references are
not: set returning function seems does not have the this issue because
you don't define a function along the query.
Regards,
Qingqing
---
Two testing queries results with revised patch:
1. Extra CTE q and p prints in EXPLAIN:
postgres=# explain WITH q AS (
postgres(# WITH p AS (SELECT * from a) SELECT p.* FROM p JOIN p p1
on p.i>=p1.i)
postgres-# SELECT * FROM q WHERE i <= 5;
QUERY PLAN
-----------------------------------------------------------------------------------------
Nested Loop (cost=1443.59..7423.16 rows=133333 width=8)
CTE q
-> Nested Loop (cost=1443.29..91700762.00 rows=3333333333 width=8)
CTE p
-> Seq Scan on a a_2 (cost=0.00..1443.00 rows=100000 width=8)
-> Seq Scan on a a_3 (cost=0.00..1443.00 rows=100000 width=8)
-> Index Only Scan using ai on a a_4 (cost=0.29..583.65
rows=33333 width=4)
Index Cond: (i <= a_3.i)
CTE p
-> Seq Scan on a a_5 (cost=0.00..1443.00 rows=100000 width=8)
-> Index Scan using ai on a (cost=0.29..8.36 rows=4 width=8)
Index Cond: (i <= 5)
-> Index Only Scan using ai on a a_1 (cost=0.29..1159.62
rows=33333 width=4)
Index Cond: (i <= a.i)
(14 rows)
2. Extra m1 show up and same problem still there:
postgres=# explain WITH q AS (
postgres(# WITH p AS (SELECT * from a) SELECT p.* FROM p JOIN p p1 on
postgres(# p.i>=p1.i), m as (select * from q), m1 as (select * from m)
postgres-# SELECT * FROM m1 WHERE i <= 5;
QUERY PLAN
-----------------------------------------------------------------------------------------
CTE Scan on m (cost=225034095.32..300034095.31 rows=1111111111 width=8)
Filter: (i <= 5)
CTE q
-> Nested Loop (cost=1443.29..91700762.00 rows=3333333333 width=8)
CTE p
-> Seq Scan on a (cost=0.00..1443.00 rows=100000 width=8)
-> Seq Scan on a a_1 (cost=0.00..1443.00 rows=100000 width=8)
-> Index Only Scan using ai on a a_2 (cost=0.29..583.65
rows=33333 width=4)
Index Cond: (i <= a_1.i)
CTE m
-> CTE Scan on q (cost=0.00..66666666.66 rows=3333333333 width=8)
CTE m1
-> CTE Scan on m m_1 (cost=0.00..66666666.66 rows=3333333333 width=8)
(13 rows)
Вложения
On Thu, Aug 27, 2015 at 1:01 PM, Qingqing Zhou <zhouqq.postgres@gmail.com> wrote: > On Wed, Aug 26, 2015 at 5:28 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> >> After looking at the code a bit, IMO the most reasonable thing to do is to >> include this transformation in inline_set_returning_functions(), perhaps >> renaming it to something like inline_srfs_and_ctes(). >> > > This is essentially the same as my current implementation (revised > patch attached): > I tried the method as Tom suggested (attached in previous email) but still got the same issue - anybody see what I did wrong here? :-( Thanks, Qingqing