Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500
От | Richard Guo |
---|---|
Тема | Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500 |
Дата | |
Msg-id | CAMbWs48Z_PvveOeankGA69=V=fqogusRKGVPBUGfYC0zDo1WPA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500 (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500
|
Список | pgsql-hackers |
On Fri, Nov 17, 2023 at 2:16 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
So you could argue that there's more to do here, but I'm hesitant
to go further. Part of the point of MATERIALIZED is to be an
optimization fence, so breaking down that fence is something to be
wary of. Maybe we shouldn't even take this patch --- but on
balance I think it's an OK compromise.
Agreed. I think the patch is still valuable on its own, although it
does not go down into MATERIALIZED case for further optimization. Maybe
we can take another query as regression test to prove its value, in
which the CTE is not inlined without MATERIALIZED, such as
explain (costs off)
with x as (select unique1, unique2 from tenk1 b)
select count(*) from tenk1 a
where unique1 in (select unique1 from x x1) and
unique1 in (select unique2 from x x2);
QUERY PLAN
------------------------------------------------------------------
Aggregate
CTE x
-> Seq Scan on tenk1 b
-> Hash Join
Hash Cond: (a.unique1 = x2.unique2)
-> Nested Loop
-> HashAggregate
Group Key: x1.unique1
-> CTE Scan on x x1
-> Index Only Scan using tenk1_unique1 on tenk1 a
Index Cond: (unique1 = x1.unique1)
-> Hash
-> HashAggregate
Group Key: x2.unique2
-> CTE Scan on x x2
(15 rows)
vs
explain (costs off)
with x as (select unique1, unique2 from tenk1 b)
select count(*) from tenk1 a
where unique1 in (select unique1 from x x1) and
unique1 in (select unique2 from x x2);
QUERY PLAN
------------------------------------------------------------------
Aggregate
CTE x
-> Seq Scan on tenk1 b
-> Hash Semi Join
Hash Cond: (a.unique1 = x2.unique2)
-> Hash Semi Join
Hash Cond: (a.unique1 = x1.unique1)
-> Index Only Scan using tenk1_unique1 on tenk1 a
-> Hash
-> CTE Scan on x x1
-> Hash
-> CTE Scan on x x2
(12 rows)
I believe the second plan is faster in reality too.
Thanks
Richard
does not go down into MATERIALIZED case for further optimization. Maybe
we can take another query as regression test to prove its value, in
which the CTE is not inlined without MATERIALIZED, such as
explain (costs off)
with x as (select unique1, unique2 from tenk1 b)
select count(*) from tenk1 a
where unique1 in (select unique1 from x x1) and
unique1 in (select unique2 from x x2);
QUERY PLAN
------------------------------------------------------------------
Aggregate
CTE x
-> Seq Scan on tenk1 b
-> Hash Join
Hash Cond: (a.unique1 = x2.unique2)
-> Nested Loop
-> HashAggregate
Group Key: x1.unique1
-> CTE Scan on x x1
-> Index Only Scan using tenk1_unique1 on tenk1 a
Index Cond: (unique1 = x1.unique1)
-> Hash
-> HashAggregate
Group Key: x2.unique2
-> CTE Scan on x x2
(15 rows)
vs
explain (costs off)
with x as (select unique1, unique2 from tenk1 b)
select count(*) from tenk1 a
where unique1 in (select unique1 from x x1) and
unique1 in (select unique2 from x x2);
QUERY PLAN
------------------------------------------------------------------
Aggregate
CTE x
-> Seq Scan on tenk1 b
-> Hash Semi Join
Hash Cond: (a.unique1 = x2.unique2)
-> Hash Semi Join
Hash Cond: (a.unique1 = x1.unique1)
-> Index Only Scan using tenk1_unique1 on tenk1 a
-> Hash
-> CTE Scan on x x1
-> Hash
-> CTE Scan on x x2
(12 rows)
I believe the second plan is faster in reality too.
Thanks
Richard
В списке pgsql-hackers по дате отправления: