On Thu, 27 Jul 2023 at 20:49, Ba Jinsheng <bajinsheng@u.nus.edu> wrote:
> Both query plans include different numbers of table scans, as highlighted in red color. PostgreSQL uses six table
scans,while TiDB has only three. I understand that the table scanning operation is expensive and query plans are
typicallymore efficient with fewer table scans. My question is why PostgreSQL uses six table scans to scan each table
twice?Is it a more efficient query plan, or does this indicate an optimization that is not performed by PostgreSQL?
The PostgreSQL planner does not do any scan deduplication like this.
You could likely write a query containing a WITH MATERALIZE that runs
the query with the GROUP BY, then reference the CTE in both the main
query and also the HAVING clause. e.g, something like:
explain with cte as materialized (select ps_partkey, sum(ps_supplycost
* ps_availqty) as cost from partsupp ... other joins...) select * from
cte having cost > (select sum(cost) from cte);
How much more efficient that'll be will depend on the number of distinct parts.
David