Re: BUG #18365: Inconsistent cost function between materialized and non-materialized CTE
| From | David Rowley |
|---|---|
| Subject | Re: BUG #18365: Inconsistent cost function between materialized and non-materialized CTE |
| Date | |
| Msg-id | CAApHDvqBTtdtE_w__HeOvzen6UqKUVzcNpTm74_ioJOfdOJ50A@mail.gmail.com |
| In reply to | BUG #18365: Inconsistent cost function between materialized and non-materialized CTE (PG Bug reporting form <noreply@postgresql.org>) |
| List | pgsql-bugs |

On Tue, 27 Feb 2024 at 02:44, PG Bug reporting form <noreply@postgresql.org> wrote:
> I have run into an issue where a query with a CTE performs a sequential scan
> on a large table (42M rows, ~1min on our prod), while explicitly
> materializing the CTE simply performs an index only scan (~2s). When I set
> `enable_seqscan=off` and compare the costs, it turns out the query planner
> grossly overestimates the cost of a Nested Loop compared to the (as far as I
> can tell) exact same Nested Loop when the CTE is materialized. I know that
> the query planner acts on heuristics, so this might not be considered a bug,
> but the cost values are so wildly different for what should be essentially
> the same operation, that it might warrant further investigation.
> Is this a bug?

I don't see any bugs. What seems to be going on is that the planner underestimates the number of rows left after the materialized CTE is made distinct (which is done so that it can use a plain join rather than a semi-join). That's seen in:

    -> HashAggregate (cost=2576.28..2679.33 rows=10305 width=16) (actual time=765.793..893.761 rows=166060 loops=1)

Due to that row underestimation, the planner thinks a parameterized nested loop is the best way to join the two relations, as it thinks the stock_history_date_product_id_idx index only needs to be looked up 10305 times rather than 166060 times.

With the non-materialized version, the planner estimates 103051 rows. That causes it to think the Nested Loop -> index scan on stock_history_date_product_id_idx is too expensive and that a hash join with a seq scan is cheaper.

As it turns out, looking up the index *is* faster, even when that's done 166060 times. The two parameters that drive the planner's decision on this are random_page_cost (you might want to consider lowering that) and effective_cache_size. A rough guideline for effective_cache_size is 75% of the machine's RAM; however, it depends a lot on your shared_buffers setting and what else runs concurrently on the machine. Some people have found that lowering random_page_cost to as low as 1.1 helps. The default is 4.0, which has remained since the HDD days; for SSDs, it's often too large.

I've attached the EXPLAINs I trimmed down and compared to reach this conclusion.

David
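
[Editor's note] A minimal sketch of the tuning described above, in case it helps to follow along; the 24GB value is only an illustrative guess for a machine with roughly 32GB of RAM and is not taken from the thread:

```sql
-- Inspect the two planner settings mentioned above.
SHOW random_page_cost;        -- default 4.0, a figure from the HDD era
SHOW effective_cache_size;    -- default 4GB; ~75% of RAM is the rough guideline

-- Session-only trial: nothing is persisted, so it's safe to re-run the
-- EXPLAIN (ANALYZE) comparison with and without this change.
SET random_page_cost = 1.1;

-- Persist only once the plans look right (takes effect after a reload).
ALTER SYSTEM SET random_page_cost = 1.1;
ALTER SYSTEM SET effective_cache_size = '24GB';  -- illustrative value only
SELECT pg_reload_conf();
```
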
Attachments