Subplan result caching
От | Heikki Linnakangas |
---|---|
Тема | Subplan result caching |
Дата | |
Msg-id | daceb327-9a20-51f4-fe6c-60b898692305@iki.fi обсуждение исходный текст |
Ответы |
Re: Subplan result caching
Re: Subplan result caching Re: Subplan result caching Re: Subplan result caching Re: Subplan result caching |
Список | pgsql-hackers |
Hi, I've been working on a patch to add a little cache to SubPlans, to speed up queries with correlated subqueries, where the same subquery is currently executed multiple times with the same parameters. The idea is to cache the result of the subplan, with the correlation vars as the cache key. That helps a lot, if you happen to have that kind of a query. I bumped into this while looking at TPC-DS query 6: select a.ca_state state, count(*) cnt from customer_address a ,customer c ,store_sales s ,date_dim d ,item i where a.ca_address_sk = c.c_current_addr_sk and c.c_customer_sk = s.ss_customer_sk and s.ss_sold_date_sk = d.d_date_sk and s.ss_item_sk = i.i_item_sk and d.d_month_seq = (select distinct (d_month_seq) from date_dim where d_year = 2000 and d_moy = 5 ) and i.i_current_price > 1.2 * (select avg(j.i_current_price) from item j where j.i_category = i.i_category) group by a.ca_state having count(*) >= 10 order by cnt The first subquery is uncorrelated, and is already handled efficiently as an InitPlan. This patch helps with the second subquery. There are only 11 different categories, but we currently re-execute it for every row of the outer query, over 26000 times. (I think I have about 1 GB of data in my little database I've been testing with, I'm not sure how this would scale with the amount of data.) With this patch, it's only executed 11 times, the cache avoids the rest of the executions. That brings the runtime, on my laptop, from about 30 s to 120 ms. For this particular query, I actually wish we could pull up that subquery, instead. I did some investigation into that last summer, https://www.postgresql.org/message-id/67e353e8-c20e-7980-a282-538779edf25b%40iki.fi, but that's a much bigger project. In any case, even if the planner was able to pull up subqueries in more cases, a cache like this would still be helpful for those cases where pulling up was still not possible. Thoughts? - Heikki
Вложения
В списке pgsql-hackers по дате отправления: