Effect of caching hash bucket size while costing
От | Srinivas Karthik V |
---|---|
Тема | Effect of caching hash bucket size while costing |
Дата | |
Msg-id | CAEfuzeSWU-V8WqnQ-uyAv85dgqvjCxoTOAoF46LEE+etTz2+rw@mail.gmail.com обсуждение исходный текст |
Ответы |
Re: [HACKERS] Effect of caching hash bucket size while costing
|
Список | pgsql-hackers |
Dear PostgreSQL Hackers,
I am working in PostgreSQL 9.4.* optimizer module. In costsize.c file and final_cost_hashjoin() function, the innerbucketsize is either:a) calculated using a cached copy
b) calculated afresh using statistics captured by the following code snippet:
thisbucketsize = estimate_hash_bucketsize(root, get_leftop(restrictinfo->clause),virtualbuckets);
thisbucketsize = estimate_hash_bucketsize(root, get_leftop(restrictinfo->clause),virtualbuckets);
For the query I used, if I disable the caching for calculating the innerbucketsize, I get a different plan with cost change of around 1000 units.
1) Can you please let me know if innerbucketsize*innerpathrows captures the maximum bucket size?
2) why is it not calculated afresh all the time?
2) why is it not calculated afresh all the time?
For reference, below is the query I am using:
explain select i_item_id, avg(cs_quantity) , avg(cs_list_price) , avg(cs_coupon_amt) , avg(cs_sales_price) from catalog_sales, customer_demographics, date_dim, item, promotion where cs_sold_date_sk = d_date_sk and cs_item_sk = i_item_sk and cs_bill_cdemo_sk = cd_demo_sk and cs_promo_sk = p_promo_sk and cd_gender = 'F' and cd_marital_status = 'U' and cd_education_status = 'Unknown' and (p_channel_email = 'N' or p_channel_event = 'N') and d_year = 2002 and i_current_price <= 100 group by i_item_id order by i_item_id
explain select i_item_id, avg(cs_quantity) , avg(cs_list_price) , avg(cs_coupon_amt) , avg(cs_sales_price) from catalog_sales, customer_demographics, date_dim, item, promotion where cs_sold_date_sk = d_date_sk and cs_item_sk = i_item_sk and cs_bill_cdemo_sk = cd_demo_sk and cs_promo_sk = p_promo_sk and cd_gender = 'F' and cd_marital_status = 'U' and cd_education_status = 'Unknown' and (p_channel_email = 'N' or p_channel_event = 'N') and d_year = 2002 and i_current_price <= 100 group by i_item_id order by i_item_id
and the hashclause which was tried was (item.i_item_sk = catalog_sales.cs_item_sk).
Thanks,
Srinivas Karthik
В списке pgsql-hackers по дате отправления: