Re: [HACKERS] Effect of caching hash bucket size while costing
От | Robert Haas |
---|---|
Тема | Re: [HACKERS] Effect of caching hash bucket size while costing |
Дата | |
Msg-id | CA+TgmoZkOZqzyDQ8dKmEAKMf5RDHoN2Rnu73wbauM4p_93Oh=Q@mail.gmail.com обсуждение исходный текст |
Ответ на | Effect of caching hash bucket size while costing (Srinivas Karthik V <skarthikv.iitb@gmail.com>) |
Ответы |
Re: [HACKERS] Effect of caching hash bucket size while costing
|
Список | pgsql-hackers |
On Thu, Dec 8, 2016 at 3:53 AM, Srinivas Karthik V <skarthikv.iitb@gmail.com> wrote: > Dear PostgreSQL Hackers, > > I am working in PostgreSQL 9.4.* optimizer module. In costsize.c file and > final_cost_hashjoin() function, the innerbucketsize is either: > > a) calculated using a cached copy > OR > b) calculated afresh using statistics captured by the following code > snippet: > thisbucketsize = estimate_hash_bucketsize(root, > get_leftop(restrictinfo->clause),virtualbuckets); > > For the query I used, if I disable the caching for calculating the > innerbucketsize, I get a different plan with cost change of around 1000 > units. > > 1) Can you please let me know if innerbucketsize*innerpathrows captures the > maximum bucket size? > 2) why is it not calculated afresh all the time? Well, #2 is answered there right in the comments: * Since we tend to visit the same clauses over and over when * planning a large query, we cache the bucketsize estimate in the * RestrictInfo node to avoid repeated lookups of statistics. I assume the person who wrote the comment thought that the answer wouldn't change from one call to the next, and therefore it was safe to cache. I don't know why that isn't the case for you. As to question #1, there's a comment for that, too, a little further down: * The number of tuple comparisons needed is the number of outer * tuples times the typicalnumber of tuples in a hash bucket, which * is the inner relation size times its bucketsize fraction. At each * one, we need to evaluate the hashjoin quals. But actually, So innerbucketsize*innerpathrows represents the expected number of comparisons that we expect to need to perform per hash probe. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
В списке pgsql-hackers по дате отправления: