Re: parallel joins, and better parallel explain
От | Amit Kapila |
---|---|
Тема | Re: parallel joins, and better parallel explain |
Дата | |
Msg-id | CAA4eK1LDnmGbjrkijCCR-w1x20MNGwZ2WyQviK-XSRDuPonceQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: parallel joins, and better parallel explain (Dilip Kumar <dilipbalaut@gmail.com>) |
Ответы |
Re: parallel joins, and better parallel explain
Re: parallel joins, and better parallel explain |
Список | pgsql-hackers |
On Wed, Dec 16, 2015 at 9:55 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
With Regards,
Amit Kapila.
On Wed, Dec 16, 2015 at 6:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>On Tue, Dec 15, 2015 at 7:31 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> On Mon, Dec 14, 2015 at 8:38 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> In any case,>I have done some more investigation of the patch and found that even>without changing query planner related parameters, it seems to give>bad plans (as in example below [1]). I think here the costing of rework each
I have done some more testing using TPC-H benchmark (For some of the queries, specially for Parallel Hash Join), and Results summary is as below.
Planning Time(ms) Query Base Patch TPC-H Q2 2.2 2.4 TPCH- Q3 0.67 0.71 TPCH- Q5 3.17 2.3 TPCH- Q7 2.43 2.4 Execution Time(ms) Query Base Patch TPC-H Q2 2826 766 TPCH- Q3 23473 24271 TPCH- Q5 21357 1432 TPCH- Q7 6779 1138 All Test files and Detail plan output is attached in mailq2.sql, q3.sql, q.5.sql ans q7.sql are TPCH benchmark' 2nd, 3rd, 5th and 7th queryand Results with base and Parallel join are attached in q*_base.out and q*_parallel.out respectively.Summary: With TPC-H queries where ever Hash Join is pushed under gather Node, significant improvement is visible,with Q2, using 3 workers, time consumed is almost 1/3 of the base.
I Observed one problem, with Q5 and Q7, there some relation and snapshot references are leaked and i am getting below warning, havn't yet looked into the issue.
While looking at plans of Q5 and Q7, I have observed that Gather is
pushed below another Gather node for which we don't have appropriate
way of dealing. I think that could be the reason why you are seeing
the errors.
Also, I think it would be good if you can once check the plan/execution
time with max_parallel_degree=0 as that can give us base reference
data without parallelism, also I am wondering if have you have changed
any other parallel cost related parameter?
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
В списке pgsql-hackers по дате отправления: