Re: parallel joins, and better parallel explain

Поиск

Список

Период

Сортировка

От	Amit Kapila
Тема	Re: parallel joins, and better parallel explain
Дата	17 декабря 2015 г. 05:33:33
Msg-id	CAA4eK1LDnmGbjrkijCCR-w1x20MNGwZ2WyQviK-XSRDuPonceQ@mail.gmail.com обсуждение исходный текст
Ответ на	Re: parallel joins, and better parallel explain (Dilip Kumar <dilipbalaut@gmail.com>)
Ответы	Re: parallel joins, and better parallel explain Re: parallel joins, and better parallel explain
Список	pgsql-hackers

Дерево обсуждения

On Wed, Dec 16, 2015 at 9:55 PM, Dilip Kumar <dilipbalaut@gmail.com> wrote:

On Wed, Dec 16, 2015 at 6:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:

>On Tue, Dec 15, 2015 at 7:31 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> On Mon, Dec 14, 2015 at 8:38 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

> In any case,
>I have done some more investigation of the patch and found that even
>without changing query planner related parameters, it seems to give
>bad plans (as in example below [1]). I think here the costing of rework each

I have done some more testing using TPC-H benchmark (For some of the queries, specially for Parallel Hash Join), and Results summary is as below.

Planning Time(ms)
Query Base Patch
TPC-H Q2 2.2 2.4
TPCH- Q3 0.67 0.71
TPCH- Q5 3.17 2.3
TPCH- Q7 2.43 2.4

Execution Time(ms)
Query Base Patch
TPC-H Q2 2826 766
TPCH- Q3 23473 24271
TPCH- Q5 21357 1432
TPCH- Q7 6779 1138

All Test files and Detail plan output is attached in mail
q2.sql, q3.sql, q.5.sql ans q7.sql are TPCH benchmark' 2nd, 3rd, 5th and 7th query
and Results with base and Parallel join are attached in q*_base.out and q*_parallel.out respectively.

Summary: With TPC-H queries where ever Hash Join is pushed under gather Node, significant improvement is visible,
with Q2, using 3 workers, time consumed is almost 1/3 of the base.

I Observed one problem, with Q5 and Q7, there some relation and snapshot references are leaked and i am getting below warning, havn't yet looked into the issue.

While looking at plans of Q5 and Q7, I have observed that Gather is

pushed below another Gather node for which we don't have appropriate

way of dealing. I think that could be the reason why you are seeing

the errors.

Also, I think it would be good if you can once check the plan/execution

time with max_parallel_degree=0 as that can give us base reference

data without parallelism, also I am wondering if have you have changed

any other parallel cost related parameter?

With Regards,
Amit Kapila.

EnterpriseDB: http://www.enterprisedb.com

В списке pgsql-hackers по дате отправления:

	Planning Time(ms)
Query	Base	Patch
TPC-H Q2	2.2	2.4
TPCH- Q3	0.67	0.71
TPCH- Q5	3.17	2.3
TPCH- Q7	2.43	2.4

	Execution Time(ms)
Query	Base	Patch
TPC-H Q2	2826	766
TPCH- Q3	23473	24271
TPCH- Q5	21357	1432
TPCH- Q7	6779	1138

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: parallel joins, and better parallel explain