Re: parallel joins, and better parallel explain
From | Robert Haas |
---|---|
Subject | Re: parallel joins, and better parallel explain |
Date | |
Msg-id | CA+TgmoZc00iCiZLnORB69T7hmEZ5s3vbA0pc_7xFZUHLV6y6vg@mail.gmail.com |
In response to | Re: parallel joins, and better parallel explain (Dilip Kumar <dilipbalaut@gmail.com>) |
Responses |
Re: parallel joins, and better parallel explain
|
List | pgsql-hackers |
On Fri, Dec 18, 2015 at 3:54 AM, Dilip Kumar <dilipbalaut@gmail.com> wrote:
> On Fri, Dec 18, 2015 at 7.59 AM Robert Haas <robertmhaas@gmail.com> Wrote,
>> Uh oh. That's not supposed to happen. A GatherPath is supposed to
>> have parallel_safe = false, which should prevent the planner from
>> using it to form new partial paths. Is this with the latest version
>> of the patch? The plan output suggests that we're somehow reaching
>> try_partial_hashjoin_path() with inner_path being a GatherPath, but I
>> don't immediately see how that's possible, because
>> create_gather_path() sets parallel_safe to false unconditionally, and
>> hash_inner_and_outer() never sets cheapest_safe_inner to a path unless
>> that path is parallel_safe.
>
> Yes, you are right, that create_gather_path() sets parallel_safe to false
> unconditionally but whenever we are building a non partial path, that time
> we should carry forward the parallel_safe state to its parent, and it seems
> like that part is missing here..

Ah, right. Woops. I can't exactly replicate your results, but I've
attempted to fix this in a systematic way in the new version attached
here (parallel-join-v3.patch).

>> Do you have a self-contained test case that reproduces this, or any
>> insight as to how it's happening here?
>
> This is TPC-H benchmark case:
> we can setup like this..
> 1. git clone https://tkejser@bitbucket.org/tkejser/tpch-dbgen.git
> 2. compile using make
> 3. ./dbgen -v -s 5
> 4. ./qgen

Thanks. After a bit of fiddling I was able to get this to work. I'm
attaching two other patches that seem to help this case quite
considerably. The first (parallel-reader-order-v1) causes Gather to read
from the same worker repeatedly until it can't get another tuple from
that worker without blocking, and only then move on to the next worker.
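The reader-order idea described above can be sketched as a toy model. This is not the code from parallel-reader-order-v1 or anything in PostgreSQL: `WorkerQueue`, `try_recv`, `gather_sticky`, and `gather_round_robin` are hypothetical names, an empty queue stands in for "would block", and the round-robin variant is included only as an assumed point of contrast, not as a description of what master does.

```python
from collections import deque


class WorkerQueue:
    """Toy stand-in for one worker's tuple queue (hypothetical, not shm_mq)."""

    def __init__(self, tuples):
        self._ready = deque(tuples)

    def try_recv(self):
        """Non-blocking read: a tuple, or None if reading would block."""
        return self._ready.popleft() if self._ready else None


def gather_sticky(queues):
    """Keep reading one worker until it would block, then move to the next."""
    out, switches, current, idle = [], 0, 0, 0
    # Stop once every worker in a row had nothing ready (toy termination rule).
    while idle < len(queues):
        tup = queues[current].try_recv()
        if tup is not None:
            out.append(tup)
            idle = 0
        else:
            current = (current + 1) % len(queues)
            switches += 1
            idle += 1
    return out, switches


def gather_round_robin(queues):
    """Assumed contrast case: move to the next worker after every read."""
    out, switches, current, idle = [], 0, 0, 0
    while idle < len(queues):
        tup = queues[current].try_recv()
        if tup is not None:
            out.append(tup)
            idle = 0
        else:
            idle += 1
        current = (current + 1) % len(queues)
        switches += 1
    return out, switches


def make_queues():
    return [WorkerQueue(["a1", "a2", "a3"]),
            WorkerQueue(["b1", "b2"]),
            WorkerQueue(["c1"])]


sticky_out, sticky_switches = gather_sticky(make_queues())
rr_out, rr_switches = gather_round_robin(make_queues())
print(sticky_out, sticky_switches)
print(rr_out, rr_switches)
```

Both functions return the same set of tuples; only the order and the number of times the reader changes worker differ. The toy says nothing about real timings, which depend on how the workers and the leader actually interact.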
With 4 workers, this seems to be drastically more efficient than what's
currently in master - I saw the time for Q5 drop from over 17 seconds
to about 6 (this was an assert-enabled build running with EXPLAIN
ANALYZE, though, so take those numbers with a grain of salt). The
second (gather-disuse-physical-tlist.patch) causes Gather to force
underlying scan nodes to project, which is a good idea here for
reasons very similar to why it's a good idea for the existing node
types that use disuse_physical_tlist: forcing extra data through the
Gather node is bad. That shaved another half second off this query.

The exact query I was using for testing was:

explain (analyze, verbose) select n_name, sum(l_extendedprice * (1 -
l_discount)) as revenue from customer, orders, lineitem, supplier,
nation, region where c_custkey = o_custkey and l_orderkey = o_orderkey
and l_suppkey = s_suppkey and c_nationkey = s_nationkey and
s_nationkey = n_nationkey and n_regionkey = r_regionkey and r_name =
'EUROPE' and o_orderdate >= date '1995-01-01' and o_orderdate < date
'1995-01-01' + interval '1' year group by n_name order by revenue
desc;

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company