Join push-down for foreign tables
From | Shigeru Hanada |
---|---|
Subject | Join push-down for foreign tables |
Date | |
Msg-id | 4E5BA831.5050600@gmail.com |
Responses | Re: Join push-down for foreign tables |
List | pgsql-hackers |
Hi all,

I'd like to develop join push-down for foreign tables that are on the same foreign server, to improve the performance of joining foreign tables by reducing data transfer. This would need many changes in several parts of PG, such as the planner, the executor and the FDW API, so please let me describe my idea first. Some of the points below are taken from the discussions about the FDW API from late 2010 to early 2011.

Basics of planning foreign join
===============================

In the current implementation, the planner generates paths for every candidate join combination and chooses the cheapest one for actual execution, but a join can't be pushed down to the foreign server even if both the inner and outer sides of the join are on the same foreign server.

To follow the mechanism of cost-based planning, pushing down a join between foreign tables would need a new path node, say ForeignJoinPath, to represent a candidate pushed-down join. Without this kind of node, every join node type would need to check the relkind of its children recursively and switch its behavior accordingly.

A ForeignJoinPath can be used to join any combination of ForeignScanPath and/or ForeignJoinPath nodes. This rule can be applied recursively. The new ForeignJoinPath node would not have any sort key, at least in the first version, because collation would make the issue too complex.

Reuse ForeignScan vs new ForeignJoin
====================================

For symmetry, a ForeignJoin plan node should be added and used to represent a foreign scan that includes a join between foreign tables. But I'm not sure that adding a new planner node is the better choice. Should we enhance ForeignScan to represent this kind of plan instead?

Cost estimation
===============

The costs of a ForeignJoinPath are estimated by the FDW via a new routine, PlanForeignJoin, and an SQL-based FDW would need to generate the remote SQL here. If an FDW can't push down the join, it can set the cost to disable_cost (1.0e10) to tell the planner not to choose that path.
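As a rough illustration of the cost-based selection described above, here is a minimal standalone sketch in C. It is not planner code: the types `PathKind` and `JoinPath`, the functions `plan_foreign_join_cost`, `build_join_paths` and `cheapest_path`, and the integer server ids are all hypothetical names invented for this example. Only the path names and the disable_cost value of 1.0e10 come from the proposal. The sketch assumes a ForeignJoinPath candidate is added only when both sides are on the same server, and that an FDW that cannot push the join down reports disable_cost.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Value quoted in the proposal for disable_cost. */
#define DISABLE_COST 1.0e10

/* Hypothetical path kinds; only the names mirror the proposal. */
typedef enum { NEST_PATH, MERGE_PATH, HASH_PATH, FOREIGN_JOIN_PATH } PathKind;

/* Hypothetical, heavily simplified join path: a kind and a total cost. */
typedef struct {
    PathKind kind;
    double   total_cost;
} JoinPath;

/*
 * Hypothetical stand-in for the FDW's PlanForeignJoin cost estimate:
 * return the remote cost if the join can be pushed down, otherwise
 * return DISABLE_COST so the path is never the cheapest in practice.
 */
static double
plan_foreign_join_cost(bool can_push_down, double remote_cost)
{
    return can_push_down ? remote_cost : DISABLE_COST;
}

/*
 * Fill `out` with candidate paths for one join and return how many were
 * added. The three local join paths are always added; a foreign join
 * path is added only when both sides live on the same foreign server.
 */
static size_t
build_join_paths(JoinPath out[4], const double local_costs[3],
                 int outer_server, int inner_server,
                 bool can_push_down, double remote_cost)
{
    size_t n = 0;

    out[n++] = (JoinPath){NEST_PATH, local_costs[0]};
    out[n++] = (JoinPath){MERGE_PATH, local_costs[1]};
    out[n++] = (JoinPath){HASH_PATH, local_costs[2]};

    if (outer_server == inner_server)
        out[n++] = (JoinPath){FOREIGN_JOIN_PATH,
                              plan_foreign_join_cost(can_push_down,
                                                     remote_cost)};
    return n;
}

/* Return the candidate with the lowest total cost (first one wins ties). */
static const JoinPath *
cheapest_path(const JoinPath *paths, size_t n)
{
    assert(n > 0);

    const JoinPath *best = &paths[0];
    for (size_t i = 1; i < n; i++)
        if (paths[i].total_cost < best->total_cost)
            best = &paths[i];
    return best;
}
```

With made-up local costs of 500, 300 and 200 and a remote cost of 50, the foreign join wins when both tables are on the same server and the FDW can push the join down. If the FDW reports disable_cost, or the tables are on different servers, the cheapest local join is chosen instead.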
Typically, the planner would generate NestPath, MergePath, HashPath and ForeignJoinPath for a pair of joined foreign tables if they are on the same foreign server. If they are on different servers, ForeignJoinPath would not be generated.

In this design, the cost of a ForeignJoinPath is compared with the costs of the other join paths, such as NestPath and MergePath. If ForeignJoinPath is the cheapest of the join candidates, the planner will generate a ForeignJoin plan node and put it into the plan tree as a leaf node. In other words, the joined foreign tables are merged into the upper ForeignJoin node.

Any comments are welcome.

Regards,
--
Shigeru Hanada