Re: Getting sorted data from foreign server
| От | Robert Haas |
|---|---|
| Тема | Re: Getting sorted data from foreign server |
| Дата | |
| Msg-id | CA+TgmoYbO2TZ3JQVdrsLUXjc1YWeZEGETw3Q1cvn4GnbY81Ymw@mail.gmail.com обсуждение исходный текст |
| Ответ на | Re: Getting sorted data from foreign server (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>) |
| Ответы |
Re: Getting sorted data from foreign server
|
| Список | pgsql-hackers |
On Thu, Oct 15, 2015 at 6:28 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> Attached is the patch which takes care of above comments.
I spent some time on this patch today. But it's still not right.
I've attached a new version which fixes a serious problem with your
last version - having postgresGetForeignPaths do the costing of the
sorted path itself instead of delegating that to
estimate_path_cost_size is wrong. In your version, 10% increment gets
applied to the network transmission costs as well as the cost of
generating the tupes - but only when use_remote_estimate == false. I
fixed this and did some cosmetic cleanup.
But you'll notice if you try this some of postgres_fdw's regression
tests fail. This is rather mysterious:
***************
*** 697,715 ****
Sort
Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
Sort Key: t1.c1
! -> Nested Loop Semi Join
Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
! Join Filter: (t1.c3 = t2.c3)
-> Foreign Scan on public.ft1 t1
Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
FROM "S 1"."T 1" WHERE (("C 1" < 20))
! -> Materialize
Output: t2.c3
! -> Foreign Scan on public.ft2 t2
Output: t2.c3
! Filter: (date(t2.c4) = '01-17-1970'::date)
! Remote SQL: SELECT c3, c4 FROM "S 1"."T 1"
WHERE (("C 1" > 10))
! (15 rows)
EXECUTE st2(10, 20);
c1 | c2 | c3 | c4 | c5
| c6 | c7 | c8
--- 697,718 ----
Sort
Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
Sort Key: t1.c1
! -> Hash Join
Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
! Hash Cond: (t1.c3 = t2.c3)
-> Foreign Scan on public.ft1 t1
Output: t1.c1, t1.c2, t1.c3, t1.c4, t1.c5, t1.c6, t1.c7, t1.c8
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8
FROM "S 1"."T 1" WHERE (("C 1" < 20))
! -> Hash
Output: t2.c3
! -> HashAggregate
Output: t2.c3
! Group Key: t2.c3
! -> Foreign Scan on public.ft2 t2
! Output: t2.c3
! Filter: (date(t2.c4) = '01-17-1970'::date)
! Remote SQL: SELECT c3, c4 FROM "S 1"."T
1" WHERE (("C 1" > 10))
! (18 rows)
What I think is happening here is that the planner notices that
instead of doing a parameterized nestloop, it could pull down the data
already sorted from the remote side, cheaply unique-ify it by using
the ordering provided by the remote side, and then do a standard hash
join. That might well be a sensible approach, but the ORDER BY that
would make it correct doesn't show up in the Remote SQL. I don't know
why that's happening, but it's not good.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Вложения
В списке pgsql-hackers по дате отправления: