Обсуждение: outer joins strangeness
It may be just me, or I am grossly misunderstanding syntax of outer joins, but I see that plans for my queries are different depending on how I place join conditions and sometimes even on order of the tables. Basically, if I mix ANSI-syntax outer joins (a left outer join b on a.id=b.id) and "where-syntax" joins (from a,b where a.id=b.id) in the same query, things get strange. Example: 1: explain select * from customers c,orders o left outer join adsl_orders ao on ao.order_id=o.order_id where c.cust_id=o.cust_id and c.cust_id=152 Nested Loop (cost=94.23..577.47 rows=2 width=290) -> Index Scan using customers_pkey on customers c (cost=0.00..2.02 rows=1 width=125) -> Materialize (cost=501.65..501.65 rows=5904 width=165) -> Hash Join (cost=94.23..501.65 rows=5904width=165) -> Seq Scan on orders o (cost=0.00..131.04 rows=5904 width=58) -> Hash (cost=86.18..86.18 rows=3218 width=107) -> Seq Scan on adsl_orders ao (cost=0.00..86.18 rows=3218 width=107) Query 2: explain select * from customers c join orders o on c.cust_id=o.cust_id left outer join adsl_orders ao on ao.order_id=o.order_id where c.cust_id=152 Nested Loop (cost=0.00..9.30 rows=2 width=290) -> Nested Loop (cost=0.00..5.06 rows=2 width=183) -> Index Scanusing customers_pkey on customers c (cost=0.00..2.02 rows=1 width=125) -> Index Scan using orders_idx1 on orders o (cost=0.00..3.03 rows=1 width=58) -> Index Scan using adsl_orders_pkey on adsl_orders ao (cost=0.00..2.02 rows=1 width=107) To me, both queries seem exactly identical in meaning, and should generate the same plans. However, in my experience, if I use outer join anywhere in the query, I must use "JOIN" syntax to join all other tables as well, otherwise, my query plans are _extremely_ slow. any hints? Or I am grossly misunderstanding outer join symantics? -alex
On Sun, 23 Sep 2001, Alex Pilosov wrote: > It may be just me, or I am grossly misunderstanding syntax of outer joins, > but I see that plans for my queries are different depending on how I place > join conditions and sometimes even on order of the tables. > > Example: > 1: > explain select * from customers c,orders o left outer join adsl_orders ao > on ao.order_id=o.order_id > where c.cust_id=o.cust_id > and c.cust_id=152 > > > Nested Loop (cost=94.23..577.47 rows=2 width=290) > -> Index Scan using customers_pkey on customers c (cost=0.00..2.02 > rows=1 width=125) > -> Materialize (cost=501.65..501.65 rows=5904 width=165) > -> Hash Join (cost=94.23..501.65 rows=5904 width=165) > -> Seq Scan on orders o (cost=0.00..131.04 rows=5904 > width=58) > -> Hash (cost=86.18..86.18 rows=3218 width=107) > -> Seq Scan on adsl_orders ao (cost=0.00..86.18 > rows=3218 width=107) > > Query 2: > > explain select * from customers c join orders o on c.cust_id=o.cust_id > left outer join adsl_orders ao on ao.order_id=o.order_id > where c.cust_id=152 > > Nested Loop (cost=0.00..9.30 rows=2 width=290) > -> Nested Loop (cost=0.00..5.06 rows=2 width=183) > -> Index Scan using customers_pkey on customers c > (cost=0.00..2.02 rows=1 width=125) > -> Index Scan using orders_idx1 on orders o (cost=0.00..3.03 > rows=1 width=58) > -> Index Scan using adsl_orders_pkey on adsl_orders ao > (cost=0.00..2.02 rows=1 width=107) > > To me, both queries seem exactly identical in meaning, and should generate > the same plans. However, in my experience, if I use outer join anywhere in > the query, I must use "JOIN" syntax to join all other tables as well, > otherwise, my query plans are _extremely_ slow. Postgres treats join syntax as an explicit definition of what order to joins in. So, I'd guess it sees the first as: do the LOJ and then join that to the separate table. And for right outer join (for example), those two queries would not be equivalent if I read the ordering correctly. The former syntax would mean outer first and then the inner, whereas the second would be inner first then the outer, and that could have different results.
On Sun, 23 Sep 2001, Stephan Szabo wrote: > On Sun, 23 Sep 2001, Alex Pilosov wrote: > > > It may be just me, or I am grossly misunderstanding syntax of outer joins, > > but I see that plans for my queries are different depending on how I place > > join conditions and sometimes even on order of the tables. > > > > Example: > > 1: > > explain select * from customers c,orders o left outer join adsl_orders ao > > on ao.order_id=o.order_id > > where c.cust_id=o.cust_id > > and c.cust_id=152 <snip> > > > > explain select * from customers c join orders o on c.cust_id=o.cust_id > > left outer join adsl_orders ao on ao.order_id=o.order_id > > where c.cust_id=152 > Postgres treats join syntax as an explicit definition of what order to > joins in. So, I'd guess it sees the first as: do the LOJ and then join > that to the separate table. Yeah, I figure that's how it sees it, but that's pretty stupid from performance reasons :P) It _should_ realize that left outer join only constricts join order between two tables in outer join, and joins to all other tables should still be treated normally. I'm going to CC this to -hackers, maybe someone will shed a light on the internals of this. > And for right outer join (for example), those two queries would not > be equivalent if I read the ordering correctly. The former syntax > would mean outer first and then the inner, whereas the second would > be inner first then the outer, and that could have different results. True. But this is not right outer join, its a left outer join...:) Postgres should understand that left outer join does not constrict join order... -alex
On Mon, 24 Sep 2001, Alex Pilosov wrote: > On Sun, 23 Sep 2001, Stephan Szabo wrote: > > > On Sun, 23 Sep 2001, Alex Pilosov wrote: > > > > Postgres treats join syntax as an explicit definition of what order to > > joins in. So, I'd guess it sees the first as: do the LOJ and then join > > that to the separate table. > Yeah, I figure that's how it sees it, but that's pretty stupid from > performance reasons :P) > > It _should_ realize that left outer join only constricts join order > between two tables in outer join, and joins to all other tables should > still be treated normally. (see below) > > I'm going to CC this to -hackers, maybe someone will shed a light on the > internals of this. > > > And for right outer join (for example), those two queries would not > > be equivalent if I read the ordering correctly. The former syntax > > would mean outer first and then the inner, whereas the second would > > be inner first then the outer, and that could have different results. > True. But this is not right outer join, its a left outer join...:) > > Postgres should understand that left outer join does not constrict join > order... But it can. If your condition was a joining between the other table and the right side of the left outer join, you'd have the same condition as a right outer join and the left side. The real condition I think is that you can join a non-explicitly joined table to the <x> side of an <x> outer join before the outer join but not to the other side.
Alex Pilosov <alex@pilosoft.com> writes: > I'm going to CC this to -hackers, maybe someone will shed a light on the > internals of this. It's not unintentional. See http://www.ca.postgresql.org/users-lounge/docs/7.1/postgres/explicit-joins.html regards, tom lane