Spatial join insists on sequential scan of larger table
От | Clive Page |
---|---|
Тема | Spatial join insists on sequential scan of larger table |
Дата | |
Msg-id | Pine.LNX.4.44L0.0404021540020.2713-100000@peneca.star.le.ac.uk обсуждение исходный текст |
Ответы |
Re: Spatial join insists on sequential scan of larger table
|
Список | pgsql-performance |
I am trying to do a spatial join between two tables each of which has a column of type BOX called ERRBOX, with R-TREE indices created on both. The smaller table, xmm1, has 56,711 rows, the larger one, twomass, has 177,757,299 rows. The most efficient way to join these is to do a sequential scan of the smaller table, and an R-tree lookup on the larger. However for a simple inner join the optimiser seems to want to do the reverse, for example: EXPLAIN SELECT x.ra AS xra, x.decl AS xdecl, t.ra AS tra, t.decl AS tdecl FROM xmm1 AS x INNER JOIN twomass AS t ON x.errbox && t.errbox; QUERY PLAN ---------------------------------------------------------------------------------- Nested Loop (cost=0.00..196642756520.34 rows=49506496044 width=32) -> Seq Scan on twomass t (cost=0.00..9560002.72 rows=177023872 width=48) -> Index Scan using xmm1box on xmm1 x (cost=0.00..1107.28 rows=280 width=48) Index Cond: (x.errbox && "outer".errbox) Reversing the join condition (i.e. t.errbox && x.errbox) and similar make no difference, nor does using the old implicit join syntax. If, however, I specify an outer join such as: EXPLAIN SELECT x.ra AS xra, x.decl AS xdecl, t.ra AS tra, t.decl AS tdecl FROM xmm1 AS x LEFT OUTER JOIN twomass AS t ON x.errbox && t.errbox; QUERY PLAN ---------------------------------------------------------------------------------------- Nested Loop Left Join (cost=0.00..198945259325.90 rows=49506496044 width=32) -> Seq Scan on xmm1 x (cost=0.00..8592.32 rows=55932 width=48) -> Index Scan using tbox on twomass t (cost=0.00..3545848.88 rows=885119 width=48) Index Cond: ("outer".errbox && t.errbox) This executes, it need hardly be said, a whole lot faster. I found that I can also force a sequential scan of the smaller table by dropping its R-tree index, but I may need this in other operations, so this isn't a very satisfactory solution. It's odd that an outer join should be faster than an inner one, or to put it another way, after dropping an index there is more than an order of magnitude speed increase. I'm using Postgres 7.4.1 on Red Hat Linux. Has anyone had similar problems with spatial joins? -- Clive Page Dept of Physics & Astronomy, University of Leicester, Leicester, LE1 7RH, U.K.
В списке pgsql-performance по дате отправления: