Re: [SQL] Subselect performance
От | Tom Lane |
---|---|
Тема | Re: [SQL] Subselect performance |
Дата | |
Msg-id | 21555.937923332@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Subselect performance (Daniel Lopez <ridruejo@atm9.com.dtu.dk>) |
Ответы |
Re: [SQL] Subselect performance
|
Список | pgsql-sql |
Daniel Lopez <ridruejo@atm9.com.dtu.dk> writes: > $list = select d from c > select b from a where b in ( $list ) > is 5 seconds > select b from a where b in (select d from c) > is 3 minutes!! (although it should be at least as fast as a)! Not necessarily. Your first example is depending on the fact that the "list" (number of values selected from c) is short. Try it with 10000 or 100000 values from c, if you want to see the backend crash ;-) The second case works OK even if the sub-select result is large, because it re-executes the sub-select for each row from a (essentially, rescanning c to see if b is matched). However, this means the runtime is proportional to the product of the number of rows in a and c. Ugh. Try rewriting your query as a join: select a.b from a, c where a.b = c.d (you might want "select distinct" here, if b can match many rows from d). If the system can't figure out anything better than a nested-loop join, then it'll probably end up taking about the same amount of time, but this form at least allows the possibility of a smarter join method. I believe we have a TODO list item to perform this sort of transformation automatically when the sub-select is of a form that allows it. We need to get the left/outer join stuff working first in order to have an exact match of the behavior. regards, tom lane
В списке pgsql-sql по дате отправления: