Re: EXISTS vs IN vs OUTER JOINS
От | Josh Berkus |
---|---|
Тема | Re: EXISTS vs IN vs OUTER JOINS |
Дата | |
Msg-id | web-2291801@davinci.ethosmedia.com обсуждение исходный текст |
Ответ на | EXISTS vs IN vs OUTER JOINS (Tomasz Myrta <jasiek@klaster.net>) |
Ответы |
Re: EXISTS vs IN vs OUTER JOINS
Re: EXISTS vs IN vs OUTER JOINS |
Список | pgsql-performance |
Tomasz, > Few days ago I read, that EXISTS is better than IN, but only if there > are many records (how many?). I was wondering which one is better and > when. Did anyone try to compare these queries doing the same work: > > - select * from some_table t > where t.id [not] in (select id from filter); > -select * from some_table t > where [not] exists (select * from filter where id=t.id); The rule I use is: if I expect the sub-select to return more than 12 records 20% or more of the time, use EXISTS. The speed gain for IN on small lists is not as dramatic as the speed loss for EXISTS on large lists. More importantly, the difference between NOT IN and NOT EXISTS can be as much as 20:1 on large sub-selects, as opposed to IN and EXISTS, where I have rarely seen a difference of more than 3:1. As I understand it, this is because NOT EXISTS can use optimized join algorithms to locate matching rows, whereas NOT IN must compare each row against every possible matching value in the subselect. It also makes a difference whether or not the referenced field(s) in the subselect is indexed. EXISTS will often use an index to compare the values in the master query to the sub-query. As far as I know, IN can use an index to retrieve the subquery values, but not to sort or compare them after they have been retreived into memory. > -select * from some_table t > left join filter f using (id) > where f.id is [not] null; This will not get you the same result as the above. It will get you all rows from t+f where a record is present in f and has (or does not have) a NULL value for f.id. While this type of query works in MS Access, it will not work in SQL92/99-commpliant databases. Incidentally, the dramatic differences between IN and EXISTS are not only a "PostgreSQL Thing". The same rules apply to MS SQL Server and SQL Anywhere, for the same reasons. -Josh Berkus
В списке pgsql-performance по дате отправления: