Обсуждение: WHERE clause OR vs IN
Hi,
I know this is not exactly admin related, but ... it is simple enough to be even fun
From a performance point of view, is it better to use OR as in
SELECT expr FROM expr WHERE col=this OR col=that OR col=theOther
Or to use a range of values as in
SELECT expr FROM expr WHERE col in ( val1, val2, ...., valn)
I think the IN range yields a better query plan .... what do you think ?
Thanks
Medi
I know this is not exactly admin related, but ... it is simple enough to be even fun
From a performance point of view, is it better to use OR as in
SELECT expr FROM expr WHERE col=this OR col=that OR col=theOther
Or to use a range of values as in
SELECT expr FROM expr WHERE col in ( val1, val2, ...., valn)
I think the IN range yields a better query plan .... what do you think ?
Thanks
Medi
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Wed, 12 Dec 2007 14:25:16 -0800 "Medi Montaseri" <montaseri@gmail.com> wrote: > Hi, > > I know this is not exactly admin related, but ... it is simple enough > to be even fun > > From a performance point of view, is it better to use OR as in > SELECT expr FROM expr WHERE col=this OR col=that OR col=theOther > Or to use a range of values as in > SELECT expr FROM expr WHERE col in ( val1, val2, ...., valn) > > I think the IN range yields a better query plan .... what do you > think ? Thanks Well a simple test: postgres=# explain analyze select * from tellers where bid in ('1','2'); QUERY PLAN - ------------------------------------------------------------------------------------------------------ Seq Scan on tellers (cost=0.00..10.25 rows=20 width=352) (actual time=0.019..0.192 rows=20 loops=1) Filter: (bid = ANY ('{1,2}'::integer[])) Total runtime: 0.314 ms (3 rows) postgres=# explain analyze select * from tellers where bid = '1' or bid = '2'; QUERY PLAN - ------------------------------------------------------------------------------------------------------ Seq Scan on tellers (cost=0.00..11.50 rows=20 width=352) (actual time=0.018..0.199 rows=20 loops=1) Filter: ((bid = 1) OR (bid = 2)) Total runtime: 0.275 ms (3 rows) postgres=# > Medi - -- The PostgreSQL Company: Since 1997, http://www.commandprompt.com/ Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240 Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD' -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.6 (GNU/Linux) iD8DBQFHYGJwATb/zqfZUUQRAs32AJ9HuJVcY5gcr0hboxkI6PcRtv++JwCfXd00 nQ7Frkof0mVwqNYVxQ9Vziw= =XzJi -----END PGP SIGNATURE-----
Thanks...is'nt the run time latency more visiable with lorge input sets (big tables) as well as how long the OR-ed expression chain is
based on your report, run time of OR is 0.275 ms and IN is 0.314
Perhaps if we run explain verbose to see the actual query plan
medi
based on your report, run time of OR is 0.275 ms and IN is 0.314
Perhaps if we run explain verbose to see the actual query plan
medi
On Dec 12, 2007 2:36 PM, Joshua D. Drake <jd@commandprompt.com> wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1Well a simple test:
On Wed, 12 Dec 2007 14:25:16 -0800
"Medi Montaseri" <montaseri@gmail.com > wrote:
> Hi,
>
> I know this is not exactly admin related, but ... it is simple enough
> to be even fun
>
> From a performance point of view, is it better to use OR as in
> SELECT expr FROM expr WHERE col=this OR col=that OR col=theOther
> Or to use a range of values as in
> SELECT expr FROM expr WHERE col in ( val1, val2, ...., valn)
>
> I think the IN range yields a better query plan .... what do you
> think ? Thanks
postgres=# explain analyze select * from tellers where bid in ('1','2');
QUERY
PLAN
- ------------------------------------------------------------------------------------------------------
Seq Scan on tellers (cost=0.00..10.25 rows=20 width=352) (actual
time=0.019..0.192 rows=20 loops=1) Filter: (bid = ANY
('{1,2}'::integer[])) Total runtime: 0.314 ms (3 rows)
postgres=# explain analyze select * from tellers where bid = '1' or bid
= '2'; QUERY PLAN
- ------------------------------------------------------------------------------------------------------
Seq Scan on tellers (cost=0.00..11.50 rows=20 width=352) (actual
time=0.018..0.199 rows=20 loops=1) Filter: ((bid = 1) OR (bid = 2))
Total runtime: 0.275 ms
(3 rows)
postgres=#
> Medi
- --
The PostgreSQL Company: Since 1997, http://www.commandprompt.com/
Sales/Support: +1.503.667.4564 24x7/Emergency: +1.800.492.2240
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
SELECT 'Training', 'Consulting' FROM vendor WHERE name = 'CMD'
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
iD8DBQFHYGJwATb/zqfZUUQRAs32AJ9HuJVcY5gcr0hboxkI6PcRtv++JwCfXd00
nQ7Frkof0mVwqNYVxQ9Vziw=
=XzJi
-----END PGP SIGNATURE-----
--- On Wed, 12/12/07, Medi Montaseri <montaseri@gmail.com> wrote: > based on your report, run time of OR is 0.275 ms and IN is > 0.314 > > > postgres=# explain analyze select * from tellers where > bid in ('1','2'); Two other options are: SELECT * FROM Tellers WHERE bin = ANY( '1', '2' ); and SELECT T.* FROM Tellers AS T INNER JOIN ( VALUES ( '1' ), ( '2' ) ) AS B( bin ) ON T.bin = B.bin; Regards, Richard Broersma Jr.
Richard Broersma Jr <rabroersma@yahoo.com> writes: > --- On Wed, 12/12/07, Medi Montaseri <montaseri@gmail.com> wrote: >> based on your report, run time of OR is 0.275 ms and IN is >> 0.314 >> > postgres=# explain analyze select * from tellers where >> bid in ('1','2'); > Two other options are: > SELECT * > FROM Tellers > WHERE bin = ANY( '1', '2' ); Note that depending on which PG version you are testing, x IN (a,b,c) is exactly equivalent to x=a OR x=b OR x=c (older versions), or to x = ANY(ARRAYa,b,c]) (newer versions). > SELECT T.* > FROM Tellers AS T > INNER JOIN ( VALUES ( '1' ), ( '2' ) ) AS B( bin ) > ON T.bin = B.bin; I seriously doubt that one's gonna win ... regards, tom lane
--- On Wed, 12/12/07, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I seriously doubt that one's gonna win ... :-)
Thanks everyone, and I know I am taking too much bandwidth, but...
The reason I was thinking IN would work better is that from a parser point of view confronted with a series of expressions chained via boolean operators such as
expr OR expr OR expr
The back end code generated has to be generic to combat with different type of expression such as equality, greater than, less than, or any fancy expressions. For example:
a == b OR c != d AND e > f OR g == h
Where as in the case of
a IN (1, 2, 3, 4)
is actually a subset of the above composite expression because
1- the expressions are always equality of operand a with a literal value which means the operand can be cached (register)
2- the composite expression is always an OR chained expression where the first TRUE-ness would return the composite as TRUE (aka short circuit behavior)
I could be wrong...I have been wrong before...
Medi
The reason I was thinking IN would work better is that from a parser point of view confronted with a series of expressions chained via boolean operators such as
expr OR expr OR expr
The back end code generated has to be generic to combat with different type of expression such as equality, greater than, less than, or any fancy expressions. For example:
a == b OR c != d AND e > f OR g == h
Where as in the case of
a IN (1, 2, 3, 4)
is actually a subset of the above composite expression because
1- the expressions are always equality of operand a with a literal value which means the operand can be cached (register)
2- the composite expression is always an OR chained expression where the first TRUE-ness would return the composite as TRUE (aka short circuit behavior)
I could be wrong...I have been wrong before...
Medi
On Dec 12, 2007 5:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Richard Broersma Jr <rabroersma@yahoo.com> writes:Note that depending on which PG version you are testing, x IN (a,b,c)
> --- On Wed, 12/12/07, Medi Montaseri <montaseri@gmail.com > wrote:
>> based on your report, run time of OR is 0.275 ms and IN is
>> 0.314
>>
> postgres=# explain analyze select * from tellers where
>> bid in ('1','2');
> Two other options are:
> SELECT *
> FROM Tellers
> WHERE bin = ANY( '1', '2' );
is exactly equivalent to x=a OR x=b OR x=c (older versions), or to
x = ANY(ARRAYa,b,c]) (newer versions).I seriously doubt that one's gonna win ...
> SELECT T.*
> FROM Tellers AS T
> INNER JOIN ( VALUES ( '1' ), ( '2' ) ) AS B( bin )
> ON T.bin = B.bin;
regards, tom lane