Re: which is better: using OR clauses or UNION?
От | Tom Lane |
---|---|
Тема | Re: which is better: using OR clauses or UNION? |
Дата | |
Msg-id | 26139.1313503748@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | which is better: using OR clauses or UNION? (adam_pgsql <adam_pgsql@witneyweb.org>) |
Ответы |
Re: which is better: using OR clauses or UNION?
|
Список | pgsql-sql |
adam_pgsql <adam_pgsql@witneyweb.org> writes: > I have a query hitting a table of 25 million rows. The table has a > text field ('identifier') which i need to query for matching rows. The > question is if i have multiple strings to match against this field I > can use multiple OR sub-statements or multiple statements in a > UNION. The UNION seems to run quicker.... is this to be expected? Your test cases don't seem exactly comparable; in particular I think the second one is benefiting from the first one having already read and cached the relevant disk blocks. Notice how you've got, eg, > -> Bitmap Index Scan on in_dba_data_base_identifier (cost=0.00..32.64 rows=964 width=0) (actual time=71.347..71.347rows=318 loops=1) > Index Cond: (lower(identifier) ~=~ 'sptigr4-2210 (6f24)'::character varying) versus > -> Bitmap Index Scan on in_dba_data_base_identifier (cost=0.00..32.64 rows=964 width=0) (actualtime=0.178..0.178 rows=318 loops=1) > Index Cond: (lower(identifier) ~=~ 'sptigr4-2210 (6f24)'::character varying) Those are the exact same subplan, so any honest comparison should be finding them to take the same amount of time. When the actual readings are different by a factor of several hundred, there's something wrong with your measurement process. In the end this comes down to whether duplicates will be eliminated more efficiently by a BitmapOr step or by sort/uniq on the resulting rows. I'd have to bet on the BitmapOr myself, but it's likely that this is down in the noise compared to the actual disk accesses in any not-fully-cached scenario. Also, if you don't expect the sub-statements to yield any duplicates, or don't care about seeing the same row twice in the output, you should consider UNION ALL instead of UNION. regards, tom lane
В списке pgsql-sql по дате отправления: