On 2011-08-16, adam_pgsql <adam_pgsql@witneyweb.org> wrote:
>
> Hi,
>
> I have a query hitting a table of 25 million rows. The table has a
>text field ('identifier') which i need to query for matching rows. The
>question is if i have multiple strings to match against this field I
>can use multiple OR sub-statements or multiple statements in a UNION.
>The UNION seems to run quicker.... is this to be expected? or is there
>anything else I can do improve the speed of this query? Some query
>details:
> WHERE
> ( lower(identifier) LIKE lower('BUGS0000001884677') OR
> lower(identifier) LIKE lower('BUGS0000001884678') OR
> lower(identifier) LIKE lower('BUGS0000001884679') OR
> lower(identifier) LIKE lower('SpTIGR4-2210 (6F24)') )
> ORDER BY a.identifier;
don't use like use regex.
...
WHERE identifier ~* E'^(BUGS0000001884677|BUGS0000001884678|BUGS0000001884679|SpTIGR4-2210
\\(6F24\\))$'
or
where lower(identifier) ~* lower(E'^(BUGS0000001884677|BUGS0000001884678|BUGS0000001884679|SpTIGR4-2210
\\(6F24\\))$')
on the other hand you aren't doing any pattern stuff. - you you could
just use 'in':
WHERE lower(identifier) in (lower('BUGS0000001884677'), lower('BUGS0000001884678'), lower('BUGS0000001884679'),
lower('SpTIGR4-2210(6F24)') )
or if you need like, use like any:
WHERE lower(identifier) like ANY (lower('BUGS0000001884677'), lower('BUGS0000001884678'), lower('BUGS0000001884679'),
lower('SpTIGR4-2210(6F24)') )
> Also which should scale better if I add more strings to match? would there be any better design patterns for this
problem?
use one of the above: preferably in, else regex, or failing that like any.
"= any" will also work but I don't thing it will ever be better than "in"
--
⚂⚃ 100% natural