Re: efficiency of wildcards at both ends

Поиск
Список
Период
Сортировка
От Edson Richter
Тема Re: efficiency of wildcards at both ends
Дата
Msg-id BLU0-SMTP226684D569C127F2CFC0B99CFFE0@phx.gbl
обсуждение исходный текст
Ответ на Re: efficiency of wildcards at both ends  (Edson Richter <edsonrichter@hotmail.com>)
Ответы Re: efficiency of wildcards at both ends  (John R Pierce <pierce@hogranch.com>)
Список pgsql-general
Just ocurred to me that would be possible to create some sort of "hybrid" solution...

create index idx1 on tb1 (nome);
create index idx2 on tb1 (reverse(nome));

select * from tb1
  where nome like 'CARLOS%' or reverse(nome) like reverse('%CARLOS')

Should return same results as
select * from tb1
  where nome like '%CARLOS%'

I supposed that this hybrid solution will be optmized by the indexes (but at what cost?).

Can some PostgreSQL expert tell if this assumption is right? Would be possible to have PostgreSQL doing that automatically? Something like

create index idx1 on tb1 (nome) with options (optimize wildcards);
select * from tb1 where nome like '%CARLOS%';

and then this get expanded as the example above? Then, what happens with the following query:

select * from tb1 where nome like '%CARLOS%ERICKSSON%';

?

Edson

Em 20/06/2012 14:28, Edson Richter escreveu:
AFAIK, wildcards at both ends are not optimized at all, unless you use some sort of specialized index (may be Gist or FullText).
Until 9.1 there is no such "Index Scan" feature, that would help (very little).
Other databases (like MS SQL Server) solve this kind of query by executing an Index Scan, then merge join with rest of the query.

This is all I know about LIKE optimization in PostgreSQL:

LIKE 'str%' -> optimized by normal indexes
LIKE '%str%' -> not optimized. You can use FullText, but then your wildcards will have to change to something not SQL-standard compatible solution...
LIKE '%str' -> can be optimized if you create index with column content reversed then query reversed as well. See code below for details.

How did I optimized "%str" queries (code implemented with help from the PgSql community):

CREATE OR REPLACE FUNCTION reverse(input character varying)
  RETURNS character varying AS
$BODY$
DECLARE
  result character varying = '';
  i int;
BEGIN
  FOR i IN 1..length(input) BY 2 LOOP
    result = substr(input,i+1,1) || substr(input,i,1) || result;
  END LOOP;
  RETURN result;
END$BODY$
  LANGUAGE plpgsql IMMUTABLE STRICT
  COST 100;
create index idx on tb1 (reverse(nome));
select * from tb1 where reverse(nome) like reverse('%RICHTER');


Regards,

Edson.

Em 20/06/2012 14:10, Sam Z J escreveu:
Hi all

I'm curious how is wildcards at both ends implemented, e.g. LIKE '%str%'
How efficient is it if that's the only search criteria against a large table? how much does indexing the column help and roughly how much more space is needed for the index?

if the answers are too long, please point me to the relavant text =D 

thanks

--
Zhongshi (Sam) Jiang
sammyjiang721@gmail.com




В списке pgsql-general по дате отправления:

Предыдущее
От: Edson Richter
Дата:
Сообщение: Re: efficiency of wildcards at both ends
Следующее
От: Alan Hodgson
Дата:
Сообщение: Re: efficiency of wildcards at both ends