Re: Fastest Index/Algorithm to find similar sentences
From | Beena Emerson |
---|---|
Subject | Re: Fastest Index/Algorithm to find similar sentences |
Date | |
Msg-id | CAOG9ApEaGjHaFtm2XrVGYc6WbYFva3JzLxa6ANSFFyW_-mFkQA@mail.gmail.com |
In response to | Re: Fastest Index/Algorithm to find similar sentences (Beena Emerson <memissemerson@gmail.com>) |
List | pgsql-general |
I am sorry, I just re-read your mail and realized you have already tried with pg_trgm.
On Wed, Jul 31, 2013 at 7:23 PM, Beena Emerson <memissemerson@gmail.com> wrote:
On Sat, Jul 27, 2013 at 10:34 PM, Janek Sendrowski <janek12@web.de> wrote:
Hi Sergey Konoplev,
Suppose I'm searching for a sentence like "The tiger is the largest cat species".
I can only find the sentences that include all of the words "tiger, largest, cat, species", but I would also like to get the sentences that contain only three or even two of these words.
Janek
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Hi,

You may use similarity functions of pg_trgm.

Example:

=# \d+ test
                        Table "public.test"
 Column | Type | Modifiers | Storage  | Stats target | Description
--------+------+-----------+----------+--------------+-------------
 col    | text |           | extended |              |
Indexes:
    "test_idx" gin (col gin_trgm_ops)
Has OIDs: no

=# SELECT * FROM test;
                   col
-----------------------------------------
 The tiger is the largest cat species
 The cheetah is the fastest cat species
 The peacock is the largest bird species
(3 rows)

=# SELECT show_limit();
 show_limit
------------
        0.3
(1 row)

=# SELECT col, similarity(col, 'The tiger is the largest cat species') AS sml
   FROM test WHERE col % 'The tiger is the largest cat species'
   ORDER BY sml DESC, col;
                   col                   |   sml
-----------------------------------------+----------
 The tiger is the largest cat species    |        1
 The peacock is the largest bird species | 0.511111
 The cheetah is the fastest cat species  | 0.466667
(3 rows)

=# SELECT set_limit(0.5);
 set_limit
-----------
       0.5
(1 row)

=# SELECT col, similarity(col, 'The tiger is the largest cat species') AS sml
   FROM test WHERE col % 'The tiger is the largest cat species'
   ORDER BY sml DESC, col;
                   col                   |   sml
-----------------------------------------+----------
 The tiger is the largest cat species    |        1
 The peacock is the largest bird species | 0.511111
(2 rows)

=# SELECT set_limit(0.9);
 set_limit
-----------
       0.9
(1 row)

=# SELECT col, similarity(col, 'The tiger is the largest cat species') AS sml
   FROM test WHERE col % 'The tiger is the largest cat species'
   ORDER BY sml DESC, col;
                 col                  | sml
--------------------------------------+-----
 The tiger is the largest cat species |   1
(1 row)

When you set a higher limit, you get more exact matches.

--
Beena Emerson
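
For anyone wondering where scores such as 0.511111 and 0.466667 come from, here is a rough Python sketch of the trigram comparison. It is not pg_trgm's actual code, only an approximation for plain ASCII text. It assumes that text is lowercased, that words are runs of letters and digits, that each word is padded with two leading spaces and one trailing space, and that the score is the number of shared trigrams divided by the number of distinct trigrams in both strings. The names trigrams, similarity and matches are made up for this illustration.

```python
import re


def trigrams(text):
    """Return the set of distinct trigrams of `text` (ASCII-only approximation)."""
    grams = set()
    # Assumption: a word is a run of lowercase letters/digits after lowercasing.
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        # Assumption: pad each word with two leading spaces and one trailing space.
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams of both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def matches(rows, query, limit):
    """Mimic the filter-and-sort of the query above for a given limit.

    Assumption: a row scoring exactly `limit` is kept; check the pg_trgm
    documentation for the exact boundary behaviour of the % operator.
    """
    scored = [(row, similarity(row, query)) for row in rows]
    kept = [(row, sml) for row, sml in scored if sml >= limit]
    # Highest score first, ties broken by the text itself.
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


ROWS = [
    "The tiger is the largest cat species",
    "The cheetah is the fastest cat species",
    "The peacock is the largest bird species",
]
QUERY = "The tiger is the largest cat species"

for row, sml in matches(ROWS, QUERY, 0.3):
    print(f"{row:<40} {sml:.6f}")
```

The query has 32 distinct trigrams. The peacock sentence shares 23 of them out of 45 distinct trigrams overall, and the cheetah sentence shares 21 out of 45, which lines up with the sml column in the session above. Raising the limit passed to matches drops the lower-scoring rows in the same way set_limit does.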