Re: Full text search on partial URLs

Поиск
Список
Период
Сортировка
От Zev Benjamin
Тема Re: Full text search on partial URLs
Дата
Msg-id 527A900F.6020103@strangersgate.com
обсуждение исходный текст
Ответ на Re: Full text search on partial URLs  (bricklen <bricklen@gmail.com>)
Ответы Re: Full text search on partial URLs  (bricklen <bricklen@gmail.com>)
Список pgsql-general
On 11/06/2013 01:47 PM, bricklen wrote:
>
> On Wed, Nov 6, 2013 at 10:45 AM, Zev Benjamin
> <zev-pgsql@strangersgate.com <mailto:zev-pgsql@strangersgate.com>> wrote:
>
>     Hi,
>
>     I have Postgres full text search set up for my application and it's
>     been working great!  However, my users would like their searches to
>     turn up parts of URLs.  For example, they would like a search for
>     "foobar" to turn up a document that contains the string
>     "http://example.com/foobar/__blah <http://example.com/foobar/blah>"
>     (and similarly for queries like "example" and "blah).  With the
>     default dictionaries for host, url, and url_path, the search query
>     would have to contain the complete host or url path.
>
>     What is the best way to accomplish this?  Should I be looking at
>     building a custom dictionary that breaks down hosts and urls or is
>     there something simpler I can do?
>
>
> Have you looked into trigrams?
> http://www.postgresql.org/docs/current/static/pgtrgm.html

I've looked at it in the context of adding fuzzy search.  But my
understanding is that doing a fuzzy search here would only work if the
query were a significant fraction of, say, the url path.  For example, I
would expect a fuzzy search of "foobar" on "/foobar/x" to return a high
similarity, but a fuzzy search of "foobar" on
"/foobar/some/very/long/path/x" to have a low similarity.

Or are you suggesting using trigrams in a different way?


Zev


В списке pgsql-general по дате отправления:

Предыдущее
От: bricklen
Дата:
Сообщение: Re: Full text search on partial URLs
Следующее
От: bricklen
Дата:
Сообщение: Re: Full text search on partial URLs