Re: phrase search
From | Teodor Sigaev |
---|---|
Subject | Re: phrase search |
Date | |
Msg-id | 48441414.6060802@sigaev.ru |
In reply to | phrase search (Sushant Sinha <sushant354@gmail.com>) |
Responses | Re: phrase search |
List | pgsql-hackers |
> I have attached a patch for phrase search with respect to the cvs head.
> Basically it takes a phrase (text) and a TSVector. It checks if the
> relative positions of lexeme in the phrase are same as in their
> positions in TSVector.

Ideally, phrase search should be implemented as a new operator in tsquery, say # with an optional distance. So the tsquery 'foo #2 bar' means: find all texts where 'bar' is placed no farther than two words from 'foo'. The difficulty lies in complex boolean expressions ( 'foo #1 ( bar1 & bar2 )' ) and in languages such as Norwegian or German. German has compound words, like footballbar, which can be split in several ways, so the result of to_tsquery('foo # footballbar') will be 'foo # ( ( football & bar ) | ( foot & ball & bar ) )', where the variants are connected with the OR operation.

Of course, phrase search should be able to use indexes.

> If the configuration for text search is "simple", then this will produce
> exact phrase search. Otherwise the stopwords in a phrase will be ignored
> and the words in a phrase will only be matched with the stemmed lexeme.

Your solution can't be used as is, because the user must also use a tsquery in order to use an index:

    column @@ to_tsquery('phrase search') AND is_phrase_present('phrase search', column)

The first clause will be used for the index scan and will quickly find the candidates.

> For my application I am using this as a separate shared object. I do not
> know how to expose this function from the core. Can someone explain how
> to do this?

Look at the contrib/ directory in pgsql's source code and make a contrib module from your patch. As an example, look at the adminpack module; it's rather simple.

Comments on your code:

1)
    +#ifdef PG_MODULE_MAGIC
    +PG_MODULE_MAGIC;
    +#endif

That isn't needed for files compiled into core; it's only needed for modules.

2) Use only /* */ comments; do not use // (C++ style) comments.

--
Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
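
To make the proposed 'foo #2 bar' semantics concrete, here is a minimal sketch of the distance check in plain C. It is not PostgreSQL code: the function name within_distance is hypothetical, the position lists are assumed to be sorted in ascending order, and it assumes one possible reading of the operator, namely that 'bar' must come after 'foo' by at most the given number of words.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of PostgreSQL.
 *
 * a[] and b[] hold the word positions of two lexemes in one document,
 * both assumed sorted in ascending order. Returns true if some position
 * in b[] follows some position in a[] by at least 1 and at most
 * max_distance words.
 */
bool
within_distance(const int *a, size_t na,
                const int *b, size_t nb,
                int max_distance)
{
    size_t i = 0;
    size_t j = 0;

    while (i < na && j < nb)
    {
        if (b[j] <= a[i])
            j++;                /* b[j] is not after a[i]; try the next b */
        else if (b[j] - a[i] <= max_distance)
            return true;        /* close enough: the pair matches */
        else
            i++;                /* b[j] is too far after a[i]; try a later a */
    }

    return false;
}
```

With 'foo' at positions {1, 10} and 'bar' at positions {4, 12}, a distance of 2 matches (10 and 12) and a distance of 1 does not. Because each step advances one of the two indexes, the check is linear in the number of positions. The boolean and compound-word cases mentioned above would need more than this pairwise check.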