Re: Mailing list search engine: surprising missing results?
From | James Addison |
---|---|
Subject | Re: Mailing list search engine: surprising missing results? |
Date | |
Msg-id | CALDQ5NwjHE6jjmxVPSq00FbTiVVKcb9+fX7nMnrRXtHNZGt+2g@mail.gmail.com |
In reply to | Re: Mailing list search engine: surprising missing results? (Ivan Panchenko <i.panchenko@postgrespro.ru>) |
List | pgsql-www |
On Tue, 25 Jan 2022 at 21:23, Ivan Panchenko <i.panchenko@postgrespro.ru> wrote:
>
> On 25.01.2022 23:48, James Addison wrote:
> > I'm uncertain why parsing hyphenated query text produces compound tokens?
>
> Because in some cases user wants to search the full hyphenated words,
> not parts of them.

That makes sense, although to refer back to a previous suggestion of
yours, we could allow matching on the full hyphenated words by
emitting an 'OR' condition from the parsed query, instead of 'AND'
(perhaps using an argument?).

In other words:

  # expected query to achieve a match (from your previous post in this thread)
  'boyers-moore' | ('boyers' & 'moore')

  # actual query that does not result in a match today (plainto_tsquery for 'boyer-moore')
  'boyer-moore' & 'boyer' & 'moore'

> >> It seems to me that in both cases we'd be better off generating
> >> "'boyers' <-> 'moore'", without the compound token at all.
> >> Maybe there's a case for the weaker 'boyers' & 'moore' translation,
> >> but I think if people wanted that they'd just enter separate words.
>
> Matching the compond token might be significant for ranking. (?)

Yes that does seem likely. The knowledge that there is an exact-match
token in the results could be important for various use cases
(including relevance scoring).

> Probably, there is no universal *to_tsquery function and no universal
> parser to fit all users.

That seems possible too, yep.
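To make the two query shapes discussed in this message concrete, here is a minimal sketch in Python. It only assembles tsquery-style strings; it is not PostgreSQL's parser and does no stemming, stop-word removal, or quote escaping. The helper name `hyphenated_tsquery` and its `mode` argument are hypothetical, standing in for the "perhaps using an argument?" idea.

```python
def hyphenated_tsquery(term: str, mode: str = "and") -> str:
    """Build a tsquery-style string for a hyphenated term.

    Illustrative only: plain string assembly, assuming the term contains
    no single quotes and that no stemming is applied.
    """
    # Split on hyphens and drop empty pieces (e.g. from a trailing hyphen).
    parts = [p for p in term.split("-") if p]
    if len(parts) < 2:
        # Not really hyphenated: emit the term as a single lexeme.
        return "'" + term + "'"

    compound = "'" + "-".join(parts) + "'"
    pieces = " & ".join("'" + p + "'" for p in parts)

    if mode == "and":
        # Shape reported for plainto_tsquery today: compound AND every part.
        return compound + " & " + pieces
    if mode == "or":
        # Suggested alternative: compound OR (all parts together).
        return compound + " | (" + pieces + ")"
    raise ValueError("mode must be 'and' or 'or'")


if __name__ == "__main__":
    print(hyphenated_tsquery("boyer-moore", mode="and"))
    print(hyphenated_tsquery("boyer-moore", mode="or"))
```

With the 'or' shape, a document that contains only the separate parts can still match, and a document that contains the compound token can still be recognised as an exact match for ranking.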