Re: Mailing list search engine: surprising missing results?
From | Laurenz Albe |
---|---|
Subject | Re: Mailing list search engine: surprising missing results? |
Date | |
Msg-id | ab4184b7ab84623be10c4676e090cc27ae78b355.camel@cybertec.at |
In response to | Mailing list search engine: surprising missing results? (James Addison <jay@jp-hosting.net>) |
Responses |
Re: Mailing list search engine: surprising missing results?
|
List | pgsql-www |
On Sun, 2022-01-23 at 12:49 +0000, James Addison wrote:
> Hello,
>
> I noticed that the mailing list search engine[1] seems to unexpectedly
> miss results for some queries.
>
> For example:
>
> A search for "boyer"[2] returns five results, including result
> snippets that contain the text "Boyer-More-Horspool" [sic] and
> "Boyer-Moore-Horspool".
>
> However, a more specific search for "boyer-moore"[3] does not return
> any results -- that seems surprising.
>
> Specializing the query further and searching for
> "boyer-moore-horspool"[4] *does* again return results -- two documents
> -- with the terms "boyer" and "horspool" highlighted.

This is caused by the peculiarities of PostgreSQL full text search:

  SELECT to_tsvector('english', 'Boyer-Moore-Horspool') @@ websearch_to_tsquery('english', 'boyer-moore');

   ?column?
  ══════════
   f
  (1 row)

The reason is that the 'moore' in 'boyer-moore' is stemmed, since it is
at the end of the word, while the 'moore' in 'Boyer-Moore-Horspool' isn't:

  SELECT to_tsvector('english', 'Boyer-Moore-Horspool');

                         to_tsvector
  ══════════════════════════════════════════════════════════
   'boyer':2 'boyer-moore-horspool':1 'horspool':4 'moor':3
  (1 row)

  SELECT websearch_to_tsquery('english', 'boyer-moore');

          websearch_to_tsquery
  ═════════════════════════════════════
   'boyer-moor' <-> 'boyer' <-> 'moor'
  (1 row)

'boyer-moor' is not present in the first result.

As a workaround, I suggest that you search for 'boyer moore' or
(even better) '"boyer moore"' (with the double quotes):

  SELECT websearch_to_tsquery('english', 'boyer moore');

   websearch_to_tsquery
  ══════════════════════
   'boyer' & 'moor'
  (1 row)

  SELECT websearch_to_tsquery('english', '"boyer moore"');

   websearch_to_tsquery
  ══════════════════════
   'boyer' <-> 'moor'
  (1 row)

Yours,
Laurenz Albe
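
To reproduce the behavior described above on a local PostgreSQL instance, the following sketch may help. It assumes the stock 'english' text search configuration and reuses the sample string from this message; the column alias `matches` and the inline `VALUES` list are only illustrative. `ts_debug` shows how the parser splits a hyphenated word into a compound token plus its parts, and the last query compares the original search with the two suggested workarounds.

```sql
-- Show how the 'english' configuration tokenizes the search term:
-- the hyphenated word is reported as one compound token and as its parts.
SELECT alias, token, lexemes
FROM ts_debug('english', 'boyer-moore');

-- The same for the text as it appears in the indexed document.
SELECT alias, token, lexemes
FROM ts_debug('english', 'Boyer-Moore-Horspool');

-- Compare the original query with the two workarounds
-- against the same sample document.
SELECT q AS query,
       to_tsvector('english', 'Boyer-Moore-Horspool')
         @@ websearch_to_tsquery('english', q) AS matches
FROM (VALUES ('boyer-moore'),
             ('boyer moore'),
             ('"boyer moore"')) AS t(q);
```

Going by the lexemes and positions shown in the message, the first query should not match, while the two workarounds should.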