Re: Mailing list search engine: surprising missing results?
From | Laurenz Albe |
---|---|
Subject | Re: Mailing list search engine: surprising missing results? |
Date | |
Msg-id | 22d5245c9c5a9aa05a0510bdd52458812140a870.camel@cybertec.at |
In response to | Re: Mailing list search engine: surprising missing results? (Oleg Bartunov <obartunov@postgrespro.ru>) |
Responses | Re: Mailing list search engine: surprising missing results? |
List | pgsql-www |
On Tue, 2022-01-25 at 14:04 +0300, Oleg Bartunov wrote:
> On Mon, Jan 24, 2022 at 11:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> > > On Mon, Jan 24, 2022 at 08:27:41AM +0100, Laurenz Albe wrote:
> > > > The reason is that the 'moore' in 'boyer-moore' is stemmed, since it
> > > > is at the end of the word, while the 'moore' in 'Boyer-Moore-Horspool'
> > > > isn't:
> >
> > > Wow, he showed me this problem earlier but I never suspected it was
> > > a stemming issue because I never considered proper nouns could be
> > > stem-adjusted, but it is obvious they can.
> >
> > I wonder if we should change that so that components of a compound
> > word are consistently stemmed the same way.
>
> Something like this
>
> SELECT to_tsvector('english', 'Boyer-Moore-Horspool');
>                        to_tsvector
> ----------------------------------------------------------
>  'boyer':2 'boyer-moore-horspool':1 'boyer-moore':1 'moore-horspool':1 'horspool':4 'moor':3
> (1 row)

Not quite. The problem in question is the "'boyer-moore':1".
If that were "'boyer-moor':1" instead, the problem would disappear.

Yours,
Laurenz Albe
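
The mismatch discussed above can be inspected with a few queries along the following lines. This is only a sketch: it assumes the stock 'english' text search configuration, and it uses plainto_tsquery and ts_debug, which are not part of the thread's own example. Output is deliberately not shown, since it can differ between PostgreSQL versions.

```sql
-- Sketch only: assumes the stock 'english' text search configuration.

-- How the search term is parsed and stemmed
SELECT plainto_tsquery('english', 'boyer-moore');

-- How the longer compound found in the document is parsed and stemmed
SELECT to_tsvector('english', 'Boyer-Moore-Horspool');

-- Does the search term match the document?
SELECT to_tsvector('english', 'Boyer-Moore-Horspool')
       @@ plainto_tsquery('english', 'boyer-moore') AS matches;

-- Token-by-token view: token type, dictionary consulted, resulting lexemes
SELECT alias, token, dictionary, lexemes
FROM ts_debug('english', 'Boyer-Moore-Horspool');
```

Comparing the lexemes in the first two results shows whether the compound lexeme produced for the search term also appears in the document's tsvector, which is the point at issue in this thread.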