Re: BUG #15689: Stemming of negation/not operator
From | Tom Lane |
---|---|
Subject | Re: BUG #15689: Stemming of negation/not operator |
Date | |
Msg-id | 16223.1552430042@sss.pgh.pa.us |
In reply to | BUG #15689: Stemming of negation/not operator (PG Bug reporting form <noreply@postgresql.org>) |
Responses | Re: BUG #15689: Stemming of negation/not operator |
List | pgsql-bugs |
PG Bug reporting form <noreply@postgresql.org> writes:
> When using to_tsquery function it is stemming negation/not parts of the
> query, where it probably shouldn't.
> Some examples:
> SELECT to_tsquery('english', 'car & !cars');
>    to_tsquery
> ----------------
>  'car' & !'car'

I'm not exactly convinced by this argument, because it seems like you're only thinking about a corner case. There are probably at least as many examples where you *do* want stemming on a negated term.

Another issue is that even if we changed the tsquery input function to not stem particular words, I doubt that it would do anything useful, because what it will be comparing to is tsvector entries that have certainly been stemmed. That is, even if the original document said "cars", what's going to be in the tsvector is just "car", so that forbidding a match to "cars" isn't going to do anything.

(Maybe what this says is that there should be a less-lossy recheck against the original document after the tsvector match, but that'd have to be done by an additional, explicit operator I think. Or possibly the recheck just requires tsquery match with a different stemming configuration.)

A related problem that's bothered me for some time is that lexemes get stemmed even if there is a "*" (prefix match) marker on them, causing them to possibly match much more than the user expected. But again, it's not real obvious how to make that better given the match-to-tsvector context --- not stemming could easily remove desired matches to stemmed tsvector entries.

If we could think of a way for it to do something useful, my inclination would be to allow an explicit "don't stem" marker on lexemes, rather than trying to drive it off whether the context is a negation or not.

regards, tom lane
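
The points above can be tried out with a few queries. This is only an illustrative sketch and not part of the original message: it uses the built-in `english` and `simple` text search configurations, with `simple` standing in for a hypothetical "don't stem" behaviour on the query side.

```sql
-- The document side is stemmed: "cars" is stored as the lexeme 'car'.
SELECT to_tsvector('english', 'cars');
--  'car':1

-- The negated query term is stemmed the same way.
SELECT to_tsquery('english', '!cars');
--  !'car'

-- Assumption: the 'simple' configuration imitates an unstemmed query term.
-- The query becomes !'cars', but the tsvector only contains 'car', so the
-- negation can never exclude anything and the match succeeds.
SELECT to_tsvector('english', 'cars') @@ to_tsquery('simple', '!cars');
--  t

-- Prefix-match terms are stemmed as well, which can widen the match.
SELECT to_tsquery('english', 'cars:*');
--  'car':*
```

The third query shows why skipping the stemmer only on the query side would not help: a document that literally says "cars" still satisfies the query that forbids "cars".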