Re: Full text search prefix matching
From | Tom Lane |
---|---|
Subject | Re: Full text search prefix matching |
Date | |
Msg-id | 1928.1418743816@sss.pgh.pa.us |
In reply to | Full text search prefix matching (Heikki Rauhala <heikki.rauhala@reaktor.fi>) |
List | pgsql-general |
Heikki Rauhala <heikki.rauhala@reaktor.fi> writes:
> Should text search prefixes work predictably as documented in [1] even if the lexemes are shorter than the query? How can I get it to work?

I believe what you're seeing can be explained by these observations:

    regression=# select to_tsvector('finnish', 'sofia');
     to_tsvector
    -------------
     'sof':1
    (1 row)

    regression=# select to_tsquery('finnish','sofia:*');
     to_tsquery
    ------------
     'sof':*
    (1 row)

    regression=# select to_tsquery('finnish','sofi:*');
     to_tsquery
    ------------
     'sofi':*
    (1 row)

    regression=# select to_tsquery('finnish','sof:*');
     to_tsquery
    ------------
     'sof':*
    (1 row)

What this shows is that the finnish configuration includes a word-stemming rule that strips off "ia". It won't strip off just "i" though, so "sofi" doesn't get reduced to the same root and therefore doesn't match "sofia". The "*" addition does nothing for you here, since it only allows matching in the other direction (query shorter than target): the query lexeme 'sofi' is longer than the stored lexeme 'sof', so it cannot be a prefix of it.

I know nothing of Finnish, so I can't say just how correct these particular stemming rules are for that language; perhaps they need adjustment. But it seems to me that if you want blind non-language-aware prefix matching, you probably don't want the full-text-search machinery at all. Full text search is meant to deal with words, both in the documents and the queries. You might take a look at pg_trgm as an alternative.

regards, tom lane
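The matching rule described above can be illustrated with a small Python model. This is only a sketch under stated assumptions: the stems are copied from the psql output in the message, the lookup table and the function names (`stem`, `prefix_match`, `raw_prefix_match`) are hypothetical, and nothing here calls PostgreSQL or reproduces its actual stemmer. It shows why a `:*` query succeeds only when the stemmed query is a prefix of the stemmed document lexeme, and how a plain string prefix test behaves differently.

```python
# Stems copied from the psql output in the message; a real system would
# obtain them from to_tsvector / to_tsquery, not from a lookup table.
FINNISH_STEMS = {"sofia": "sof", "sofi": "sofi", "sof": "sof"}


def stem(word: str) -> str:
    """Hypothetical stand-in for the 'finnish' stemmer (table lookup only)."""
    return FINNISH_STEMS.get(word, word)


def prefix_match(document_word: str, query_word: str) -> bool:
    """Model of 'query:*': stemmed query must be a prefix of the stemmed lexeme."""
    return stem(document_word).startswith(stem(query_word))


def raw_prefix_match(document_word: str, query_word: str) -> bool:
    """Non-language-aware prefix test on the unstemmed text."""
    return document_word.startswith(query_word)


if __name__ == "__main__":
    for query in ("sofia", "sofi", "sof"):
        print(query, prefix_match("sofia", query), raw_prefix_match("sofia", query))
```

In this model only the query "sofi" fails the stemmed comparison, because its lexeme 'sofi' is longer than the stored lexeme 'sof', while the unstemmed prefix test accepts all three queries. That is the kind of blind prefix behaviour for which the message suggests looking outside full text search.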