Re: BUG #13766: weird ts_headline/ts_vector/ts_query behaviour
From | Artur Zakirov |
---|---|
Subject | Re: BUG #13766: weird ts_headline/ts_vector/ts_query behaviour |
Date | |
Msg-id | 564C35CB.9020800@postgrespro.ru |
In reply to | BUG #13766: weird ts_headline/ts_vector/ts_query behaviour (aslesha.akella@gmail.com) |
List | pgsql-bugs |
On 10.11.2015 16:53, aslesha.akella@gmail.com wrote:
>
> We are trying to make text search for a word "goede" and "goed". It got the
> following results with different languages.
>

Hi,

Do you use the predefined text search configurations "english" and "dutch"? If so, and you have not changed them, then they use the English and Dutch stemming algorithms (https://en.wikipedia.org/wiki/Stemming) and check for stop words.

The following examples show how words are converted to lexemes:

> select to_tsvector('dutch', 'Goede vrijdag');
     to_tsvector
----------------------
 'goed':1 'vrijdag':2
(1 row)

> select to_tsvector('dutch', 'Goed vrijdag');
     to_tsvector
----------------------
 'goed':1 'vrijdag':2
(1 row)

> select to_tsvector('english', 'Goed vrijdag');
    to_tsvector
--------------------
 'go':1 'vrijdag':2
(1 row)

> select to_tsvector('english', 'Goede vrijdag');
     to_tsvector
----------------------
 'goed':1 'vrijdag':2
(1 row)

The "simple" configuration does not use stemming algorithms. It only converts input words to lower-case lexemes and can exclude stop words when a stop-word file is configured for its dictionary.

You can also create an ispell dictionary and use it.

More information is available in the documentation:
http://www.postgresql.org/docs/devel/static/textsearch-dictionaries.html

and in these good articles:
http://shisaa.jp/postset/postgresql-full-text-search-part-1.html
http://shisaa.jp/postset/postgresql-full-text-search-part-2.html
http://shisaa.jp/postset/postgresql-full-text-search-part-3.html

But I am not sure that I understood your question correctly.

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
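The lexemes listed in the reply suggest why a search for "goed" can match under one configuration and miss under another. The queries below are a minimal sketch, assuming the unmodified built-in "dutch", "english" and "simple" configurations; the column aliases are illustrative only, and the results noted in the comments are what the lexemes above imply rather than output copied from a server.

```sql
-- "simple" does no stemming, so 'Goede' is only lower-cased to 'goede'.
SELECT to_tsvector('simple', 'Goede vrijdag');

-- Dutch: document and query both reduce to 'goed', so a match is expected.
SELECT to_tsvector('dutch', 'Goede vrijdag')
       @@ to_tsquery('dutch', 'goed') AS dutch_match;

-- English: the document yields 'goed' but the query term 'goed' stems to 'go',
-- so no match is expected.
SELECT to_tsvector('english', 'Goede vrijdag')
       @@ to_tsquery('english', 'goed') AS english_match;

-- Simple: 'goede' in the document versus 'goed' in the query, so no match
-- is expected.
SELECT to_tsvector('simple', 'Goede vrijdag')
       @@ to_tsquery('simple', 'goed') AS simple_match;
```

Using the same configuration for to_tsvector and to_tsquery keeps both sides normalized by the same rules; whether they then match still depends on how that configuration stems each word.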