Re: FTS with more than one language in body and with unknown query language?
| From | Artur Zakirov |
|---|---|
| Subject | Re: FTS with more than one language in body and with unknown query language? |
| Date | |
| Msg-id | aeba75ce-9c59-082e-edee-90f59f455f42@postgrespro.ru |
| In reply to | Re: FTS with more than one language in body and with unknown query language? (Stefan Keller <sfkeller@gmail.com>) |
| List | pgsql-general |
On 15.07.2016 21:34, Stefan Keller wrote:
> I actually expect that stemming takes place for english and german.
> And we will in fact have queries in english and in german as well.
> So I think we still have some issues to resolve...?

I have done the following:

- A patch for PostgreSQL: https://github.com/select-artur/postgres/tree/join_tsconfig
  It adds a new option (JOIN) for FTS dictionary mapping. I want to propose this patch to -hackers.
- The dict_translate dictionary, based on the dict_xsyn contrib module: https://github.com/select-artur/dict_translate

These things were made for multilingual purposes and are interesting for us. Maybe they will be helpful for you too.

Example:

1 - Create files:

$SHAREDIR/tsearch_data/geo_en.trn:

    forest wald forst holz

$SHAREDIR/tsearch_data/geo_de.trn:

    wald forest wood
    forst forest wood
    holz forest wood

2 - Execute queries:

    =# CREATE TEXT SEARCH DICTIONARY geo_en (
         Template = translate,
         DictFile = geo_en,
         InputDict = pg_catalog.english_stem);

    =# CREATE TEXT SEARCH DICTIONARY geo_de (
         Template = translate,
         DictFile = geo_de,
         InputDict = pg_catalog.german_stem);

    =# CREATE TEXT SEARCH CONFIGURATION geo(COPY='simple');

    =# ALTER TEXT SEARCH CONFIGURATION geo
         ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
                           word, hword, hword_part
         WITH geo_en (JOIN), english_stem (JOIN),
              geo_de (JOIN), german_stem (JOIN);

    =# CREATE TABLE geo (id int, body_en text, body_de text);

    =# INSERT INTO geo VALUES (1, 'forest', NULL), (2, NULL, 'wald');

3 - Sample queries:

    =# SELECT * FROM geo
       WHERE to_tsvector('geo', body_en) @@ to_tsquery('geo', 'forests');
     id | body_en | body_de
    ----+---------+---------
      1 | forest  | (null)
    (1 row)

    =# SELECT * FROM geo
       WHERE to_tsvector('geo', body_de) @@ to_tsquery('geo', 'forests');
     id | body_en | body_de
    ----+---------+---------
      2 | (null)  | wald
      3 | (null)  | forst
    (2 rows)

    =# SELECT * FROM geo
       WHERE to_tsvector('geo', body_en) @@ to_tsquery('geo', 'walde');
     id | body_en | body_de
    ----+---------+---------
      1 | forest  | (null)
    (1 row)

I will be glad to hear your comments.

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
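
A note on the example above: the matching behaviour it shows can be approximated with a small toy model. This is only a sketch under stated assumptions, not the patch's implementation. The suffix-stripping `toy_stem` is a hypothetical stand-in for the Snowball stemmers, and treating JOIN as "union the lexemes of every dictionary in the mapping" is an assumption inferred from the sample query results, not from the patch source.

```python
# Toy model of the geo configuration from the example.
# Assumptions (not taken from the patch itself):
#   - toy_stem() is a crude stand-in for english_stem / german_stem;
#   - JOIN is modelled as the union of all dictionaries' output.

# Contents of geo_en.trn and geo_de.trn: word -> translations.
GEO_EN = {"forest": ["wald", "forst", "holz"]}
GEO_DE = {
    "wald": ["forest", "wood"],
    "forst": ["forest", "wood"],
    "holz": ["forest", "wood"],
}


def toy_stem(word, suffixes):
    """Strip the first matching suffix (hypothetical, far simpler than Snowball)."""
    word = word.lower()
    for suffix in suffixes:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def english_stem(word):
    return toy_stem(word, ("s",))


def german_stem(word):
    return toy_stem(word, ("e",))


def translate(table, input_dict):
    """Model of a translate dictionary: stem with input_dict, then look up."""
    def lookup(word):
        return set(table.get(input_dict(word), ()))
    return lookup


def lexize(word):
    """Union the output of all four dictionaries (the assumed JOIN behaviour)."""
    geo_en = translate(GEO_EN, english_stem)
    geo_de = translate(GEO_DE, german_stem)
    lexemes = set()
    lexemes |= geo_en(word)
    lexemes.add(english_stem(word))
    lexemes |= geo_de(word)
    lexemes.add(german_stem(word))
    return lexemes


def matches(document_word, query_word):
    """A document matches when it shares at least one lexeme with the query."""
    return bool(lexize(document_word) & lexize(query_word))


if __name__ == "__main__":
    print(matches("forest", "forests"))  # English query, English body
    print(matches("wald", "forests"))    # English query, German body
    print(matches("forest", "walde"))    # German query, English body
```

In this model the three calls correspond to the three sample queries: each side is expanded into stems plus translations, so a query in one language can share a lexeme with a body in the other.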