Re: FTS with more than one language in body and with unknown query language?
| From | Stefan Keller |
|---|---|
| Subject | Re: FTS with more than one language in body and with unknown query language? |
| Date | |
| Msg-id | CAFcOn29yEES4Y=E1c0Nj__8o1Kb_RB4Ey2NZtTYuy7w2DRcjag@mail.gmail.com |
| In reply to | Re: FTS with more than one language in body and with unknown query language? (Artur Zakirov <a.zakirov@postgrespro.ru>) |
| Responses | Re: FTS with more than one language in body and with unknown query language? |
| List | pgsql-general |
приве́т! Artur

Thanks for your explanations.

2016-07-14 17:20 GMT+02:00 Artur Zakirov <a.zakirov@postgrespro.ru>:
> On 14.07.2016 01:16, Stefan Keller wrote:
...
>> * Should I create a synonym dictionary which contains word
>> translations en-de instead of synonyms en-en?
>
> This synonym dictionary will contain thousands of entries. So it will require
> a great effort to make this dictionary.

It's a domain-specific corpus of max. 1000 records of descriptive text
(metadata) about geographic data, like topographic maps, land use
planning, etc.

...
>> * How to set up a text search configuration which e.g. stems en and de
>> words?

I would still like to give FTS a try with a synonym dictionary (en-de).
Now I'm wondering how to set up the configuration. I've seen examples
that process either English, German or Russian alone. But I have not yet
found any documentation on how to set up the text search configuration
where a corpus contains two (or more) languages at the same time in a
table (body_en and body_de).

:Stefan

2016-07-14 17:20 GMT+02:00 Artur Zakirov <a.zakirov@postgrespro.ru>:
> Hi,
>
> On 14.07.2016 01:16, Stefan Keller wrote:
>>
>> Hi,
>>
>> I have a text corpus which contains either German or English docs and
>> I expect queries where I don't know if it's German or English. So I'd
>> like e.g. that a query "forest" matches "forest" in body_en but also
>> "Wald" in body_de.
>>
>> I created a table with attributes body_en and body_de (type "text"). I
>> will use ts_vector/ts_query on the fly (don't need yet an index
>> (attributes)).
>>
>> * Can FTS handle this multilingual situation?
>
>
> In my opinion, PostgreSQL can't handle it. It can't translate words from one
> language to another, it just stems a word from its original form to its basic form.
> First you need to translate the word from English to German, then search for the word in
> the body_de attribute.
>
> And the issue is complicated by the fact that one word could have a different
> meaning in the other language.
>
>> * How to set up a text search configuration which e.g. stems en and de
>> words?
>> * Should I create a synonym dictionary which contains word
>> translations en-de instead of synonyms en-en?
>
>
> This synonym dictionary will contain thousands of entries. So it will require
> a great effort to make this dictionary.
>
>
>> * Any hints to related work where FTS has been used in a multilingual
>> context?
>>
>> :Stefan
>>
>>
>
> --
> Artur Zakirov
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company
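
For illustration, the en-de synonym dictionary idea raised above could be sketched roughly as follows. This is only a sketch under assumptions, not a tested setup: the file `en_de_syn.syn` (placed in `$SHAREDIR/tsearch_data/`, one `german_word english_word` pair per line, e.g. `wald forest`), the dictionary name `en_de_syn`, the configuration name `en_de` and the table name `docs` are all hypothetical. Only `body_en` and `body_de` come from the thread.

```sql
-- Hypothetical synonym dictionary that maps German words to English ones.
-- Assumes $SHAREDIR/tsearch_data/en_de_syn.syn exists with lines such as:
--   wald forest
CREATE TEXT SEARCH DICTIONARY en_de_syn (
    TEMPLATE = synonym,
    SYNONYMS = en_de_syn
);

-- Start from the built-in English configuration (name "en_de" is an assumption).
CREATE TEXT SEARCH CONFIGURATION en_de (COPY = pg_catalog.english);

-- Consult the synonym dictionary first; words it does not know
-- fall through to the English Snowball stemmer.
ALTER TEXT SEARCH CONFIGURATION en_de
    ALTER MAPPING FOR asciiword, word
    WITH en_de_syn, english_stem;

-- On-the-fly search over both columns; "docs" is a hypothetical table name.
SELECT *
FROM docs
WHERE to_tsvector('en_de', coalesce(body_en, '') || ' ' || coalesce(body_de, ''))
      @@ plainto_tsquery('en_de', 'forest');
```

One limitation of this sketch: a Snowball stemmer accepts every word it is given, so chaining `english_stem` and `german_stem` in the same mapping would never reach the second one. German words are therefore only matched in the exact forms listed in the synonym file, and inflected forms would each need their own entry. For a corpus of about 1000 short records that may still be manageable.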