Re: TSearch2 / German compound words / UTF-8
From | Alexander Presber |
---|---|
Subject | Re: TSearch2 / German compound words / UTF-8 |
Date | |
Msg-id | 6AC64576-AEB6-47C0-AA8C-0242F9296BEA@weisshuhn.de |
In reply to | TSearch2 / German compound words / UTF-8 (Hannes Dorbath <light@theendofthetunnel.de>) |
Responses | Re: TSearch2 / German compound words / UTF-8; Re: TSearch2 / German compound words / UTF-8 |
List | pgsql-general |
>> Tsearch/ispell is not able to break this word into parts, because
>> of the "s" in "Produktion/s/intervall". Misspelling the word as
>> "Produktionintervall" fixes it:

> It should be affixes marked as 'affix in middle of compound word',
> Flag is '~', example look in norsk dictionary:
>
> flag ~\\:
>     [^S] > S    #~ advarsel > advarsels-
>
> BTW, we develop and debug compound word support on norsk
> (norwegian) dictionary, so look for example there. But we don't
> know Norwegian, norwegians helped us :)

Hello everyone!

I cannot get this to work, neither with a German dictionary nor with the Norwegian example supplied on the tsearch website. That is, just like Hannes, I only get compound word support when no 's' is inserted, in German and Norwegian alike: "Vertragstrafe" works, but not "Vertragsstrafe", which is the correct form.

So I tried it the other way around. My dictionary consists of two words:

---
vertrag/zs
strafe/z
---

My affixes file just switches on compounds and allows for s-insertion as described in the Norwegian tutorial:

---
compoundwords controlled z
suffixes
flag s:
    [^S] > S    # does not end in "s": append "s" and include in compound check ("Recht" > "Rechts-")
---

ts_debug yields:

tstest=# SELECT tsearch2.ts_debug('vertragstrafe strafevertrag vertragsstrafe');
                                      ts_debug
-------------------------------------------------------------------------------------
 (german,lword,"Latin word",vertragstrafe,"{ispell_de,simple}","'strafe' 'vertrag'")
 (german,lword,"Latin word",strafevertrag,"{ispell_de,simple}","'strafe' 'vertrag'")
 (german,lword,"Latin word",vertragsstrafe,"{ispell_de,simple}",'vertragsstrafe')
(3 Zeilen)

I would say the ispell compound support does not honor the s flag in compounds. Could it be that this feature got lost in a regression? It must have worked for Norwegian once. (Take the "overtrekksgrilldresser" example from the tsearch2:compounds tutorial, which I cannot reproduce.)

Any hints?

Alexander
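
To make the expected result concrete, here is a minimal sketch in Python of the splitting I would expect for the two-word dictionary above. It is not tsearch2's actual algorithm. The names `STEMS` and `split_compound` and the stem-to-flag layout are hypothetical and chosen only for illustration. A stem flagged for s-insertion may be followed by a linking "s", but only when another dictionary word comes after it.

```python
from typing import Dict, List, Optional

# Hypothetical toy dictionary mirroring "vertrag/zs" and "strafe/z":
# stem -> True if the stem may take a linking "s" inside a compound.
STEMS: Dict[str, bool] = {"vertrag": True, "strafe": False}


def split_compound(word: str, stems: Dict[str, bool]) -> Optional[List[str]]:
    """Split word into known stems, or return None if no split exists.

    A linking "s" is accepted only after a flagged stem and only when
    at least one more stem follows it.
    """
    word = word.lower()
    if not word:
        return []  # nothing left to split: the recursion ends successfully
    for stem, takes_s in stems.items():
        if not word.startswith(stem):
            continue
        rest = word[len(stem):]
        # Case 1: plain concatenation, e.g. "vertrag" + "strafe".
        tail = split_compound(rest, stems)
        if tail is not None:
            return [stem] + tail
        # Case 2: linking "s", e.g. "vertrag" + "s" + "strafe".
        if takes_s and rest.startswith("s") and len(rest) > 1:
            tail = split_compound(rest[1:], stems)
            if tail:  # require a non-empty remainder after the "s"
                return [stem] + tail
    return None


for w in ("vertragstrafe", "strafevertrag", "vertragsstrafe"):
    print(w, split_compound(w, STEMS))
```

With this sketch all three test words split into "vertrag" and "strafe" (in word order, whereas ts_debug lists lexemes sorted). For "vertragsstrafe" that is the result I would have expected from the s flag.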