Re: How to switch off Snowball stemmer for tsearch2?
From | Oleg Bartunov |
---|---|
Subject | Re: How to switch off Snowball stemmer for tsearch2? |
Date | |
Msg-id | Pine.LNX.4.64.0708231556590.2727@sn.sai.msu.ru |
In response to | Re: How to switch off Snowball stemmer for tsearch2? ("Dmitry Koterov" <dmitry@koterov.ru>) |
Responses | Re: How to switch off Snowball stemmer for tsearch2? |
List | pgsql-general |
On Thu, 23 Aug 2007, Dmitry Koterov wrote:

>>
>>> Now
>>>
>>> select lexize('ru_ispell_cp1251', 'Дмитриев') -> "Дмитрий"
>>> select lexize('ru_ispell_cp1251', 'Иванов') -> "Иван"
>>> - it is completely wrong!
>>>
>>> I have a database with all Russian names, is it possible to use it (how?)
>> to
>>
>> if you have such a database, why don't you just write a special dictionary and
>> put it in front?
>
>
> Of course because this is a database of Russian NAMES, but NOT a database of
> surnames.
>
>
>> make lexize() not to convert "Ivanov" to "Ivan" even if the ispell
>>> dictionary contains an element for "Ivan"? So, this pseudo-code logic is
>>> needed:
>>>
>>> function new_lexize($string) {
>>>   $stem = lexize('ru_ispell_cp1251', $string);
>>>   if ($stem in names_database) return $string; else return $stem;
>>> }
>>>
>>> Maybe tsearch2 implements this logic already?

Write your own dictionary, which implements any logic you need. In your case
it's just a wrapper around ispell, which will return the original string, not
the stem. See the example
http://www.sai.msu.su/~megera/postgres/fts/doc/fts-intdict-xmp.html
and the Russian article
http://www.sai.msu.su/~megera/postgres/talks/fts_pgsql_intro.html#ftsdict

>>
>> sure, it's how text search mapping works.
>
>
> Could you please elaborate?

You create a dictionary surnames_dict and configure pg_ts_cfgmap to process
tokens of type nlword by surnames_dict, ru_ispell, ru_stem, for example.

>
> Of course I can create all word-forms of all Russian names using ispell and
> then - subtract this full list from Ispell dictionary (so I will remove
> "Ivan", "Ivanami" etc. from it). But possibly tsearch2 has this subtraction
> algorithm already.
>

Don't do that! Just go the plain way.
Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
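
The dictionary chain suggested in this message (surnames_dict, then ru_ispell, then ru_stem) can be illustrated outside PostgreSQL. The following is a minimal Python simulation, not tsearch2 code: the function names, the sample word lists, and the lowercase-only "stemmer" are all assumptions made for illustration. It only models the rule that a token is offered to each dictionary in order and that the first dictionary that recognizes it decides the result.

```python
from typing import Callable, List, Optional

# A dictionary maps a token to a list of lexemes, or returns None when it
# does not recognize the token (the token then goes to the next dictionary).
Dictionary = Callable[[str], Optional[List[str]]]

# Hypothetical sample data, for illustration only.
NAMES = {"иван", "дмитрий"}
ISPELL_STEMS = {"иванов": "иван", "дмитриев": "дмитрий", "столы": "стол"}


def ispell_like(token: str) -> Optional[List[str]]:
    """Stand-in for ru_ispell: normalize the word forms it knows."""
    stem = ISPELL_STEMS.get(token.lower())
    return [stem] if stem is not None else None


def surnames_dict(token: str) -> Optional[List[str]]:
    """Wrapper around the ispell stand-in, following the quoted pseudo-code:
    if the stem is a known first name, keep the original word unstemmed."""
    stems = ispell_like(token)
    if stems is not None and any(stem in NAMES for stem in stems):
        return [token.lower()]
    return None  # decline, so the next dictionary in the chain is tried


def stem_like(token: str) -> Optional[List[str]]:
    """Stand-in for ru_stem: accepts every token (here it only lowercases)."""
    return [token.lower()]


def lexize_chain(token: str, chain: List[Dictionary]) -> List[str]:
    """Offer the token to each dictionary in order; the first answer wins."""
    for dictionary in chain:
        lexemes = dictionary(token)
        if lexemes is not None:
            return lexemes
    return []


CHAIN = [surnames_dict, ispell_like, stem_like]

print(lexize_chain("Иванов", CHAIN))                    # ['иванов']
print(lexize_chain("Иванов", [ispell_like, stem_like]))  # ['иван']
print(lexize_chain("столы", CHAIN))                     # ['стол']
```

Because the wrapper declines every token whose stem is not a first name, ordinary words still fall through to the ispell and stemmer stand-ins unchanged, so nothing has to be removed from the ispell dictionary itself.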