questions about tsearch2 (for czech language)
From | Pavel Stehule |
---|---|
Subject | questions about tsearch2 (for czech language) |
Date | |
Msg-id | Pine.LNX.4.44.0312221128350.27697-100000@kix.fsv.cvut.cz |
Responses | Re: questions about tsearch2 (for czech language) |
List | pgsql-general |
Hello,

I am trying tsearch2 in a Czech environment. It works fine, but I have two questions.

1. I have the words "se" and "ve" in my Czech stop words, but I still get these words in the result. Why? Is there a problem with my configuration?

tsearch2=# select * from ts_debug('jmenuji se Pavel Stěhule a bydlím ve Skalici.');
    ts_name    | tok_type | description |  token  |  dict_name  | tsvector
---------------+----------+-------------+---------+-------------+-----------
 default_czech | lword    | Latin word  | jmenuji | {cz_ispell} | 'jmenuji'
 default_czech | lword    | Latin word  | se      | {cz_ispell} | 'se'
 default_czech | lword    | Latin word  | Pavel   | {cz_ispell} | 'pavel'
 default_czech | word     | Word        | Stěhule | {cz_ispell} |
 default_czech | lword    | Latin word  | a       | {cz_ispell} |
 default_czech | word     | Word        | bydlím  | {cz_ispell} | 'bydlet'
 default_czech | lword    | Latin word  | ve      | {cz_ispell} | 've'
 default_czech | lword    | Latin word  | Skalici | {cz_ispell} | 'skalici'
(8 rows)

tsearch2=# select * from pg_ts_cfgmap where ts_name='default_czech';
    ts_name    |  tok_alias   |  dict_name
---------------+--------------+-------------
 default_czech | email        | {simple}
 default_czech | file         | {simple}
 default_czech | float        | {simple}
 default_czech | host         | {simple}
 default_czech | hword        | {cz_ispell}
 default_czech | int          | {simple}
 default_czech | lhword       | {cz_ispell}
 default_czech | lpart_hword  | {cz_ispell}
 default_czech | lword        | {cz_ispell}
 default_czech | nlhword      | {cz_ispell}
 default_czech | nlpart_hword | {cz_ispell}
 default_czech | nlword       | {cz_ispell}
 default_czech | part_hword   | {simple}
 default_czech | sfloat       | {simple}
 default_czech | uint         | {simple}
 default_czech | uri          | {simple}
 default_czech | url          | {simple}
 default_czech | version      | {simple}
 default_czech | word         | {cz_ispell}
(19 rows)

2. I use a small Czech dictionary. I need words that are not in the dictionary (Stěhule in my sample) to be kept rather than dropped. Can I set this somewhere?
I tried adding the simple dictionary to the cfg map, but without success:

tsearch2=# select * from ts_debug('jmenuji se Pavel Stěhule a bydlím ve Skalici.');
    ts_name    | tok_type | description |  token  |     dict_name      | tsvector
---------------+----------+-------------+---------+--------------------+-----------
 default_czech | word     | Word        | Stěhule | {cz_ispell,simple} |
 default_czech | lword    | Latin word  | a       | {cz_ispell,simple} |
 default_czech | word     | Word        | bydlím  | {cz_ispell,simple} | 'bydlet'

Thank you
Pavel Stehule
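
For reference, a change like the one described above (appending simple after cz_ispell) could be made roughly as follows. This is only a sketch, assuming the stock tsearch2 catalog tables pg_ts_cfgmap and pg_ts_dict; it is not necessarily the statement that was actually run. The second query only displays the init options of cz_ispell, which is where a tsearch2 ispell dictionary is given its DictFile, AffFile and StopFile settings, so it may help to check whether a stop-word file is configured for that dictionary at all.

```sql
-- Sketch only: assumes the standard tsearch2 catalogs are installed
-- in the current database (pg_ts_cfgmap, pg_ts_dict).

-- Append "simple" as a second dictionary for every token type that
-- currently maps only to cz_ispell in the default_czech configuration.
UPDATE pg_ts_cfgmap
   SET dict_name = '{cz_ispell,simple}'
 WHERE ts_name = 'default_czech'
   AND dict_name = '{cz_ispell}';

-- Show how cz_ispell is initialised; the dict_initoption string
-- carries the DictFile / AffFile / StopFile options for ispell dictionaries.
SELECT dict_name, dict_initoption
  FROM pg_ts_dict
 WHERE dict_name = 'cz_ispell';
```

After a change of this kind, rerunning the ts_debug query shows which dictionary list is applied to each token.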