Hello
I am testing full-text search.
1. I am not able to use full-text search with the latin2 encoding :( (I am
missing a note in the docs that only UTF-8 dictionaries are supported.)
2. With hspell dictionaries (a fresh copy from OpenOffice) I get
different, and wrong, results.
Original (old) result:
ts=# select * from ts_debug('Příliš žluťoučký kůň se napil žluté vody');
 ts_name       | tok_type | description | token     | dict_name          | tsvector
---------------+----------+-------------+-----------+--------------------+-------------
 default_czech | word     | Word        | Příliš    | {cz_ispell,simple} | 'příliš'
 default_czech | word     | Word        | žluťoučký | {cz_ispell,simple} | 'žluťoučký'
 default_czech | word     | Word        | kůň       | {cz_ispell,simple} | 'kůň'
 default_czech | lword    | Latin word  | se        | {cz_ispell,simple} |
 default_czech | lword    | Latin word  | napil     | {cz_ispell,simple} | 'napít'
 default_czech | word     | Word        | žluté     | {cz_ispell,simple} | 'žlutý'
 default_czech | lword    | Latin word  | vody      | {cz_ispell,simple} | 'voda'
(7 rows)
New results:
postgres=# create Text search dictionary cspell(template=ispell,
dictfile=czech, afffile=czech, stopwords=czech);
CREATE TEXT SEARCH DICTIONARY
postgres=# CREATE text search configuration cs (copy=english);
CREATE TEXT SEARCH CONFIGURATION
postgres=# alter text search configuration cs alter mapping for word,
lword with cspell, simple;
ALTER TEXT SEARCH CONFIGURATION
postgres=# select * from ts_debug('cs','Příliš žluťoučký kůň se napil žluté vody');
 Alias | Description   | Token     | Dictionaries    | Lexized token
-------+---------------+-----------+-----------------+---------------------
 word  | Word          | Příliš    | {cspell,simple} | cspell: {příliš}
 blank | Space symbols |           | {}              |
 word  | Word          | žluťoučký | {cspell,simple} | cspell: {žluťoučký}
 blank | Space symbols |           | {}              |
 word  | Word          | kůň       | {cspell,simple} | cspell: {kůň}
 blank | Space symbols |           | {}              |
 lword | Latin word    | se        | {cspell,simple} | cspell: {}
 blank | Space symbols |           | {}              |
 lword | Latin word    | napil     | {cspell,simple} | simple: {napil}
 blank | Space symbols |           | {}              |
 word  | Word          | žluté     | {cspell,simple} | simple: {žluté}
 blank | Space symbols |           | {}              |
 lword | Latin word    | vody      | {cspell,simple} | simple: {vody}
(13 rows)
This query returned true in 8.2, but now it returns false:
postgres=# select to_tsvector('cs','Příliš žlutý kůň se napil žluté vody') @@ to_tsquery('cs','napít');
 ?column?
----------
 f
(1 row)
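One possible way to narrow this down is to ask the dictionary directly
with ts_lexize, which bypasses the parser and the configuration mapping.
This is only a sketch and was not run as part of this report; it assumes
the cspell dictionary created above still exists. ts_lexize returns NULL
when the dictionary does not recognize the word and an empty array when
it treats the word as a stop word, so the output should show whether
cspell itself fails on these words.

```sql
-- Query the dictionary on its own, without the parser or the
-- configuration mapping. 'cspell' is the dictionary created above.
SELECT ts_lexize('cspell', 'napil');
SELECT ts_lexize('cspell', 'žluté');
SELECT ts_lexize('cspell', 'vody');

-- Show the encodings in use, in case the difference is related to
-- how the dictionary files are decoded (an assumption, not a finding).
SHOW server_encoding;
SHOW client_encoding;
```

If ts_lexize returns NULL for words that the old cz_ispell dictionary
handled, the difference would seem to lie in how the dictionary or affix
file is loaded rather than in the configuration.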
Regards
Pavel Stehule