Re: TSearch2 / Get all unique lexems
From | Hannes Dorbath |
---|---|
Subject | Re: TSearch2 / Get all unique lexems |
Date | |
Message-ID | 4397F3D4.8080004@theendofthetunnel.de |
In reply to | Re: TSearch2 / Get all unique lexems (Oleg Bartunov <oleg@sai.msu.su>) |
Responses |
Re: TSearch2 / Get all unique lexems
Re: TSearch2 / Get all unique lexems |
List | pgsql-general |
On 07.12.2005 16:13, Oleg Bartunov wrote:
> hmm, you could dump tsvector column and use awk+sort+uniq

Thanks. I hoped for something possible inside a PL/pgSQL procedure.

I'm trying to integrate pg_trgm with Tsearch2. I'm still on my UTF-8 database. Yes, I know, there is _NO_ UTF-8 support of any kind in Tsearch2 yet, but I got it working to a degree that is OK for my application (I created my own stemmer variant, ispell dictionary, affix file, etc.).

The last missing bit is a word source for pg_trgm, i.e. a list of all unique lexemes. I cannot use the stat() function, because it breaks as soon as it sees a UTF-8 character.

I thought of using lexize(), casting the resulting text array to rows somehow, writing them to a temp table, and then using SELECT DISTINCT, but I haven't had any success yet.

--
Regards,
Hannes Dorbath
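For reference, the dump-and-deduplicate route Oleg suggests could look roughly like the sketch below, written in Python rather than awk so that UTF-8 lexemes are handled as ordinary strings. It is only an illustration under stated assumptions: the dumped column is assumed to use the usual tsvector text form (`'lexeme':positions`, with a doubled quote as an escaped quote inside a lexeme), and the function name `unique_lexemes` is made up for this example. It is not part of Tsearch2 or pg_trgm.

```python
import re

# Assumption: each dumped value looks like  'cat':1,5 'fat':2A 'sat':3
# A quoted lexeme, optionally followed by :positions with weight letters A-D.
# A doubled quote ('') inside the lexeme is treated as an escaped quote.
_LEXEME = re.compile(r"'((?:[^']|'')*)'(?::[0-9A-D,]+)?")


def unique_lexemes(tsvector_rows):
    """Return the sorted unique lexemes from an iterable of tsvector strings.

    Hypothetical helper for illustration only.
    """
    seen = set()
    for row in tsvector_rows:
        for match in _LEXEME.finditer(row):
            # Undo the quote doubling so the stored lexeme is the plain word.
            seen.add(match.group(1).replace("''", "'"))
    return sorted(seen)


if __name__ == "__main__":
    # Sample rows standing in for a dumped tsvector column.
    rows = [
        "'cat':1,5 'fat':2 'sat':3",
        "'cat':2 'über':1 'mat':4A",
    ]
    for lexeme in unique_lexemes(rows):
        print(lexeme)
```

The resulting word list could then be loaded into a plain text column and indexed with pg_trgm. This works outside the database, so it does not answer the wish for a pure PL/pgSQL solution.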