Re: TSearch2 / Get all unique lexems

From	Hannes Dorbath
Subject	Re: TSearch2 / Get all unique lexems
Date	8 December 2005, 07:49:18
Msg-id	4397F3D4.8080004@theendofthetunnel.de
In reply to	Re: TSearch2 / Get all unique lexems (Oleg Bartunov <oleg@sai.msu.su>)
Responses	Re: TSearch2 / Get all unique lexems (Teodor Sigaev <teodor@sigaev.ru>); Re: TSearch2 / Get all unique lexems (Oleg Bartunov <oleg@sai.msu.su>)
List	pgsql-general

On 07.12.2005 16:13, Oleg Bartunov wrote:
> hmm, you could dump tsvector column and use awk+sort+uniq

Thanks. I was hoping for something that works inside a PL/pgSQL procedure.
I'm trying to integrate pg_trgm with Tsearch2. I'm still on my UTF-8
database. Yes, I know there is _NO_ UTF-8 support of any kind in
Tsearch2 yet, but I got it working to a degree that is OK for my
application (I created my own stemmer variant, ispell dictionary, affix
file, etc.). The last missing bit is a source of words for pg_trgm. I
cannot use the stat() function, because it breaks as soon as it sees a
UTF-8 character.

I thought of using lexize(), casting the text array to rows somehow,
writing the result to a temp table, and using SELECT DISTINCT, but I
haven't had any success yet.
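
For comparison, the dump-and-deduplicate route quoted above (awk+sort+uniq)
could be sketched in Python roughly as follows. This is only an illustration
under assumptions: the function name unique_lexemes is made up here, the
input is assumed to be one tsvector text value per row, and the quoting is
assumed to follow the current PostgreSQL output format, where a quote inside
a lexeme is doubled. The output of the 2005 contrib module may differ, and
backslash escapes are not handled.

```python
import re

# Matches one quoted lexeme in tsvector text output, e.g. 'cat':3 or 'fat':2,4.
# Assumption: a quote inside a lexeme appears doubled (''), which the pattern allows.
LEXEME_RE = re.compile(r"'((?:[^']|'')*)'")


def unique_lexemes(tsvector_rows):
    """Return the sorted list of distinct lexemes found in the dumped rows."""
    seen = set()
    for row in tsvector_rows:
        for match in LEXEME_RE.finditer(row):
            # Undo the quote doubling before storing the lexeme.
            seen.add(match.group(1).replace("''", "'"))
    return sorted(seen)


if __name__ == "__main__":
    # Hypothetical sample rows standing in for a dumped tsvector column.
    rows = [
        "'cat':3 'fat':2,4 'rat':5",
        "'fat':1 'größe':2",
    ]
    for lexeme in unique_lexemes(rows):
        print(lexeme)
```

Because Python strings are Unicode, multibyte characters pass through
untouched as long as the dump is decoded as UTF-8 when it is read.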


--
Regards,
Hannes Dorbath
