Re: Tsearch2 and Unicode?
From | Oleg Bartunov |
---|---|
Subject | Re: Tsearch2 and Unicode? |
Date | |
Msg-id | Pine.GSO.4.61.0411171927480.18871@ra.sai.msu.su |
In response to | Tsearch2 and Unicode? (Dawid Kuroczko <qnex42@gmail.com>) |
List | pgsql-general |
Dawid,

unfortunately, tsearch2 doesn't support Unicode yet. If you keep the tsvector separate from the data, then you'll need one more join.

Oleg

On Wed, 17 Nov 2004, Dawid Kuroczko wrote:

> I'm trying to use tsearch2 with database which is in 'UNICODE' encoding.
> It works fine for English text, but as I intend to search Polish texts I did:
>
> insert into pg_ts_cfg values ('default_polish', 'default', 'pl_PL.UTF-8');
> (and I updated other pg_ts_* tables as written in manual).
>
> However, Polish-specific chars are being eaten alive, it seems.
> I.e. doing select to_tsvector('default_polish', body) from messages;
> results in list of words but with national chars stripped...
>
> I wonder, am I doing something wrong, or just tsearch2 doesn't grok
> Unicode, despite the locales setting? This also is a good question
> regarding ispell_dict and its feelings regarding Unicode, but that's
> another story.
>
> Assuming Unicode unsupported means I should perhaps... oh, convert
> the data to iso8859 prior feeding it to_tsvector()... interesting idea,
> but so far I have failed to actually do it. Maybe store the data as
> 'bytea' and add a column with encoding information (assuming I don't
> want to recreate whole database with new encoding, and that I want
> to use unicode for some columns (so I don't have to keep encoding
> with every text everywhere...).
>
> And while we are at it, how do you feel -- an extra column with tsvector
> and its index -- would it be OK to keep it away from my data (so I can
> safely get rid of them if need be)?
> [ I intend to keep index of around 2 000 000 records, few KBs of
> text each ]...
>
> Regards,
> Dawid Kuroczko
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faqs/FAQ.html
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
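
The "one more join" mentioned in the reply can be illustrated with a rough sketch. It assumes the `messages` table from the quoted query has an integer key column named `id`; that column, the side table `messages_fts`, its index name, and the search term are all hypothetical and not taken from the thread. The sketch only shows the layout and does not work around the missing Unicode support.

```sql
-- Hypothetical side table holding only the tsvector, so that it can be
-- dropped without touching the data in messages.
CREATE TABLE messages_fts (
    message_id integer PRIMARY KEY REFERENCES messages (id),  -- assumed key column
    body_idx   tsvector
);

-- Populate it from the existing rows.
INSERT INTO messages_fts (message_id, body_idx)
SELECT id, to_tsvector('default_polish', body)
FROM messages;

-- GiST index on the tsvector column.
CREATE INDEX messages_fts_body_idx ON messages_fts USING gist (body_idx);

-- Searching now needs the extra join back to the data.
SELECT m.*
FROM messages m
JOIN messages_fts f ON f.message_id = m.id
WHERE f.body_idx @@ to_tsquery('default_polish', 'word');
```

Keeping the tsvector in its own table makes it easy to discard or rebuild, at the cost of the join shown in the last query.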