Re: full text search, utf8
From | alexander lunyov |
---|---|
Subject | Re: full text search, utf8 |
Date | |
Msg-id | 4A26650B.400@zato.ru |
In response to | full text search, utf8 (alexander lunyov <lan@zato.ru>) |
Responses | Re: full text search, utf8 |
List | pgsql-ru-general |
I can answer in English if you like.

This error also happens when I try to CREATE TEXT SEARCH DICTIONARY:

    ports=# CREATE TEXT SEARCH DICTIONARY ruispell (
    ports(#     TEMPLATE = ispell,
    ports(#     DictFile = russian,
    ports(#     AffFile = russian,
    ports(#     StopWords = russian
    ports(# );
    ERROR:  invalid byte sequence for encoding "UTF8": 0xd1
    HINT:  This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".
    ports=#

All data in the table was populated by a Perl script that reads a text file in UTF8 and issues INSERTs, and I think that if there had been an illegal character, the error would have appeared at INSERT time.

Andrew Boag wrote:
> sorry for English response (I don't have Russian keyboard here)
>
> 0xd1 may be an illegal UTF8 character that was mistakenly allowed into
> the table. Not all libraries (or all versions of postgres) prevent
> illegal UTF8 characters from getting into DB.
>
> We saw similar issues with a 7.4 -> 8.1 postgres data migration.
>
> However, I don't fully understand your select query so there may be
> another cause.
>
> alexander lunyov wrote:
>> Hello.
>>
>> I have FreeBSD 6.2, postgresql-8.3.1
>>
>> In env:
>>
>> % env | grep UTF
>> LANG=ru_RU.UTF-8
>> MM_CHARSET=UTF-8
>>
>> % psql ports -U pgsql
>> Welcome to psql 8.3.1, the PostgreSQL interactive terminal.
>>
>> Type:  \copyright for distribution terms
>>        \h for help with SQL commands
>>        \? for help with psql commands
>>        \g or terminate with semicolon to execute query
>>        \q to quit
>>
>> ports=# \encoding
>> UTF8
>> ports=# \l
>>       List of databases
>>    Name    | Owner | Encoding
>> -----------+-------+----------
>>  ports     | pgsql | UTF8
>>  postgres  | pgsql | UTF8
>>  template0 | pgsql | UTF8
>>  template1 | pgsql | UTF8
>> (4 rows)
>>
>> I try to search the table, and here is the result:
>>
>> ports=# select name from abonents where to_tsvector(name) @@
>> to_tsquery('s');
>> ERROR:  invalid byte sequence for encoding "UTF8": 0xd1
>> HINT:  This error can also happen if the byte sequence does not
>> match the encoding expected by the server, which is controlled by
>> "client_encoding".
>>
>> However, with the english configuration it works fine.
>>
>> # select count(name) from abonents where to_tsvector('english',name)
>> @@ to_tsquery('some');
>>  count
>> -------
>>      6
>> (1 row)
>>
>> Why?
>>
>
>

--
Best regards,
Alexander Lunyov
OAO RTK
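
One way to narrow this down is to check the raw bytes of the files involved. That could be the text file fed to the Perl loader, or the ispell files that `DictFile`, `AffFile` and `StopWords` point to. Note that 0xd1 on its own is a legal UTF-8 lead byte (Cyrillic "р" is the two bytes 0xd1 0x80). It becomes an invalid sequence only when no continuation byte follows it, which is what you would typically see when text in a single-byte Cyrillic encoding is read as UTF-8. The following is a minimal Python sketch, not a diagnosis of this particular server. The function names are hypothetical and the file path is whatever your installation uses.

```python
from typing import Optional, Tuple


def first_invalid_utf8(data: bytes) -> Optional[Tuple[int, int]]:
    """Return (offset, byte value) of the first byte that breaks UTF-8
    decoding, or None if the whole buffer is valid UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # exc.start is the offset of the first offending byte
        return exc.start, data[exc.start]
    return None


def check_file(path: str) -> Optional[Tuple[int, int]]:
    """Read a file as raw bytes and report its first invalid UTF-8 byte."""
    with open(path, "rb") as handle:
        return first_invalid_utf8(handle.read())


# 0xd1 followed by a continuation byte is valid (this is Cyrillic "р"):
#   first_invalid_utf8(b"\xd1\x80")   -> None
# 0xd1 followed by a plain ASCII byte is not:
#   first_invalid_utf8(b"ab\xd1 cd")  -> (2, 209), i.e. offset 2, byte 0xd1
```

Calling `check_file` on each input file in turn (path supplied by you) shows whether any of them contains bytes that are not valid UTF-8 and where the first such byte is.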