Re: to_tsvector in 8.2.3
From | Teodor Sigaev |
---|---|
Subject | Re: to_tsvector in 8.2.3 |
Date | |
Msg-id | 46014E9B.1080301@sigaev.ru |
In reply to | Re: to_tsvector in 8.2.3 (Thomas Pundt <mlists@rp-online.de>) |
List | pgsql-general |
8.2 has a fully rewritten text parser based on POSIX is* functions.

Thomas Pundt wrote:
> On Wednesday 21 March 2007 14:25, Teodor Sigaev wrote:
> | I can't reproduce your problem, but I have not Windows box, can anybody
> | reproduce that?
>
> just a guess in the wild; I once had a similar phenomen and tracked it down
> to a "non breaking space character" (0xA0). Since then I'm patching the
> tsearch2 lexer:
>
> --- postgresql-8.1.5/contrib/tsearch2/wordparser/parser.l
> +++ postgresql-8.1.4/contrib/tsearch2/wordparser/parser.l
> @@ -78,8 +78,8 @@
>  /* cyrillic koi8 char */
>  CYRALNUM       [0-9\200-\377]
>  CYRALPHA       [\200-\377]
> -ALPHA          [a-zA-Z\200-\377]
> -ALNUM          [0-9a-zA-Z\200-\377]
> +ALPHA          [a-zA-Z\200-\237\241-\377]
> +ALNUM          [0-9a-zA-Z\200-\237\241-\377]
>
>
>  HOSTNAME       ([-_[:alnum:]]+\.)+[[:alpha:]]+
> @@ -307,7 +307,7 @@
>         return UWORD;
>  }
>
> -[ \r\n\t]+     {
> +[ \240\r\n\t]+ {
>         token = tsearch2_yytext;
>         tokenlen = tsearch2_yyleng;
>         return SPACE;
>
>
> Ciao,
> Thomas
>

-- 
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                   WWW: http://www.sigaev.ru/
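
For readers following the thread, the effect of the quoted patch on byte 0xA0 (octal \240) can be sketched in plain C. This is only an illustration, not code from tsearch2: the function names below are hypothetical, the range checks are hand-written equivalents of the lexer character classes in the diff, and ASCII-compatible letter ordering is assumed. The last two helpers show the locale-driven style of check that a parser built on the POSIX is* functions would rely on, where the answer depends on the active LC_CTYPE setting rather than on hard-coded ranges.

```c
#include <ctype.h>

/* Hypothetical helpers mirroring the lexer classes quoted in the diff.
 * Octal \200 = 0x80, \237 = 0x9F, \240 = 0xA0, \241 = 0xA1, \377 = 0xFF. */

/* Old class: ALPHA [a-zA-Z\200-\377]. Every high byte counts as a letter,
 * so the non-breaking space 0xA0 is glued into the surrounding word. */
int old_alpha(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c >= 0x80;
}

/* Patched class: ALPHA [a-zA-Z\200-\237\241-\377]. Same as above,
 * except that 0xA0 is excluded. */
int patched_alpha(unsigned char c)
{
    return old_alpha(c) && c != 0xA0;
}

/* Old whitespace rule: [ \r\n\t] */
int old_space(unsigned char c)
{
    return c == ' ' || c == '\r' || c == '\n' || c == '\t';
}

/* Patched whitespace rule: [ \240\r\n\t]. 0xA0 now separates tokens. */
int patched_space(unsigned char c)
{
    return old_space(c) || c == 0xA0;
}

/* Locale-driven checks: the result for high bytes depends on LC_CTYPE.
 * The argument is unsigned char so that values above 0x7F are passed
 * to the is* functions safely. */
int posix_alpha(unsigned char c)
{
    return isalpha(c) != 0;
}

int posix_space(unsigned char c)
{
    return isspace(c) != 0;
}
```

With these definitions, `old_alpha(0xA0)` is true while `patched_alpha(0xA0)` is false and `patched_space(0xA0)` is true. In the default "C" locale, `posix_alpha(0xA0)` and `posix_space(0xA0)` are both false; under other single-byte locales the classification of 0xA0 may differ, which is one reason results can vary between platforms.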