Re: Bunch of tsearch fixes and cleanup
From | Heikki Linnakangas |
---|---|
Subject | Re: Bunch of tsearch fixes and cleanup |
Date | |
Msg-id | 46CDA03C.2010703@enterprisedb.com |
In response to | Re: Bunch of tsearch fixes and cleanup (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-patches |
Tom Lane wrote:
> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
>> - readstopwords calls recode_and_lowerstr directly, instead of using the
>> "wordop" function pointer in StopList struct. All callers used
>> recode_and_lowerstr anyway, so this simplifies the code a little bit. Is
>> there any external dictionary implementations that would require
>> different behavior?
>
> I don't think eliminating wordop altogether is such a hot idea; some
> dictionary could possibly want to do different processing than that.

Ok.

> Something that was annoying me yesterday was that it was not clear
> whether we had fixed every single place that uses a tsearch config file
> to assume that the file is in UTF8 and should be converted to database
> encoding.

I'm afraid there's still a lot of inconsistencies in that. I'm just looking at dict_synonym, and it looks like it has the same problem I patched in readstopwords; it's using pg_verifymbstr, with database encoding, to verify the input file. It also seems to be calling pg_mblen, which depends on database encoding, against UTF-8 encoded strings. I'll look at those more closely..

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
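
To illustrate the pg_mblen concern above, here is a minimal standalone sketch in Python. It is not PostgreSQL code: `utf8_mblen`, `latin1_mblen`, `split_chars` and `recode_line` are hypothetical helpers invented for this example, and Latin-1 merely stands in for some non-UTF-8 database encoding. It shows that walking UTF-8 bytes with a length function meant for another encoding finds the wrong character boundaries, and that verifying the file as UTF-8 before converting it avoids the problem.

```python
def utf8_mblen(first_byte):
    """Byte length of a UTF-8 sequence, judged from its lead byte (hypothetical helper)."""
    if first_byte < 0x80:
        return 1
    if first_byte >> 5 == 0b110:
        return 2
    if first_byte >> 4 == 0b1110:
        return 3
    if first_byte >> 3 == 0b11110:
        return 4
    raise ValueError("invalid UTF-8 lead byte")


def latin1_mblen(first_byte):
    """Every Latin-1 character is one byte; stands in for a database-encoding length function."""
    return 1


def split_chars(data, mblen):
    """Split a byte string into characters using the given length function."""
    chars = []
    i = 0
    while i < len(data):
        n = mblen(data[i])
        chars.append(data[i:i + n])
        i += n
    return chars


def recode_line(raw, db_encoding):
    """Verify the line as UTF-8 first (decode raises on bad input), then convert it."""
    text = raw.decode("utf-8")
    return text.encode(db_encoding)


line = "naïve".encode("utf-8")            # the file content is UTF-8
right = split_chars(line, utf8_mblen)     # boundaries found with the file's encoding
wrong = split_chars(line, latin1_mblen)   # boundaries found with the "database" encoding
converted = recode_line(line, "latin-1")  # verify as UTF-8, then convert
```

With these inputs, `right` holds five characters while `wrong` holds six, because the two-byte sequence for "ï" is split in half. Any per-character processing therefore has to happen either on the raw file with UTF-8 rules, or after conversion with database-encoding rules, but not as a mixture of the two.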