Re: Bunch of tsearch fixes and cleanup
From | Oleg Bartunov |
---|---|
Subject | Re: Bunch of tsearch fixes and cleanup |
Date | |
Msg-id | Pine.LNX.4.64.0708232152540.2727@sn.sai.msu.ru |
In reply to | Re: Bunch of tsearch fixes and cleanup (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-patches |
On Thu, 23 Aug 2007, Tom Lane wrote:

> "Heikki Linnakangas" <heikki@enterprisedb.com> writes:
>> - readstopwords calls recode_and_lowerstr directly, instead of using the
>> "wordop" function pointer in StopList struct. All callers used
>> recode_and_lowerstr anyway, so this simplifies the code a little bit. Is
>> there any external dictionary implementations that would require
>> different behavior?
>
> I don't think eliminating wordop altogether is such a hot idea; some
> dictionary could possibly want to do different processing than that.
>
> Something that was annoying me yesterday was that it was not clear
> whether we had fixed every single place that uses a tsearch config file
> to assume that the file is in UTF8 and should be converted to database
> encoding. So I was thinking of hardwiring the "recode" part into
> readstopwords, and using wordop just for the "lowercase" part, which
> seemed to me like a saner division of labor. That is, UTF8 is a policy
> that we want to enforce globally, but lowercasing maybe not, and this
> still leaves the door open for more processing besides lowercasing.
>
> Oleg, Teodor, what do you think about this?
>

I agree with UTF-8 recoding, but please don't lowercase. Dictionaries are very different.

> regards, tom lane
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
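
For readers following the thread, the division of labor discussed above (recoding always applied by the reader, any further processing left to an optional per-dictionary hook) might look roughly like the sketch below. This is only an illustration under stated assumptions, not the actual tsearch code: SketchStopList, sketch_wordop, sketch_recode, sketch_lowercase and sketch_load_stopwords are hypothetical names, the words come from an in-memory array rather than a config file, and the "recode" step is a placeholder that merely copies the string.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-dictionary hook: receives an already recoded word,
 * may modify it in place, and returns it. */
typedef char *(*sketch_wordop)(char *word);

/* Hypothetical stand-in for a stop-word list; not the real StopList. */
typedef struct
{
    int            len;
    char         **stop;
    sketch_wordop  wordop;   /* NULL means: keep words exactly as recoded */
} SketchStopList;

/* Placeholder for "UTF-8 file text -> database encoding".
 * Assumption: it only copies the string; real conversion is out of scope. */
static char *
sketch_recode(const char *utf8_word)
{
    size_t  n = strlen(utf8_word) + 1;
    char   *copy = malloc(n);

    if (copy != NULL)
        memcpy(copy, utf8_word, n);
    return copy;
}

/* Example hook: ASCII-only lowercasing, for dictionaries that want it. */
static char *
sketch_lowercase(char *word)
{
    for (char *p = word; *p != '\0'; p++)
        *p = (char) tolower((unsigned char) *p);
    return word;
}

/* Load n words. Recoding always happens; wordop runs only if the
 * dictionary set one. Returns 0 on success, -1 on allocation failure. */
static int
sketch_load_stopwords(SketchStopList *list, const char **words, int n)
{
    list->len = 0;
    list->stop = calloc((size_t) (n > 0 ? n : 1), sizeof(char *));
    if (list->stop == NULL)
        return -1;

    for (int i = 0; i < n; i++)
    {
        char *w = sketch_recode(words[i]);   /* global policy */

        if (w == NULL)
            return -1;
        if (list->wordop != NULL)
            w = list->wordop(w);             /* per-dictionary policy */
        list->stop[list->len++] = w;
    }
    return 0;
}
```

In this sketch a dictionary that must not lowercase its stop words simply leaves wordop set to NULL, while one that wants lowercasing plugs in sketch_lowercase; either way the recoding step cannot be skipped.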