Re: Ts_rank internals
От | Oleg Bartunov |
---|---|
Тема | Re: Ts_rank internals |
Дата | |
Msg-id | Pine.LNX.4.64.0709111118150.2767@sn.sai.msu.ru обсуждение исходный текст |
Ответ на | Re: Ts_rank internals (Teodor Sigaev <teodor@sigaev.ru>) |
Список | pgsql-hackers |
On Tue, 11 Sep 2007, Teodor Sigaev wrote: >> I tried to understand how ts_rank works, but I failed. What does Cover >> function do? How does it work? What is the DocRepresentation data >> structure like? I can see the definition of the struct, and the >> get_docrep function to convert to that format, but by reading those I >> can't figure out what the resulting DocRepresentation looks like. >> I wonder if we could get rid of the istrue flag in QueryOperand, and use >> a local BitmapSet variable instead? It seems wrong to have a temporary >> flag that's only used in one function, in a struct that's used everywhere. > It's a play around CDR algorithms (Cover Density Ranking). > > Based on paper Clarke et al., Relevance Ranking for One to Three Term > Queries. " (http://citeseer.ist.psu.edu/clarke00relevance.html. Sorry, I > lost the article itself, but may be Oleg has it. Simple and short description > is placed at http://www2002.org/CDROM/refereed/643/node7.html. > > We change original algorithm to support weight of lexeme, details are on > Oleg's site: http://www.sai.msu.su/~megera/wiki/NewExtentsBasedRanking Actually, we used two papers http://citeseer.ist.psu.edu/clarke00relevance.html and http://portal.acm.org/ft_gateway.cfm?id=333137&type=pdf&dl=GUIDE&dl=ACM I can send you the latter if you have no access to the ACM. > > Array of DocRepresentation is a representation of document, it contains only > lexemes from both tsvector and tsquery, and lexemes are ordered by position - > as in original doc. Each DocRepresentation has links to corresponding > QueryOperand to optimize query execution while extent search. When we > enlarge current extent for one word then we set istrue flag for corresponding > QueryOperand and execution tsquery from cover becomes very simple task. > > It's possible to eliminate istrue flag, but it's needed to implement > algorithm to execute tsquery over continuos part of document, not over whole > document. > > > > Regards, Oleg _____________________________________________________________ Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
В списке pgsql-hackers по дате отправления: