Re: gsoc, oprrest function for text search
From | Jan Urbański |
---|---|
Subject | Re: gsoc, oprrest function for text search |
Date | |
Msg-id | 488EC64F.20701@students.mimuw.edu.pl |
In response to | Re: gsoc, oprrest function for text search ("Heikki Linnakangas" <heikki@enterprisedb.com>) |
List | pgsql-hackers |
Heikki Linnakangas wrote:
> Jan Urbański wrote:
>> Here's a WIP patch implementing an oprrest function for tsvector @@
>> tsquery and tsquery @@ tsvector.
>>
>> The idea is (quoting a comment)
>> /*
>>  * Traverse the tsquery preorder, calculating selectivity as:
>>  *
>>  *   selec(left_oper) * selec(right_oper) in AND nodes,
>>  *
>>  *   selec(left_oper) + selec(right_oper) -
>>  *      selec(left_oper) * selec(right_oper) in OR nodes,
>>  *
>>  *   1 - selec(oper) in NOT nodes
>>  *
>>  *   freq[val] in VAL nodes, if the value is in MCELEM
>>  *   min(freq[MCELEM]) / 2 in VAL nodes, if it is not
>
> Seems reasonable.
>
>>  *
>>  * Implementation-wise, we sort the MCELEM array to use binary
>>  * search on it.
>>  */
>
> Would it be possible to store the array in sorted order, to avoid
> sorting it on every invocation of tssel?

It's being stored sorted on frequencies, like so:
[('dog', 0.9), ('cat', 0.8), ('sheep', 0.7)]
and I need it sorted on elements for bsearch().

I don't know if it's OK to break the rule that statistical data is
stored sorted on frequencies. If it is, then ts_typanalyze() would have
to change and do one more qsort() before storing the result.

--
Jan Urbanski
GPG key ID: E583D7D2

ouden estin
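To make the quoted comment concrete, here is a minimal standalone sketch in C of the selectivity rules and the sort-then-bsearch() step. It is not the patch itself: McElem, QNode, prepare_mcelem() and selec() are hypothetical names invented for this illustration, and the real code would work on the tsquery representation and the stored statistics rather than on these toy structs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for one most-common-element (MCELEM) entry. */
typedef struct {
    const char *elem;
    double      freq;
} McElem;

/* Hypothetical, simplified query tree; not the real tsquery layout. */
typedef enum { NODE_VAL, NODE_AND, NODE_OR, NODE_NOT } NodeKind;

typedef struct QNode {
    NodeKind            kind;
    const char         *val;    /* used by NODE_VAL only */
    const struct QNode *left;   /* NOT operand, or left side of AND/OR */
    const struct QNode *right;  /* right side of AND/OR */
} QNode;

/* Order entries by element text, as bsearch() requires. */
static int cmp_by_elem(const void *a, const void *b)
{
    return strcmp(((const McElem *) a)->elem, ((const McElem *) b)->elem);
}

/*
 * Re-sort a frequency-ordered array by element and return the minimum
 * frequency, which is needed for values that are not in the array.
 */
double prepare_mcelem(McElem *mc, size_t n)
{
    double minfreq;
    size_t i;

    assert(n > 0);
    minfreq = mc[0].freq;
    for (i = 1; i < n; i++)
        if (mc[i].freq < minfreq)
            minfreq = mc[i].freq;
    qsort(mc, n, sizeof(McElem), cmp_by_elem);
    return minfreq;
}

/* Recursive selectivity estimate following the rules in the comment. */
double selec(const QNode *q, const McElem *mc, size_t n, double minfreq)
{
    double        l, r;
    McElem        key;
    const McElem *hit;

    switch (q->kind) {
    case NODE_VAL:
        key.elem = q->val;
        key.freq = 0.0;
        hit = bsearch(&key, mc, n, sizeof(McElem), cmp_by_elem);
        return hit ? hit->freq : minfreq / 2.0;
    case NODE_NOT:
        return 1.0 - selec(q->left, mc, n, minfreq);
    case NODE_AND:
        l = selec(q->left, mc, n, minfreq);
        r = selec(q->right, mc, n, minfreq);
        return l * r;
    case NODE_OR:
        l = selec(q->left, mc, n, minfreq);
        r = selec(q->right, mc, n, minfreq);
        return l + r - l * r;
    }
    return 0.0;                 /* not reached for valid node kinds */
}
```

In this sketch prepare_mcelem() is the extra sort being discussed: it has to run before every estimate if the array is stored sorted on frequencies, and it could be skipped if ts_typanalyze() stored the array sorted on elements, with the minimum frequency then found by a linear scan or kept separately.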