Re: Google Summer of Code 2008
From | Oleg Bartunov |
---|---|
Subject | Re: Google Summer of Code 2008 |
Date | |
Msg-id | Pine.LNX.4.64.0803090532050.10010@sn.sai.msu.ru |
In reply to | Re: Google Summer of Code 2008 (Jan Urbański <j.urbanski@students.mimuw.edu.pl>) |
Responses | Text search selectivity improvements (was Re: Google Summer of Code 2008) |
List | pgsql-hackers |
On Sat, 8 Mar 2008, Jan Urbański wrote:

>> Unfortunately, selectivity estimation for query is much difficult than just
>> estimate frequency of individual word.
>
> Sure, given something like 'cats & dogs'::tsquery the frequency of 'cat' and
> 'dog' won't suffice. But at least it's a starting point and if we estimate
> that 80% of the documents have 'dog' and 70% have 'cat' then we can tell for
> sure that at least 50% have both and that's a lot better than 0.1% that's
> being returned now.

Certainly yes, and given that the most popular queries are single-word
queries, this would be very helpful in most cases.

The reason I thought about improving ts_stat() is that we could use its
statistics for the incomplete-search feature people have requested, where
an AND query like ( a & b & c ) is rewritten into a set of AND|OR queries
depending on how often the terms occur.

Regards,
    Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
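
The "at least 50% have both" figure in the quoted text follows from a simple inclusion-exclusion bound: if 80% of documents contain one term and 70% contain another, at most 20% + 30% = 50% can be missing one of them. A minimal sketch of that arithmetic is below. The function names are hypothetical and are not part of PostgreSQL, and the per-term frequencies are assumed to be already available as fractions of the document count (for example, derived from ts_stat() output).

```python
def and_selectivity_bounds(freqs):
    """Bounds on the fraction of documents that contain ALL terms.

    freqs: per-term document frequencies, each in [0, 1].
    Returns (lower, upper). Hypothetical helper, not a PostgreSQL API.
    """
    # Each term is missing from (1 - f) of the documents, so at most
    # sum(1 - f) documents can lack at least one term.
    lower = max(0.0, sum(freqs) - (len(freqs) - 1))
    # A document matching all terms must match the rarest one.
    upper = min(freqs)
    return lower, upper


def and_selectivity_independent(freqs):
    """Point estimate under the (strong) assumption that terms are independent."""
    estimate = 1.0
    for f in freqs:
        estimate *= f
    return estimate


if __name__ == "__main__":
    # The example from the message: 'dog' in 80% of documents, 'cat' in 70%.
    low, high = and_selectivity_bounds([0.8, 0.7])
    print("bounds:", low, high)
    print("independence estimate:", and_selectivity_independent([0.8, 0.7]))
```

The bounds hold whatever the correlation between terms. The independence estimate always falls inside them, but it is only a guess, since words in real documents are rarely independent.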