Re: [GENERAL] Incorrect FTS result with GIN index
From | Oleg Bartunov |
---|---|
Subject | Re: [GENERAL] Incorrect FTS result with GIN index |
Date | |
Msg-id | Pine.LNX.4.64.1007291459270.32129@sn.sai.msu.ru |
In reply to | Re: [GENERAL] Incorrect FTS result with GIN index (Tom Lane <tgl@sss.pgh.pa.us>) |
Replies | Re: [GENERAL] Incorrect FTS result with GIN index |
List | pgsql-hackers |
Tom,

we're not able to work on this right now, so go ahead if you have time.
I also wonder why did I get "right" result :) Just repeated the query:

test=# select count(*) from search_tab where (to_tsvector('german', keywords ) @@ to_tsquery('german', 'ee:* & dd:*'));
 count
-------
   123
(1 row)

Time: 26.185 ms

Oleg

On Wed, 28 Jul 2010, Tom Lane wrote:

> Oleg Bartunov <oleg@sai.msu.su> writes:
>> you can download dump http://mira.sai.msu.su/~megera/tmp/search_tab.dump
>
> Hmm ... I'm not sure why you're failing to reproduce it, because it's
> falling over pretty easily for me. After poking at it for awhile,
> I am of the opinion that scanGetItem's handling of multiple keys is
> fundamentally broken and needs to be rewritten completely. The
> particular case I'm seeing here is that one key returns this sequence of
> TIDs/lossy flags:
>
> ...
> 1085/4 0
> 1086/65535 1
> 1087/4 0
> ...
>
> while the other one returns this:
>
> ...
> 1083/11 0
> 1086/6 0
> 1086/10 0
> 1087/10 0
> ...
>
> and what comes out of scanGetItem is just
>
> ...
> 1086/6 1
> ...
>
> because after returning that, on the next call it advances both input
> keystreams. So 1086/10 should be visited and is not.
>
> I think that depending on the previous entryRes state to determine what
> to do is basically unworkable, and what should probably be done instead
> is to remember the last-returned TID and advance keystreams with TIDs <=
> that. I haven't quite thought through how that should interact with
> lossy-page TIDs but it seems more robust than what we've got.
>
> I'm also noticing that the ANDing behavior for the "ee:* & dd:*" query
> style seems very much stupider than it needs to be --- it's returning
> lossy pages that very obviously don't need to be examined because the
> other keystream has no match at all on that page. But I haven't had
> time to probe into the reason why.
>
> I'm out of time for today, do you want to work on it?
>
> regards, tom lane
>

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
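
For illustration only, here is a minimal Python sketch of the approach proposed in the quoted message: remember the last-returned TID and advance every keystream whose current TID is <= that. It is not the scanGetItem code. The names `and_streams`, `lower_bound` and `LOSSY` are hypothetical, and the lossy-page handling is an assumption, since the message leaves that interaction open. The sketch assumes a lossy entry is written as offset 65535 (as in the `1086/65535 1` line of the trace) and matches any offset on its page with a recheck. The input is only the visible fragment of the two keystreams; the elided parts ("...") are not reconstructed.

```python
LOSSY = 0xFFFF  # assumed marker: "whole page matches, recheck needed"

def lower_bound(tid):
    """Smallest TID this entry could match; a lossy entry covers its whole page."""
    block, off = tid
    return (block, 0) if off == LOSSY else tid

def and_streams(streams):
    """AND several sorted keystreams of (block, offset) TIDs.

    Returns a list of ((block, offset), recheck) pairs.
    """
    iters = [iter(s) for s in streams]
    cur = [next(it, None) for it in iters]
    results = []
    while all(t is not None for t in cur):
        # The next possible match cannot be earlier than this TID.
        target = max(lower_bound(t) for t in cur)
        moved = False
        for i, it in enumerate(iters):
            # Skip entries that lie entirely before the target.
            while cur[i] is not None and cur[i] < target:
                cur[i] = next(it, None)
                moved = True
        if moved:
            continue  # re-evaluate the target with the new positions
        lossy = [t[1] == LOSSY for t in cur]
        # If every stream is lossy here, report the page itself as lossy.
        last = (target[0], LOSSY) if all(lossy) else target
        results.append((last, any(lossy)))
        # Advance only the streams whose current TID is <= the returned TID,
        # so a lossy page entry stays in place for later offsets on that page.
        for i, it in enumerate(iters):
            while cur[i] is not None and cur[i] <= last:
                cur[i] = next(it, None)
    return results

# Visible fragment of the two keystreams from the trace above.
key_a = [(1085, 4), (1086, LOSSY), (1087, 4)]
key_b = [(1083, 11), (1086, 6), (1086, 10), (1087, 10)]
matches = and_streams([key_a, key_b])
```

On this fragment the sketch returns both 1086/6 and 1086/10 with the recheck flag set, because after returning 1086/6 only the second keystream is advanced. A lossy page is also skipped whenever another keystream has no entry on that page, which is the behavior the message says the ANDing ought to have.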