Re: BUG #8354: stripped positions can generate nonzero rank in ts_rank_cd
From | Alexander Hill |
---|---|
Subject | Re: BUG #8354: stripped positions can generate nonzero rank in ts_rank_cd |
Date | |
Msg-id | CA+KBOKxsaU7Q-Qc6-YV99AY1U_Rb0SbR78Bs5xM6=12PRMKJKA@mail.gmail.com |
In response to | Re: BUG #8354: stripped positions can generate nonzero rank in ts_rank_cd (Bruce Momjian <bruce@momjian.us>) |
Responses | Re: BUG #8354: stripped positions can generate nonzero rank in ts_rank_cd |
List | pgsql-bugs |
Hi Bruce, all,

I think this can be solved (if it's agreed that it's a bug) in a pretty
straightforward way: when creating the document representation used in
calculating cover density rank, we can just skip lexemes with no position
entirely.

Fix and tests here:
https://github.com/AlexHill/postgres/compare/bug_8354

As a patch file here:
https://github.com/AlexHill/postgres/commit/cd522b254d166d569b86803115f0f499864e949b.patch

Cheers,
Alex

On Sat, Feb 1, 2014 at 5:22 AM, Bruce Momjian <bruce@momjian.us> wrote:
>
> Would someone please comment on this text search bug report? Thanks.
>
> ---------------------------------------------------------------------------
>
> On Fri, Aug 2, 2013 at 07:03:42AM +0000, alex@hill.net.au wrote:
> > The following bug has been logged on the website:
> >
> > Bug reference: 8354
> > Logged by: Alex Hill
> > Email address: alex@hill.net.au
> > PostgreSQL version: 9.2.4
> > Operating system: OS X 10.8.4 Mountain Lion
> > Description:
> >
> > Hi all,
> >
> > The docs for ts_rank_cd state:
> >
> > "This function requires positional information in its input. Therefore it
> > will not work on "stripped" tsvector values -- it will always return zero."
> >
> > However if a tsvector contains some stripped lexemes and some non-stripped,
> > ts_rank_cd will rank extents including the non-stripped values.
> >
> > For example, this evaluates to zero as expected:
> >
> > SELECT ts_rank_cd(strip(to_tsvector('text search')),
> >                   plainto_tsquery('text search'))
> >
> > But this doesn't:
> >
> > SELECT ts_rank_cd(to_tsvector('text') || strip(to_tsvector('search')),
> >                   plainto_tsquery('text search'))
> >
> > I think this is a bug, if not in the code then in the documentation, which
> > isn't clear on what happens when stripped and positioned lexemes are mixed
> > in one tsvector.
> >
> > I would prefer that stripped lexemes were completely ignored by ts_rank_cd:
> > my use case is using this as a fifth pseudo-weight, which matches a @@ query
> > but doesn't add to a ts_rank_cd ranking.
> >
> > What do you think?
> >
> > Cheers,
> > Alex
> >
> > --
> > Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> > To make changes to your subscription:
> > http://www.postgresql.org/mailpref/pgsql-bugs
>
> --
> Bruce Momjian <bruce@momjian.us> http://momjian.us
> EnterpriseDB http://enterprisedb.com
>
> + Everyone has their own god. +
>
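
For readers who want to see the shape of the proposed fix, here is a minimal sketch in C of "skip lexemes with no position" when flattening a tsvector into the document representation that cover density ranking works on. It is not the code from the linked patch and does not use PostgreSQL's internal types: `Lexeme`, `DocEntry`, and `build_doc` are hypothetical, simplified stand-ins invented for illustration, and a stripped lexeme is modeled as one with `npos == 0`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified model of one tsvector entry.
 * A stripped lexeme has no stored positions (npos == 0). */
typedef struct {
    const char *lexeme;
    const int  *positions;  /* may be NULL when npos == 0 */
    size_t      npos;
} Lexeme;

/* One (lexeme, position) pair in the flattened document. */
typedef struct {
    const char *lexeme;
    int         pos;
} DocEntry;

/* Flatten lexemes into out[], one entry per stored position.
 * Lexemes without positions are skipped entirely, so they can
 * never take part in a cover. Writes at most cap entries and
 * returns the number written. */
size_t build_doc(const Lexeme *lexemes, size_t nlex,
                 DocEntry *out, size_t cap)
{
    size_t n = 0;

    assert(lexemes != NULL || nlex == 0);
    assert(out != NULL || cap == 0);

    for (size_t i = 0; i < nlex; i++) {
        /* Stripped lexeme: contribute nothing to the document. */
        if (lexemes[i].npos == 0)
            continue;

        for (size_t j = 0; j < lexemes[i].npos && n < cap; j++) {
            out[n].lexeme = lexemes[i].lexeme;
            out[n].pos = lexemes[i].positions[j];
            n++;
        }
    }
    return n;
}
```

Under this model, the mixed vector from the bug report (`'search'` stripped, `'text'` at position 1) flattens to a single entry for `text`, so a query that needs both `text` and `search` has no cover to rank.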