Re: [HACKERS] new function for tsquery creartion
From | Victor Drobny |
---|---|
Subject | Re: [HACKERS] new function for tsquery creartion |
Date | |
Msg-id | 3e1b0851d3a6a2da42f78d31cc241d0b@postgrespro.ru |
In reply to | Re: [HACKERS] new function for tsquery creartion (Alexey Chernyshov <a.chernyshov@postgrespro.ru>) |
List | pgsql-hackers |
On 2017-10-13 16:37, Alexey Chernyshov wrote:
> Hi all,
> I am extending phrase operator <n> is such way that it will have <n,m>
> syntax that means from n to m words, so I will use such syntax (<n,m>)
> further. I found that a AROUND(N) b is exactly the same as a <-N,N> b
> and it can be replaced while parsing. So, what do you think of such
> idea? In this patch I have noticed some unobvious behavior.

Thank you for the interest and review!

> # select to_tsvector('Hello, cat world!') @@ queryto_tsquery('cat
> AROUND(1) cat') as match;
> match
> -------
> t
>
> cat AROUND(1) cat is the same is "cat <1> cat || cat <0> cat" and:
>
> # select to_tsvector('Hello, cat world!') @@ to_tsquery('cat <0> cat');
> ?column?
> -------
> t
>
> It seems to be a proper logic behavior but it is a possible pitfall,
> maybe it should be documented?

It is a tricky question. I think this interpretation is confusing, so it would be better to define it as <-N,-1> and <1,N>.

> But more important question is how AROUND() operator should handle stop
> words? Now it works as:
>
> # select queryto_tsquery('cat <2> (a AROUND(10) rat)');
> queryto_tsquery
> ------------------
> 'cat' <12> 'rat'
> (1 row)
>
> # select queryto_tsquery('cat <2> a AROUND(10) rat');
> queryto_tsquery
> ------------------------
> 'cat' AROUND(12) 'rat'
> (1 row)
>
> In my opinion it should be like:
> cat <2> (a AROUND(10) rat) == cat <2,2> (a <-10,10> rat) == cat <-8,12>
> rat

I think the correct version is:
cat <2> (a AROUND(10) rat) == cat <2,2> (a <-10,10> rat) == cat <-2,12> rat.

> cat <2> a AROUND(10) rat == cat <2,2> a <-10,10> rat = cat <-8, 12>
> rat

It is a problem indeed. I did not catch it during implementation. Thank you for pointing it out.

> Now <n,m> operator can be replaced with combination of phrase
> operator <n>, AROUND(), and logical operators, but with <n,m> operator
> it will be much painless. Correct me, please, if I am wrong.

I think that the <n,m> operator is more general than AROUND(N), so the latter should be built on top of yours. However, I think that accepting negative parameters is not the best idea, because it is confusing. On top of that, it is not really necessary, and I think it won't be popular among users. It seems to me that the AROUND operator can easily be implemented with <n,m>, which also helps to avoid the problems you showed above.

--
Victor Drobny
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
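
The two readings of AROUND(N) discussed above can be compared with a small sketch. This is not PostgreSQL code and not the patch itself: the helper names (matches_range, around_as_range, around_without_zero, matches_any) are hypothetical, <n,m> is treated as an inclusive bound on the position difference, and the word positions for 'Hello, cat world!' are assumed to be hello=1, cat=2, world=3.

```python
from typing import List, Tuple

# An inclusive (n, m) bound on the position difference b - a.
# Hypothetical model of the proposed <n,m> operator, not the patch's code.
Range = Tuple[int, int]


def matches_range(pos_a: List[int], pos_b: List[int], n: int, m: int) -> bool:
    """True if some position in pos_b lies d positions after some position in pos_a, with n <= d <= m."""
    return any(n <= b - a <= m for a in pos_a for b in pos_b)


def around_as_range(n: int) -> List[Range]:
    """AROUND(N) read as the single range <-N,N>; distance 0 is included."""
    return [(-n, n)]


def around_without_zero(n: int) -> List[Range]:
    """Alternative from the reply: <-N,-1> or <1,N>; distance 0 is excluded."""
    return [(-n, -1), (1, n)]


def matches_any(pos_a: List[int], pos_b: List[int], ranges: List[Range]) -> bool:
    """True if any of the given ranges is satisfied by the two position lists."""
    return any(matches_range(pos_a, pos_b, lo, hi) for lo, hi in ranges)


# Assumed positions for 'Hello, cat world!': hello=1, cat=2, world=3.
cat = [2]

# 'cat AROUND(1) cat' under each reading:
print(matches_any(cat, cat, around_as_range(1)))      # a word is at distance 0 from itself
print(matches_any(cat, cat, around_without_zero(1)))  # distance 0 no longer counts
```

Under the <-N,N> reading a single occurrence of "cat" satisfies 'cat AROUND(1) cat', which is the pitfall raised in the quoted message. Under the <-N,-1> / <1,N> reading it does not.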