Re: Comparing tsearch2 vectors.
From | Achilleus Mantzios |
---|---|
Subject | Re: Comparing tsearch2 vectors. |
Date | |
Msg-id | Pine.LNX.4.44.0407130929120.5904-100000@matrix.gatewaynet.com |
In reply to | Re: Comparing tsearch2 vectors. (Rajesh Kumar Mallah <mallah@trade-india.com>) |
List | pgsql-sql |
On Jul 13, 2004, Rajesh Kumar Mallah wrote:

> Achilleus Mantzios wrote:
>
> >On Jul 12, 2004, Rajesh Kumar Mallah wrote:
> >
> >>Achilleus Mantzios wrote:
> >>
> >>>On Jul 12, 2004, Rajesh Kumar Mallah wrote:
> >>>
> >>>>Dear Mantzios,
> >>>>
> >>>>I have to get a set of banners from the database in
> >>>>response to a search term. I want the search term
> >>>>to be compared to the keyword corresponding to the
> >>>>banners stored in the database. Currently I am doing an
> >>>>equality match, but I would like to do it after stemming
> >>>>both sides (search term and keywords).
> >>>>
> >>>You could transform your search terms so that there is the "&"
> >>>separator between them. (& stands for "AND").
> >>>E.g. "handicrafts exporter" becomes "handicrafts&exporter"
> >>>And then
> >>>select * from <your table> where idxfti @@ to_tsquery(<searchterms>);
> >>>
> >>But I do not want 'handicraft exporters of delhi' to pop out if I search
> >>for 'handicrafts exporters', whereas
> >>
> >>SELECT to_tsvector('handycrafts exporters of delhi') @@ to_tsquery('handycraft&exporting');
> >>
> >>will be true.
> >>
> >
> >Define what you want, and then read the tsearch2 user guide.
> >I'm sure you'll find your way :)
> >
>
> The requirement is different from full text search.
> I am not searching for a word in a collection of words (text),
> but rather comparing two strings after all the words in those
> strings are stemmed. Hope my requirement is clear now.

OK, so we fall back to the initial assumption.
Tokenize each string into an array of strings.
Let them be String[] string1, String[] string2.
If the arrays are not of the same length, then the strings are not equal.
Otherwise, for each i in string1, compare
lexize(<your stem dict>, string1[i])
against
lexize(<your stem dict>, string2[i]).
The strings are equal only if every pair matches.
The tokenization is your job, while the lexize function comes with tsearch2.
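The comparison described above can be sketched in Python as follows. This is only an illustration under stated assumptions: `tokenize`, `toy_stem`, and `stemmed_equal` are hypothetical names, the word-character tokenizer is just one possible choice, and `toy_stem` is a crude suffix stripper standing in for a call to tsearch2's lexize with your stemming dictionary. It is not the real stemmer and will disagree with it on many words.

```python
import re
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    """Split a string into lowercase word tokens (an assumed tokenizer)."""
    return re.findall(r"\w+", text.lower())


def toy_stem(word: str) -> str:
    """Hypothetical stand-in for lexize(<your stem dict>, word).

    Strips a few English suffixes; a real setup would ask tsearch2 instead.
    """
    for suffix in ("ers", "ing", "er", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stemmed_equal(a: str, b: str, stem: Callable[[str], str] = toy_stem) -> bool:
    """Compare two strings token by token after stemming each token."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    # Different token counts mean the strings cannot be equal.
    if len(tokens_a) != len(tokens_b):
        return False
    # Equal only if every position stems to the same value.
    return all(stem(x) == stem(y) for x, y in zip(tokens_a, tokens_b))


if __name__ == "__main__":
    print(stemmed_equal("handicrafts exporter", "handicraft exporters"))
    print(stemmed_equal("handicraft exporters of delhi", "handicrafts exporters"))
```

With the toy stemmer, the first call reports a match and the second does not, because the token counts differ. To use the real dictionary, one would pass as `stem` a function that sends each token to the database and returns the result of lexize.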
I don't know whether this can be done in SQL, since it requires some sort of iteration.

>
> Regds
> mallah.
>
> >>Regds
> >>Mallah.
> >>
> >>>where idxfti is your tsvector column.
> >>>
> >>>E.g.
> >>># SELECT to_tsvector('handycrafts exporters') @@ to_tsquery('handycraft&exporting');
> >>>?column?
> >>>----------
> >>>t
> >>>(1 row)
> >>>
> >>>>So that the banners for the adword, say 'incense exporter', are
> >>>>shown even if 'incenses exporter' or 'incense exporters' is
> >>>>searched.
> >>>>
> >>>>I hope I am able to clarify.
> >>>>
> >>>>Regds
> >>>>Mallah.
> >>>>
> >>>>Achilleus Mantzios wrote:
> >>>>
> >>>>>On Jul 12, 2004, Rajesh Kumar Mallah wrote:
> >>>>>
> >>>>>>Hi,
> >>>>>>
> >>>>>>We want to compare strings after stemming. Can anyone
> >>>>>>tell me what the best method is? I was thinking of comparing
> >>>>>>the tsvectors, but there is no operator for that.
> >>>>>>
> >>>>>I'd tokenize each string and then apply lexize() to get the
> >>>>>equivalent stemmed
> >>>>>word, but what exactly are you trying to accomplish?
> >>>>>
> >>>>>>Regds
> >>>>>>Mallah.
> >>>>>>
> >>>>>>tradein_clients=# SELECT to_tsvector('handicraft exporters');
> >>>>>>+---------------------------+
> >>>>>>|        to_tsvector        |
> >>>>>>+---------------------------+
> >>>>>>| 'export':2 'handicraft':1 |
> >>>>>>+---------------------------+
> >>>>>>(1 row)
> >>>>>>
> >>>>>>Time: 710.315 ms
> >>>>>>tradein_clients=#
> >>>>>>tradein_clients=# SELECT to_tsvector('handicrafts exporter');
> >>>>>>+---------------------------+
> >>>>>>|        to_tsvector        |
> >>>>>>+---------------------------+
> >>>>>>| 'export':2 'handicraft':1 |
> >>>>>>+---------------------------+
> >>>>>>(1 row)
> >>>>>>
> >>>>>>Time: 400.679 ms
> >>>>>>tradein_clients=# SELECT to_tsvector('Hi there') = to_tsvector('Hi there');
> >>>>>>ERROR: operator does not exist: tsvector = tsvector
> >>>>>>HINT: No operator matches the given name and argument type(s). You may
> >>>>>>need to add explicit type casts.
> >>>>>>tradein_clients=#
> >>>>>>

--
-Achilleus