Re: String Similarity
From | Christopher Kings-Lynne |
---|---|
Subject | Re: String Similarity |
Date | |
Msg-id | 44712CDE.1090608@calorieking.com |
In reply to | String Similarity ("Mark Woodward" <pgsql@mohawksoft.com>) |
Responses | Re: String Similarity |
List | pgsql-hackers |
Try contrib/pg_trgm...

Chris

Mark Woodward wrote:
> I have a side project that needs to "intelligently" know if two strings
> are contextually similar. Think about how CDDB information is collected
> and sorted. It isn't perfect, but there should be enough information to be
> usable.
>
> Think about this:
>
> "pink floyd - dark side of the moon - money"
> "dark side of the moon - pink floyd - money"
> "money - dark side of the moon - pink floyd"
> etc.
>
> To a human, these strings are almost identical. Similarly:
>
> "dark floyd of money moon pink side the"
>
> Is a puzzle to be solved by 13 year old children before the movie starts.
>
> My post has three questions:
>
> (1) Does anyone know of an efficient and numerically quantified method of
> detecting these sorts of things? I currently have a fairly inefficient and
> numerically bogus solution that may be the only non-impossible solution
> for the problem.
>
> (2) Does anyone see a need for this feature in PostgreSQL? If so, what
> kind of interface would be best accepted as a patch? I am currently
> returning a match likelihood between 0 and 100.
>
> (3) Is there also a desire for a Levenshtein distance function for text
> and varchars? I experimented with it, and was forced to write the function
> in item #1.

--
Christopher Kings-Lynne
Technical Manager
CalorieKing
Tel: +618.9389.8777
Fax: +618.9389.8444
chris.kings-lynne@calorieking.com
www.calorieking.com
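
For readers unfamiliar with the trigram idea behind contrib/pg_trgm, here is a rough Python sketch of it. It is not the module's actual code. It assumes the documented behaviour: text is lowercased and split into alphanumeric words, each word is padded with two leading spaces and one trailing space, and similarity is the number of shared trigrams divided by the number of distinct trigrams in either string. The helper names `trigrams` and `similarity` are chosen for illustration only (pg_trgm itself exposes a SQL function `similarity(text, text)`).

```python
import re


def trigrams(text):
    """Return the set of trigrams in text.

    Assumption: words are runs of ASCII letters/digits, lowercased,
    and each word is padded with two spaces in front and one behind.
    """
    grams = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = "  " + word + " "
        for i in range(len(padded) - 2):
            grams.add(padded[i:i + 3])
    return grams


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams (0.0 to 1.0)."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 0.0  # nothing to compare
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    base = "pink floyd - dark side of the moon - money"
    for other in (
        "dark side of the moon - pink floyd - money",
        "money - dark side of the moon - pink floyd",
        "dark floyd of money moon pink side the",
    ):
        print(other, similarity(base, other))
```

Because the trigrams are taken per word, word order does not affect the score in this sketch: all of the reordered strings from the question compare as identical to the first one, including the alphabetically sorted "puzzle" string. Multiplying the result by 100 would give a 0 to 100 score like the one described in question (2).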