Re: String Similarity
From | Mark Woodward |
---|---|
Subject | Re: String Similarity |
Date | |
Msg-id | 18219.24.91.171.78.1148073023.squirrel@mail.mohawksoft.com |
In reply to | Re: String Similarity (Mark Dilger <pgsql@markdilger.com>) |
List | pgsql-hackers |
> Mark Woodward wrote:
>> I have a side project that needs to "intelligently" know if two strings
>> are contextually similar. Think about how CDDB information is collected
>> and sorted. It isn't perfect, but there should be enough information to
>> be usable.
>>
>> Think about this:
>>
>> "pink floyd - dark side of the moon - money"
>> "dark side of the moon - pink floyd - money"
>> "money - dark side of the moon - pink floyd"
>> etc.
>>
>> To a human, these strings are almost identical. Similarly:
>>
>> "dark floyd of money moon pink side the"
>>
>> Is a puzzle to be solved by 13 year old children before the movie
>> starts.
[snip]
>
> Hmmm... I think I like this problem. Maybe I'll work on it a bit as a
> contrib module.

I *have* a working function, but it is not very efficient and it is not
what I would call numerically predictable. And it does find the various
sub-strings between the two strings in question.

Email me offline and we can make something for contrib.
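
For illustration only, here is a minimal sketch of one way to score the quoted examples as similar regardless of word order. It is not the working function mentioned above, which was not posted; it substitutes a plain Jaccard comparison of word sets, and the names `tokens` and `token_similarity` are hypothetical.

```python
import re


def tokens(s):
    """Lowercase the string and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", s.lower()))


def token_similarity(a, b):
    """Jaccard similarity of the two word sets, ignoring order and punctuation.

    Returns a float in [0.0, 1.0]; 1.0 means both strings use the same words.
    """
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        # Two empty strings are treated as identical (an assumption).
        return 1.0
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    print(token_similarity(
        "pink floyd - dark side of the moon - money",
        "dark floyd of money moon pink side the",
    ))
```

Because it only compares sets of whole words, this sketch treats all of the orderings quoted above as equivalent, but it does not handle misspellings or locate shared sub-strings.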