Re: Postgresql GROUP BY "SIMILAR" but not equal values
От | Alban Hertroys |
---|---|
Тема | Re: Postgresql GROUP BY "SIMILAR" but not equal values |
Дата | |
Msg-id | CAF-3MvN9WLFhZA-XDdZNxTKCuJREYLLe1uMQ+6keEdL9Z5gMjw@mail.gmail.com обсуждение исходный текст |
Ответ на | Postgresql GROUP BY "SIMILAR" but not equal values (alexandros_e <alexandros.ef@gmail.com>) |
Ответы |
Re: Postgresql GROUP BY "SIMILAR" but not equal values
Re: Postgresql GROUP BY "SIMILAR" but not equal values |
Список | pgsql-general |
On 6 February 2014 16:18, alexandros_e <alexandros.ef@gmail.com> wrote: > Let's say I have this table foo > > ID|G1|T1| > 1|2|ABC| > 1|2|ABCD| > 1|2|DEF| > 1|2|DEFG| > > SELECT * FROM foo > GROUP BY ID,G1,T1 > Is there a way in SQL or PostgreSQL in general to group by values than are > not exactly the same but are quite similar (like 'ABC' and 'ABCD') based on > some distance function (levenshtein for example) if the distance is within > some threshold (i.e., 1) Perhaps there is: You can calculate the levenshtein distance between those values using a self-join and then GROUP BY the result of that expression and limit the results with HAVING. For example: SELECT foo1.ID, foo1.G1, foo1.T1, levenshtein(foo1.T1, foo2.T1) FROM foo foo1 INNER JOIN foo foo2 ON (foo2.ID = foo1.ID AND foo2.G1 = foo1.G1) GROUP BY foo1.ID, foo1.G1, foo1.T1, levenshtein(foo1.T1, foo2.T1) HAVING levenshtein(foo1.T1, foo2.T1) > 1 Is that what you're looking for? -- If you can't see the forest for the trees, Cut the trees and you'll see there is no forest.
В списке pgsql-general по дате отправления: