"Fuzzy" Matches on Nicknames
From | Michael Sheaver |
---|---|
Subject | "Fuzzy" Matches on Nicknames |
Date | |
Msg-id | 18DF7A91-78F6-4F63-8A7E-BEBE3AEE7AC6@me.com |
Responses |
Re: "Fuzzy" Matches on Nicknames
|
List | pgsql-general |
Greetings,

I have two tables that are populated using large datasets from disparate external systems, and I am trying to match records by customer name between these two tables. I do not have any authoritative key, such as customerID or nationalID, by which I can match them up, and I have found many cases where the same customer has different first names in the two datasets. A sampling of the differences is as follows:

Michael <=> Mike
Tom <=> Thomas
Liz <=> Elizabeth
Margaret <=> Maggie

How can I build a query in PostgreSQL (v. 9.6) that will find possible matches like these on nicknames? My initial guess is that I would have to either find or build some sort of intermediary table that contains associated names like those above. Sometimes, though, there will be more than just matching pairs, like:

Jim <=> James <=> Jimmy <=> Jimmie
Bill <=> Will <=> Willie <=> William

and so forth.

Has anyone used or developed PostgreSQL queries that will find matches like these? I am running all my database queries on my local laptops (Win7 and macOS), so performance or uptime is no issue here. I am curious to see how others in this community have creatively solved this common problem.

One of the PostgreSQL dictionaries (synonym, thesaurus, etc.) might work here, but honestly I am clueless as to how to set this up or use it in queries successfully.

Thanks,
Michael (aka Mike, aka Mikey)
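
For illustration, here is a minimal sketch of the intermediary-table idea described above. It is not a tested PostgreSQL solution: it uses Python's built-in sqlite3 module so that it runs anywhere, and the table names (nickname_group, customers_a, customers_b), the column names, and the sample rows are all hypothetical. Names that belong together share a group_id, which handles groups larger than a pair such as Jim/James/Jimmy/Jimmie. The SELECT uses only plain joins and lower(), so it should carry over to PostgreSQL with little or no change.

```python
import sqlite3

# Hypothetical nickname groups; every name in a group is treated as equivalent.
NICKNAME_GROUPS = {
    1: ["michael", "mike", "mikey"],
    2: ["tom", "thomas"],
    3: ["liz", "elizabeth"],
    4: ["margaret", "maggie"],
    5: ["jim", "james", "jimmy", "jimmie"],
    6: ["bill", "will", "willie", "william"],
}


def build_demo():
    """Create an in-memory database with made-up sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE nickname_group (
            group_id INTEGER NOT NULL,
            name     TEXT    NOT NULL,   -- stored in lower case
            PRIMARY KEY (group_id, name)
        );
        CREATE TABLE customers_a (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
        CREATE TABLE customers_b (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO nickname_group (group_id, name) VALUES (?, ?)",
        [(gid, name) for gid, names in NICKNAME_GROUPS.items() for name in names],
    )
    conn.executemany(
        "INSERT INTO customers_a VALUES (?, ?, ?)",
        [(1, "Michael", "Sheaver"), (2, "Tom", "Jones"),
         (3, "Bill", "Smith"), (4, "Liz", "Brown")],
    )
    conn.executemany(
        "INSERT INTO customers_b VALUES (?, ?, ?)",
        [(10, "Mike", "Sheaver"), (11, "Thomas", "Jones"),
         (12, "William", "Smith"), (13, "Margaret", "Brown")],
    )
    return conn


def find_nickname_matches(conn):
    """Pair rows with the same last name whose first names are equal
    or belong to the same nickname group."""
    sql = """
        SELECT a.id, a.first_name, b.id, b.first_name, a.last_name
        FROM customers_a AS a
        JOIN customers_b AS b
          ON lower(a.last_name) = lower(b.last_name)
        WHERE lower(a.first_name) = lower(b.first_name)
           OR EXISTS (
                SELECT 1
                FROM nickname_group AS na
                JOIN nickname_group AS nb ON na.group_id = nb.group_id
                WHERE na.name = lower(a.first_name)
                  AND nb.name = lower(b.first_name)
           )
        ORDER BY a.id, b.id
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in find_nickname_matches(build_demo()):
        print(row)
```

With this sample data, Michael/Mike, Tom/Thomas and Bill/William are paired, while Liz and Margaret (same last name, different groups) are not. A name that belongs to several groups can simply appear under more than one group_id.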