Re: How to get rid of dups...
From | Kevin Brannen |
---|---|
Subject | Re: How to get rid of dups... |
Date | |
Msg-id | 3D2DB8AF.30901@nurseamerica.net |
In reply to | Re: I am being interviewed by OReilly ("Jeff MacDonald" <jeff@tsunamicreek.com>) |
List | pgsql-general |
Jeremy Cowgar wrote:
> I need to get rid of all rows that have dups in the columns
> tpa,pun,grn,claim ... i.e.
>
> 1--- 001 001 001 00-000001 John Doe
> 2--- 001 001 001 00-000001 Jane Doe
> 3--- 001 002 001 00-000001 John Doe
>
> 1 and 2 would be dups, 1 and 3 are diff records, 2 and 3 are diff
> records.
>
> I tried this as a test:
>
> select count(claimid), tpa, pun, grn, claim FROM claim_import GROUP BY
> tpa, pun, grn, claim HAVING count(claimid) > 1;
>
> 26 rows returned.
>
> then
>
> select distinct on (tpa,pun,grn,claim) count(claimid), tpa, pun, grn,
> claim FROM claim_import GROUP BY tpa, pun, grn, claim HAVING
> count(claimid) > 1;

It's not obvious to me what your key(s) is (all 4 columns?), but this is a place where self-joins are useful. Assuming a table like:

    create table stuff (
        id int,     -- primary table key
        value int,  -- unique data key
        ...);

You should be able to find the dups with something like:

    select b.id from stuff a, stuff b
    where a.value = b.value and a.id < b.id;

Given that, use it as the subquery of a delete:

    delete from stuff where id in
        (select b.id from stuff a, stuff b where ...);

Be careful and experiment with the select until you're 110% sure you like what you see. :-)

Adapt this approach to your real table and you should be set.

HTH,
Kevin
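For illustration, here is a rough sketch of how the self-join approach might look when adapted to the four columns from the question. It rests on several assumptions: it runs against an in-memory SQLite database through Python's sqlite3 module rather than PostgreSQL, the claim_import column types and the name column are guesses, and claimid is assumed to be unique per row. The three sample rows are the ones from the question.

```python
import sqlite3

# Hypothetical stand-in for claim_import; column types and the "name"
# column are assumptions, and SQLite is used only so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table claim_import ("
    " claimid integer primary key,"
    " tpa text, pun text, grn text, claim text, name text)"
)
conn.executemany(
    "insert into claim_import values (?, ?, ?, ?, ?, ?)",
    [
        (1, "001", "001", "001", "00-000001", "John Doe"),
        (2, "001", "001", "001", "00-000001", "Jane Doe"),
        (3, "001", "002", "001", "00-000001", "John Doe"),
    ],
)

# Self-join on all four key columns. The a.claimid < b.claimid condition
# selects every row that has a matching row with a lower claimid.
find_dups = """
    select b.claimid
    from claim_import a, claim_import b
    where a.tpa = b.tpa and a.pun = b.pun
      and a.grn = b.grn and a.claim = b.claim
      and a.claimid < b.claimid
"""

# Step 1: look at what would be deleted before deleting anything.
dup_ids = sorted({row[0] for row in conn.execute(find_dups)})
print("would delete:", dup_ids)

# Step 2: reuse the same select as the subquery of the delete.
conn.execute(f"delete from claim_import where claimid in ({find_dups})")
conn.commit()

remaining = [
    row[0]
    for row in conn.execute("select claimid from claim_import order by claimid")
]
print("remaining:", remaining)
```

Note that this keeps the row with the lowest claimid in each group of duplicates and removes the rest. If every row in a duplicated group should go, the condition would need to be a.claimid <> b.claimid instead.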