Re: Finding (and deleting) dupes in relation table
От | Markus Wollny |
---|---|
Тема | Re: Finding (and deleting) dupes in relation table |
Дата | |
Msg-id | 2266D0630E43BB4290742247C89105750813EC8D@dozer.computec.de обсуждение исходный текст |
Ответ на | Finding (and deleting) dupes in relation table (CSN <cool_screen_name90001@yahoo.com>) |
Список | pgsql-general |
Hello! CSN [cool_screen_name90001@yahoo.com] wrote: > I have a table that relates id's of two other tables: > > table1id, table2id > > Dupes have found their way into it (create unique index > across both fields fails). Is there a quick and easy way to > find and delete the dupes (there are tens of thousands of records)? > > Thanks, > CSN If your table was created WITH OIDS you could identify the duplicates thus: select a.table1id , a.table12d , max(a.oid) as maxoid , count(a.oid) as coid from schema.mytable a, schema.mytable b where a.table1id = b.table1id and a.table2id=b.table2id and a.oid <> b.oid group by a.table1id, a.table2id order by a.table1id; If you wish to delete surplus rows, you might do the following: delete from schema.mytable where oid in ( select maxoid from ( select a.table1id, a.table12d, max(a.oid) as maxoid, count(a.oid) as coid from schema.mytable a, schema.mytable b where a.table1id = b.table1id and a.table2id=b.table2id and a.oid <> b.oid group by a.table1id, a.table2id order by a.table1id ) as foo where coid >1 ); This will delete the oldest tuple of a duplicate set of rows; if there are more than two tuples in a set, you'll want to execute this a couple of times until there's no duplicate left, as the delete will only reduce a set by one tuple at a time. I'd also recommend to apply a PRIMARY KEY constraint afterwards instead of just a unique index - this will prevent NULL-entries as well as creating the desired unique index - and I think it's good practice to have a primary key on about every table there is, except when it's just a junk data table like a logging table where content is regularly evaluated and discarded. Kind regards Markus
В списке pgsql-general по дате отправления: