Re: duplicates
От | Moray McConnachie |
---|---|
Тема | Re: duplicates |
Дата | |
Msg-id | 012301bfa089$82ffa920$760e01a3@oucs.ox.ac.uk обсуждение исходный текст |
Ответ на | duplicates (Allan Kelly <allan.kelly@buildstore.co.uk>) |
Список | pgsql-sql |
----- Original Message ----- From: "Allan Kelly" <allan.kelly@buildstore.co.uk> To: <pgsql-sql@postgresql.org> Sent: Friday, April 07, 2000 11:34 AM Subject: [SQL] duplicates > I have a system bug which means that I have occasional dupicate entries in > my 'subscribers' table. I'm trying to find a query which performs something > like a 'where user_name is not unique' query. > > At the moment we use this cludge: > > select count(*), user_name from subscribers > group by user_name order by count; > > (I'm surprised I can't add 'where count > 1' but it seems conditionals on > aggregate fields are not allowed). > > This gives me a very long list with the 'not unique' entries at the bottom, eg > > count | user_name > ------+------------------ > 1 | bill.hicks > [ ..cut 9 zillion results.. ] > 1 | margaret.thatcher > 4 | linus.torvalds > 9 | bill.gates > > I then have to do > > select oid from subscribers where user_name = 'linus.torvalds'; > > and delete all but one of the records. Is there a better way to do this? DELETE FROM subscribers WHERE EXISTS(SELECT x.username FROM subscribers x WHERE subscribers.oid <x.oid AND subscribers.username LIKE x.username); ought to do it, I *think*. It will delete all but the most recently inserted copy, i.e. that with the highest oid. Self-joins (which this is, effectively), are often quicker and more efficient than counts. Yours, Moray
В списке pgsql-sql по дате отправления: