Re: De-duplicating rows

Поиск

Список

Период

Сортировка

От	Frank Bax
Тема	Re: De-duplicating rows
Дата	17 июля 2009 г. 12:14:38
Msg-id	4A609310.4050003@sympatico.ca обсуждение исходный текст
Ответ на	Re: De-duplicating rows (Richard Huxton <dev@archonet.com>)
Список	pgsql-sql

Дерево обсуждения

Richard Huxton wrote:
> Christophe wrote:
>> Now, since this database has been production since 7.2 days, cruft has 
>> crept in: in particular, there are duplicate email addresses, some 
>> with mismatched attributes.  The policy decision by the client is that 
>> the correct row is the one with the earliest timestamp.
> 
> Something like (untested):
> 
> CREATE TEMPORARY TABLE earliest_duplicates AS
> SELECT
>   email AS tgt_email,
>   min(create_date) AS tgt_date
> FROM mytable
> GROUP BY email
> HAVING count(*) > 1;
> 
> DELETE FROM mytable USING earliest duplicates
> WHERE email=tgt_email AND create_date > tgt_date;



If it is possible that two rows exist for the same email/date; then you 
will likely need to deal with these manually.

If you rerun the above SELECT after running the delete you should 
identify these rows.

В списке pgsql-sql по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: De-duplicating rows