Re: Slow duplicate deletes
From | Michael Wood
---|---
Subject | Re: Slow duplicate deletes
Date |
Msg-id | CAP6d-HWTDgsbf2-Rr+umdm4ZboEFYf2RMLNeLebZ-OhueaCa=Q@mail.gmail.com
In reply to | Re: Slow duplicate deletes (DrYSG <ygutfreund@draper.com>)
List | pgsql-novice
On 5 March 2012 22:43, DrYSG <ygutfreund@draper.com> wrote:
> One point I might not have made clear. The reason I want to remove
> duplicates is that the column "data_object.unique_id" became non-unique
> (someone added duplicate rows). So I added the bigserial column (idx) to
> uniquely identify the rows, and I was using SELECT MIN(idx) with GROUP BY
> to pick just one of the rows that became duplicated.
>
> I am going to try out some of your excellent suggestions. I will report
> back on how they are working.
>
> One idea that was given to me was the following (what do you think,
> Merlin?)
>
> CREATE TABLE portal.new_metadata AS
>   SELECT DISTINCT ON (data_object.unique_id) * FROM portal.metadata;
>
> Or something of this ilk should be faster, because it only needs to do a
> sort on data_object.unique_id and then an insert. After you have
> verified the results you can do:
>
> BEGIN;
> ALTER TABLE portal.metadata RENAME TO metadata_old;
> ALTER TABLE portal.new_metadata RENAME TO metadata;
> COMMIT;

This sounds like a good way to go, but if you have foreign keys pointing
at portal.metadata, I think you will need to drop and recreate them after
the rename.

--
Michael Wood <esiotrot@gmail.com>
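P.S. For illustration, a rough sketch of that drop/recreate dance. The
referencing table portal.downloads and the constraint name
fk_downloads_metadata are made up here; substitute your actual tables and
constraints, which you can list with \d portal.metadata in psql:

    BEGIN;
    -- Drop the FK that still points at the table being replaced
    -- (names below are hypothetical).
    ALTER TABLE portal.downloads DROP CONSTRAINT fk_downloads_metadata;
    -- Swap the tables (RENAME TO takes an unqualified name).
    ALTER TABLE portal.metadata RENAME TO metadata_old;
    ALTER TABLE portal.new_metadata RENAME TO metadata;
    -- unique_id must be unique again before an FK can reference it.
    ALTER TABLE portal.metadata
      ADD CONSTRAINT metadata_unique_id_key UNIQUE (unique_id);
    -- Recreate the FK against the deduplicated table.
    ALTER TABLE portal.downloads
      ADD CONSTRAINT fk_downloads_metadata
      FOREIGN KEY (unique_id) REFERENCES portal.metadata (unique_id);
    COMMIT;

Doing the whole thing in one transaction means nothing ever sees the
database without the constraint in place.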