Re: Slow duplicate deletes
From | Merlin Moncure
---|---
Subject | Re: Slow duplicate deletes
Date |
Msg-id | CAHyXU0wD32XGpubZWVOmZE2PL0x2tpYiwSiCXpZf=6RoEz5Fcw@mail.gmail.com
In reply to | Re: Slow duplicate deletes (DrYSG <ygutfreund@draper.com>)
Responses | Re: Slow duplicate deletes
List | pgsql-novice
On Mon, Mar 5, 2012 at 2:43 PM, DrYSG <ygutfreund@draper.com> wrote:
> One point I might not have made clear. The reason I want to remove
> duplicates is that the column "data_object.unique_id" became non-unique
> (someone added duplicate rows). So I added the bigserial (idx) to
> uniquely identify the rows, and I was using SELECT MIN(idx) and GROUP BY
> to pick just one of the rows that became duplicated.
>
> I am going to try out some of your excellent suggestions. I will report
> back on how they are working.
>
> One idea that was given to me was the following (what do you think,
> Merlin?)
>
> CREATE TABLE portal.new_metatdata AS
> SELECT DISTINCT ON (data_object.unique_id) * FROM portal.metadata;

Sure, that will work, but as Michael noted it's not always practical to
do that. Also, if a fairly small percentage of the records has to be
deleted, an in-place delete may end up being faster anyway. Modern
postgres is pretty smart at optimizing 'where exists', and you should get
a decent plan.

merlin
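[For reference, an in-place delete along the lines Merlin describes could
look like the following. This is a sketch only, assuming the duplicated
rows live in portal.metadata with the unique_id column and the bigserial
idx column mentioned above:]

    -- Sketch: for each unique_id, keep the row with the lowest idx and
    -- delete the remaining duplicates. Assumes portal.metadata holds the
    -- duplicated rows, with columns unique_id and idx as described above.
    DELETE FROM portal.metadata m
    WHERE EXISTS (
        SELECT 1
        FROM portal.metadata d
        WHERE d.unique_id = m.unique_id
          AND d.idx < m.idx
    );

[The correlated EXISTS subquery marks a row for deletion whenever another
row shares its unique_id but has a smaller idx, which is the same
keep-MIN(idx) rule as the GROUP BY approach quoted above.]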