Re: How to "unique-ify" HUGE table?
From | Scott Marlowe |
---|---|
Subject | Re: How to "unique-ify" HUGE table? |
Date | |
Msg-id | dcc563d10812230934y613ec899sffe5483e87c93cfd@mail.gmail.com |
In reply to | How to "unique-ify" HUGE table? ("Kynn Jones" <kynnjo@gmail.com>) |
List | pgsql-performance |
On Tue, Dec 23, 2008 at 10:25 AM, Kynn Jones <kynnjo@gmail.com> wrote:
> Hi everyone!
> I have a very large 2-column table (about 500M records) from which I want
> to remove duplicate records.
> I have tried many approaches, but they all take forever.
> The table's definition consists of two short TEXT columns. It is a
> temporary table generated from a query:
>
> CREATE TEMP TABLE huge_table AS SELECT x, y FROM ... ;
>
> Initially I tried
>
> CREATE TEMP TABLE huge_table AS SELECT DISTINCT x, y FROM ... ;
>
> but after waiting for nearly an hour I aborted the query, and repeated it
> after getting rid of the DISTINCT clause.
> Everything takes forever with this monster! It's uncanny. Even printing it
> out to a file takes forever, let alone creating an index for it.
> Any words of wisdom on how to speed this up would be appreciated.

Did you try cranking up work_mem to something that's a large percentage
(25 to 50%) of total memory?
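The suggestion above can be sketched as follows. This is a minimal illustration, not the poster's exact commands: the `2GB` value and the `huge_table_dedup` name are assumptions, and `work_mem` should be sized to the memory actually available so the sort or hash used for deduplication can run without spilling to disk.

```sql
-- Raise work_mem for this session only (illustrative value; size it to a
-- large share of available RAM, per the 25-50% suggestion above).
SET work_mem = '2GB';

-- Deduplicate into a new temp table. GROUP BY x, y is equivalent to
-- SELECT DISTINCT x, y here, and gives the planner the option of a
-- HashAggregate instead of a full sort.
CREATE TEMP TABLE huge_table_dedup AS
SELECT x, y
FROM huge_table
GROUP BY x, y;
```

Note that `SET work_mem` affects only the current session, so it can be raised aggressively for this one bulk operation without changing the server-wide default.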