Re: Best way to delete unreferenced rows?
От | Craig James |
---|---|
Тема | Re: Best way to delete unreferenced rows? |
Дата | |
Msg-id | 466D6DDE.7030006@emolecules.com обсуждение исходный текст |
Ответ на | Re: Best way to delete unreferenced rows? ("Tyrrill, Ed" <tyrrill_ed@emc.com>) |
Ответы |
Re: Best way to delete unreferenced rows?
Re: Best way to delete unreferenced rows? |
Список | pgsql-performance |
Tyrrill, Ed wrote: > QUERY PLAN > > ------------------------------------------------------------------------ > ------------------------------------------------------------------------ > ------------------- > Merge Left Join (cost=38725295.93..42505394.70 rows=13799645 width=8) > (actual time=6503583.342..8220629.311 rows=93524 loops=1) > Merge Cond: ("outer".record_id = "inner".record_id) > Filter: ("inner".record_id IS NULL) > -> Index Scan using backupobjects_pkey on backupobjects > (cost=0.00..521525.10 rows=13799645 width=8) (actual > time=15.955..357813.621 rows=13799645 loops=1) > -> Sort (cost=38725295.93..39262641.69 rows=214938304 width=8) > (actual time=6503265.293..7713657.750 rows=214938308 loops=1) > Sort Key: backup_location.record_id > -> Seq Scan on backup_location (cost=0.00..3311212.04 > rows=214938304 width=8) (actual time=11.175..1881179.825 rows=214938308 > loops=1) > Total runtime: 8229178.269 ms > (8 rows) > > I ran vacuum analyze after the last time any inserts, deletes, or > updates were done, and before I ran the query above. I've attached my > postgresql.conf. The machine has 4 GB of RAM. I thought maybe someone with more expertise than me might answer this, but since they haven't I'll just make a comment. It looks to me like the sort of 214 million rows is what's killing you. I suppose you could try to increase the sort memory,but that's a lot of memory. It seems to me an index merge of a relation this large would be faster, but that's atopic for the experts. On a theoretical level, the problem is that it's sorting the largest table. Perhaps you could re-cast the query so thatit only has to sort the smaller table, something like select a.id from a where a.id not in (select distinct b.id from b) where "b" is the smaller table. There's still no guarantee that it won't do a sort on "a", though. In fact one of the cleverthings about Postgres is that it can convert a query like the one above into a regular join, unless you do somethinglike "select ... offset 0" which blocks the optimizer from doing the rearrangement. But I think the first approach is to try to tune for a better plan using your original query. Craig
В списке pgsql-performance по дате отправления: