need much better query perfomance
От | andy@mixonic.com (Andy) |
---|---|
Тема | need much better query perfomance |
Дата | |
Msg-id | 8d60b5ea.0301241528.230778b2@posting.google.com обсуждение исходный текст |
Ответы |
Re: need much better query perfomance
Re: need much better query perfomance Re: need much better query perfomance |
Список | pgsql-general |
Here's the problem: Table track has 100,000 rows Table album has 20,000 rows Table album_track is a lookup table linking tracks to albums and has 80,000 rows The track table has a PK, a deleted field ('Y' for deleted tracks / null otherwise), and several track-specific fields. The album table also has a PK and a deleted field ('Y' or null), as well as several album-specific fields. The album_track table simple has two columns: The PKs of the album and track tables. A deleted track may still exist in an not-deleted album. A deleted track is simply one that cannot be added to new albums. Likewise, a deleted album may contain not-deleted tracks. It is simply an album that the user has deleted. So even though we've marked deleted albums & tracks in the database, we haven't removed "deleted" tracks from our fileserver. Now we're almost out of space, so we need to find all tracks that are marked as deleted and which are not a part of any non-deleted albums. I wrote what must be a very naive query to find such tracks. Doing an explain on this query gave a huge time estimate. My query is below. Would someone please suggest a faster approach? select t.track_id from track t where t.deleted = 'Y' UNION select track_id from track where track_id not in ( select at.track_id from album_track at, album alb where at.album_id = alb.album_id and alb.deleted is null ) The above query should work, but it takes too long to execute. The sample query below, however, is fast but incorrect: select t.track_id from track t where t.deleted = 'Y' UNION select at.track_id from album_track at, album alb where at.album_id = alb.album_id and alb.deleted = 'Y' This query is incorrect because it the bottom-most select is getting all deleted albums, but a track might be in both deleted AND non-deleted albums. This incorrect query will return all the "completely deleted" tracks, but will also return tracks that are part of both deleted and non-deleted albums, and we don't want this.
В списке pgsql-general по дате отправления: