Re: Removing duplicate entries
От | Michael Fuhr |
---|---|
Тема | Re: Removing duplicate entries |
Дата | |
Msg-id | 20060112165933.GA496@winnie.fuhr.org обсуждение исходный текст |
Ответ на | Removing duplicate entries ("Scott Ford" <Scott.Ford@bullfrogpower.com>) |
Список | pgsql-novice |
[Please copy the mailing list on replies.] On Thu, Jan 12, 2006 at 10:24:10AM -0500, Scott Ford wrote: > That is exactly the result that I'm looking for. But won't that > actually remove the documents with doc_type_id where there is only one > for a given customer too? No, it shouldn't (see below). > Maybe I need to read up on my GROUP BY clause. > > I thought that: > > SELECT min(doc_id) > FROM doc > WHERE doc_type_id = 2 -- not necessary but probably more efficient > GROUP BY cust_id, doc_type_id > > would select one instance of doc_type_id = 2 from each customer if there > was one, and would select the one with the lowest doc_id if there was > more then one. Right. This subquery generates the list of doc_id's *not* to delete (you could use max instead of min; the point is to choose exactly one out of the set). The WHERE clause in the subquery isn't necessary because the DELETE already restricts the affected records to those with doc_type_id 2; however, adding the restriction to the subquery generates a smaller list of doc_id's to check against, which might make the delete faster. In other words, the delete statement says "delete all records with doc_type_id 2 except for these", where "these" is the list generated by the subquery. "These" can contain but doesn't need to contain doc_id's for documents with a doc_type_id other than 2 because we've already said they're not eligible for deletion. > So then wouldn't the row: > > doc_id | cust_id | doc_type_id > --------+---------+------------- > 9 | 2 | 2 > > get removed too? No, as the example I posted shows (the output came from sample data and actually running the queries shown). > I'll test this out some test data in the meant time. Please do: it's important that you understand how the delete works and be satisfied that it *does* work. As I mentioned before, I might be misunderstanding your requirements or making unwarranted assumptions about your data; ultimately it's up to you to determine whether the delete works or not. -- Michael Fuhr
В списке pgsql-novice по дате отправления: