Re: Deleting orphaned records to establish Ref Integrity
От | Tom Lane |
---|---|
Тема | Re: Deleting orphaned records to establish Ref Integrity |
Дата | |
Msg-id | 10442.1117694169@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Deleting orphaned records to establish Ref Integrity ("Roman F" <romanf@fusemail.com>) |
Ответы |
Re: Deleting orphaned records to establish Ref Integrity
|
Список | pgsql-general |
"Roman F" <romanf@fusemail.com> writes: > ... Executing something like > the following statement would work, but even with indexes it takes an > insane amount of time to execute for each of the tables: > DELETE FROM child_table WHERE parentid NOT IN > (SELECT parentid FROM parent_table) Uh, what sort of query plan are you getting for that? PG 7.4 and up can do a reasonable job with NOT IN if the sub-select is small enough to fit into an in-memory hash table (of size sort_mem). I'm betting that your sort_mem setting is not high enough to encourage the planner to try the hash method. You could try increasing sort_mem ... but given the size of your tables, you might end up with a hash table large enough to drive the system into swapping, in which case it'll still be mighty slow. Another idea is to try an outer join: SELECT child_table.parentid INTO tmp_table FROM child_table LEFT JOIN parent_table ON (child_table.parentid = parent_table.parentid) WHERE parent_table.parentid IS NULL; which essentially does a join and then pulls out just the child_table rows that failed to match. This will probably end up getting done via a merge join or hybrid hash join, either of which are more scalable than the NOT IN code. You still have to do the actual deletions in child_table, but as long as there aren't too many, a NOT IN using tmp_table should work OK. regards, tom lane
В списке pgsql-general по дате отправления: