Adding a foreign key constraint is extremely slow
От | bsamwel@xs4all.nl |
---|---|
Тема | Adding a foreign key constraint is extremely slow |
Дата | |
Msg-id | 20505.194.109.187.67.1048442304.squirrel@webmail.xs4all.nl обсуждение исходный текст |
Ответы |
Re: Adding a foreign key constraint is extremely slow
Re: Adding a foreign key constraint is extremely slow |
Список | pgsql-performance |
Hi guys, I'm having another performance problem as well. I have two tables called "wwwlog" (about 100 bytes per row, 1.2 million records) and table called "triples" (about 20 bytes per row, 0.9 million records). Triples contains an integer foreign key to wwwlog, but it was not marked as a foreign key at the point of table creation. Now, when I do: alter table triples add foreign key(id1) references wwwlog(id); PostgreSQL starts doing heavy work for at least one and a half hour, and I broke it off at that. It is not possible to "explain" a statement like this! Probably what it does is that it will check the foreign key constraint for every field in the table. This will make it completely impossible to load my data, because: (1) I cannot set the foreign key constraints BEFORE loading the 0.9 million records, because that would cause the checks to take place during loading. (2) I cannot set the foreign key constraints AFTER loading the 0.9 million records because I've got no clue at all how long this operation is going to take. (3) Table "triples" contains two more foreign keys to the same wwwlog key. This means I've got to do the same thing two more times after the first one is finished. I find this behaviour very annoying, because it is possible to optimize a check like this very well, for instance by creating a temporary data set containing the union of all foreign keys and all primary keys of the original table, augmented with an extra field "pri" which is 1 if the record comes from the primary keys and 0 otherwise. Say this data is contained in a temporary table called "t" with columns "key" and "pri" for the data. One would then be able to do the check like this: NOT EXISTS( SELECT key,sum(pri) FROM t GROUP BY key HAVING sum(pri) = 0 ); This means that there must not exist a group of "key" values that does not have a primary key somewhere in the set. This query is extremely easy to execute and would be done in a few seconds. Does anyone know of a way of adding foreign key constraints faster in PostgreSQL? Or, if there is no solution, do you guys know of any reasons why a solution like the one I described above would or would not work, and could or could not be built into PostgreSQL at some point? Regards, Bart
В списке pgsql-performance по дате отправления: