insert only unique values in to a table, ignore rest?
От | George Nychis |
---|---|
Тема | insert only unique values in to a table, ignore rest? |
Дата | |
Msg-id | 45A2B070.9090601@cmu.edu обсуждение исходный текст |
Ответы |
Re: insert only unique values in to a table, ignore rest?
|
Список | pgsql-general |
Hi, I have approximately 2 billion data entries that I would like to insert into a database. Each entry consists of: INT BOOLEAN INT BOOLEAN I want to populate a table such that it only contains the unique rows, all other data should be thrown out. I would say a significant amount of the insertions are going to fail due to unique constraints. The unique constraint is on the two integers, not on the booleans. Using mysql, I was able to do this with the following query, for all data files (25574 data files total): mysql -e \"use connectivity; LOAD DATA INFILE '/path/to/file' IGNORE INTO TABLE ipconn FIELDS TERMINATED BY ' ';\" What I *think* mysql did was sort each data file and do a sort of merge sort between the data I was inserting and the data in the database. It would insert the first unique instance of a row it saw, and reject all other insertions that violated the unique constraint due to the "IGNORE". From what I understand, this functionality is not in postgresql. Fine, I certainly can't change that. But I am looking for a comparable solution for the size of my data. One solution is to have a temporary table, insert all 2 billion rows, and then copy the distinct entries to another table. This would be like one massive sort? Is this the only/best solution using postgresql? Thanks! George
В списке pgsql-general по дате отправления: