Copy Bulk Ignore Duplicated
From: Leandro Guimarães
Subject: Copy Bulk Ignore Duplicated
Date:
Msg-id: CAJV35FN-pL3-oYHVDpGUM0coD0zgcF8c+xgcAu+VmUs27JyCmg@mail.gmail.com
Responses:
  Re: Copy Bulk Ignore Duplicated
  Re: Copy Bulk Ignore Duplicated
  Re: Copy Bulk Ignore Duplicated
List: pgsql-general
Hi,
I have a scenario with a large table, and I'm trying to insert data into it with a COPY command from a CSV file.
Everything works, but sometimes the source .csv file contains rows that already exist in the previously populated table. If I add a unique constraint and run the COPY command, I get an error that aborts the whole insertion.
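Roughly what I'm running (the file path and options here are just illustrative):

COPY final_table (name, document)
FROM '/path/to/data.csv'
WITH (FORMAT csv, HEADER true);
-- on a duplicate row this aborts with something like
-- ERROR:  duplicate key value violates unique constraint "..."
-- and the whole COPY is rolled back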
I've tried loading the data into a temporary table first and filling the main one with DISTINCT, like this (the field and table names are just examples):
INSERT INTO final_table (name, document)
SELECT DISTINCT name, document
FROM tmp_TABLE t1
WHERE NOT EXISTS (
    SELECT 1 FROM final_table t2
    WHERE (t2.name, t2.document)
          IS NOT DISTINCT FROM (t1.name, t1.document)
);
The problem is that final_table is a large (and partitioned) table, and this query takes a long time to execute.
Does anyone have any idea (really, anything would be great) how to solve this situation? I need duplicates to be ignored instead of raising an error.
I'm using PostgreSQL 9.4, so I can't use INSERT ... ON CONFLICT, and upgrading is not an option.
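For reference, on 9.5+ I could simply do something like this (same example names as above), but that's exactly the feature I don't have:

INSERT INTO final_table (name, document)
SELECT name, document
FROM tmp_TABLE
ON CONFLICT DO NOTHING;  -- requires PostgreSQL 9.5 or later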
Thanks and Kind Regards!
Leandro Guimarães