Bulkloading using COPY - ignore duplicates?
From | Lee Kindness |
---|---|
Subject | Bulkloading using COPY - ignore duplicates? |
Date | |
Msg-id | 15382.11982.324375.978316@elsick.csl.co.uk |
In reply to | Re: Bulkloading using COPY - ignore duplicates? (Lee Kindness <lkindness@csl.co.uk>) |
Responses | Re: Bulkloading using COPY - ignore duplicates? |
List | pgsql-hackers |
Gents,

I started quite a long thread about this back in September. To summarise, I was proposing that COPY FROM would not abort the transaction when it encountered data which would cause a uniqueness violation on the table index(s). Generally I think this was seen as a 'Good Thing'(TM) for a number of reasons:

1. Performance enhancements when doing bulk inserts - pre- or post-processing the data to remove duplicates is very time consuming. Likewise, the best tool should always be used for the job at hand, and for searching/removing things that's a database.

2. Feature parity with other database systems. For example, Oracle's SQL*Loader has a feature to not insert duplicates but rather move them to another file for later investigation.

Naturally the default behaviour would be the current one of assuming valid data. Also, the duplicate check would add nothing to the current code path for COPY FROM, so the default case would take no longer.

I attempted to add this functionality to PostgreSQL myself, but only got as far as an updated parser and a COPY FROM which resulted in a database recovery! So (here's the question finally) is it worthwhile adding this enhancement to the TODO list?

Thanks, Lee.

--
Lee Kindness, Senior Software Engineer, Concept Systems Limited.
http://services.csl.co.uk/ http://www.csl.co.uk/ +44 1315575595
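[Editor's note, not part of the original message: the duplicate-ignoring bulk load described above can be approximated in SQL with a staging table. The sketch below is not the proposed COPY change itself; the table, column, and file names (navdata, id, payload, /tmp/navdata.dump) are hypothetical.]

```sql
-- Minimal sketch of a duplicate-ignoring bulk load via a staging table.
-- Assumes a target table navdata(id, payload) with a unique index on id.
BEGIN;

-- Staging table with the same shape as the target, but no unique index,
-- so COPY accepts the raw file even if it contains duplicate keys.
CREATE TEMP TABLE navdata_stage (LIKE navdata INCLUDING DEFAULTS);

-- Server-side bulk load of the raw dump file (hypothetical path).
COPY navdata_stage FROM '/tmp/navdata.dump';

-- Move across only rows whose key is not already present in the target,
-- collapsing duplicates within the file as well.
-- On PostgreSQL 9.5 and later, INSERT ... ON CONFLICT DO NOTHING gives
-- the same effect for the existing-row case.
INSERT INTO navdata (id, payload)
SELECT DISTINCT ON (s.id) s.id, s.payload
FROM navdata_stage s
WHERE NOT EXISTS (SELECT 1 FROM navdata t WHERE t.id = s.id);

COMMIT;
```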