Re: Bulkloading using COPY - ignore duplicates?
From | Bruce Momjian |
---|---|
Subject | Re: Bulkloading using COPY - ignore duplicates? |
Date | |
Msg-id | 200201022109.g02L9aW27520@candle.pha.pa.us |
In response to | Re: Bulkloading using COPY - ignore duplicates? (Lee Kindness <lkindness@csl.co.uk>) |
Responses | Re: Bulkloading using COPY - ignore duplicates? |
List | pgsql-hackers |
Lee Kindness wrote:
> Tom Lane writes:
>  > Lee Kindness <lkindness@csl.co.uk> writes:
>  > > In an ideal world 'COPY FROM' would only be used with data output by
>  > > 'COPY TO' and it would be nice and sanitised. However in some fields
>  > > this often is not a possibility due to performance constraints!
>  > Of course, the more bells and whistles we add to COPY, the slower it
>  > will get, which rather defeats the purpose no?
>
> Indeed, but as I've mentioned in this thread in the past, the code
> path for COPY FROM already does a check against the unique index (if
> there is one) but bombs-out rather than handling it...
>
> It wouldn't add any execution time if there were no duplicates in the
> input!

I know many purists object to allowing COPY to discard invalid rows in
COPY input, but it seems we have lots of requests for this feature, with
few workarounds except pre-processing the flat file. Of course, if they
use INSERT, they will get errors that they can just ignore. I don't see
how allowing errors in COPY is any more illegal, except that COPY is one
command while multiple INSERTs are separate commands.

Seems we need to allow such a capability, if only crudely. I don't
think we can create a discard file because of the problem with remote
COPY.

I think we can allow something like:

	COPY FROM '/tmp/x' WITH ERRORS 2

meaning we will allow at most two errors and will report the error line
numbers to the user. I think this syntax clearly indicates that errors
are being accepted in the input. An alternate syntax would allow an
unlimited number of errors:

	COPY FROM '/tmp/x' WITH ERRORS

The errors can be non-unique errors, or even CHECK constraint errors.

Unless I hear complaints, I will add it to TODO.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
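The semantics proposed above (accept up to N bad rows, report their line numbers) can be approximated on the client side with row-by-row INSERTs, which is the workaround the message alludes to. The following is only a rough sketch under stated assumptions: it uses Python's `sqlite3` module and an in-memory SQLite database as a stand-in for PostgreSQL, and the table `t`, its columns, and the function `load_with_error_limit` are hypothetical names invented for illustration. It is not an implementation of `WITH ERRORS`, which is only a proposal in this message.

```python
import sqlite3


def load_with_error_limit(conn, rows, max_errors=None):
    """Insert rows one at a time, skipping rows that violate a constraint.

    max_errors=None tolerates any number of bad rows (like a bare
    "WITH ERRORS"); an integer aborts and rolls back once more than
    that many rows have been rejected (like "WITH ERRORS 2").
    Returns the 1-based input line numbers of the rejected rows.
    """
    rejected = []
    try:
        for line_no, row in enumerate(rows, start=1):
            try:
                # Hypothetical table "t"; see the setup below.
                conn.execute("INSERT INTO t (id, qty) VALUES (?, ?)", row)
            except sqlite3.IntegrityError:
                # Raised for both duplicate-key and CHECK violations.
                rejected.append(line_no)
                if max_errors is not None and len(rejected) > max_errors:
                    raise RuntimeError(
                        f"more than {max_errors} bad rows; failed lines: {rejected}"
                    )
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    return rejected


# Assumed example schema: a unique key plus a CHECK constraint.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, qty INTEGER CHECK (qty >= 0))")
conn.commit()

# Line 2 duplicates the key of line 1; line 3 fails the CHECK constraint.
rows = [(1, 10), (1, 20), (2, -5), (3, 7)]
bad_lines = load_with_error_limit(conn, rows, max_errors=2)
print("rejected input lines:", bad_lines)
```

Note that SQLite keeps the transaction open after a failed statement, which is what lets this loop continue. In PostgreSQL an error aborts the enclosing transaction, so the same loop would need each INSERT in its own transaction or savepoint, which is part of why a COPY-level option was being requested.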