Re: COPY enhancements
From: Emmanuel Cecchet
Subject: Re: COPY enhancements
Date:
Msg-id: 4AD48758.7090502@frogthinker.org
In reply to: Re: COPY enhancements (Tom Lane <tgl@sss.pgh.pa.us>)
Responses: Re: COPY enhancements
           Re: COPY enhancements
List: pgsql-hackers
Tom Lane wrote:
> Ultimately, there's always going to be a tradeoff between speed and
> flexibility. It may be that we should just say "if you want to import
> dirty data, it's gonna cost ya" and not worry about the speed penalty
> of subtransaction-per-row. But that still leaves us with the 2^32
> limit. I wonder whether we could break down COPY into sub-sub
> transactions to work around that...

Regarding that tradeoff between speed and flexibility, I think we could propose multiple options:

- maximum speed: the current implementation, which fails on the first error.
- speed with error logging: the COPY command fails if there is an error, but it continues to log all errors.
- speed with error logging, best effort: no use of sub-transactions, but errors that can safely be trapped with PG_TRY/PG_CATCH (no index violation, no before-insert trigger, etc.) are logged and the command can complete.
- pre-loading (2-phase COPY): phase 1 copies good tuples into a [temp] table and bad tuples into an error table; phase 2 pushes the good tuples to the destination table. Note that if phase 2 fails, it can be retried, since the temp table would be dropped only on success of phase 2. (A rough SQL sketch of phase 2 is at the end of this message.)
- slow but flexible: wrap every row in a sub-transaction -> is there any real benefit compared to pg_loader?

Tom was also suggesting 'refactoring COPY into a series of steps that the user can control'. What would these steps be? Would they be per row and allow discarding a bad tuple?

Emmanuel

--
Emmanuel Cecchet
FTO @ Frog Thinker
Open Source Development & Consulting
--
Web: http://www.frogthinker.org
email: manu@frogthinker.org
Skype: emmanuel_cecchet
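To make the pre-loading option a bit more concrete, here is a minimal sketch of what phase 2 could look like in plain SQL, assuming phase 1 has already split the input into a staging table and an error table. The names copy_staging, copy_errors and destination are placeholders, and phase 1 itself would need the proposed error-logging machinery to populate them:

  BEGIN;
  -- Phase 2: push the good tuples to the destination table.
  INSERT INTO destination SELECT * FROM copy_staging;
  -- Drop the staging table only once the insert has succeeded, so a
  -- failed phase 2 can simply be retried against the surviving table.
  DROP TABLE copy_staging;
  COMMIT;
  -- copy_errors is kept around so the bad tuples can be inspected.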