(2018/03/23 21:02), Robert Haas wrote:
> On Fri, Mar 23, 2018 at 7:55 AM, Etsuro Fujita
> <fujita.etsuro@lab.ntt.co.jp> wrote:
>> Maybe I'm missing something, but I think the proposed FDW API could be used
>> for the COPY case as well with some modifications to core. If so, my
>> question is: should we support COPY into foreign tables as well? I think
>> that if we support COPY tuple routing for foreign partitions, it would be
>> better to support direct COPY into foreign partitions as well.
>
> Yes, I really, really want to be able to support COPY. If you think
> you can make that work with this API -- let's see it!
OK
>>> 3. It looks like we're just doing an INSERT for every row, which is
>>> pretty much an anti-pattern for inserting data into a PostgreSQL
>>> database. COPY is way faster, and even multi-row inserts are
>>> significantly faster.
>>
>> I planned to work on a new FDW API for using COPY for COPY tuple routing [1],
>> but I didn't have time for that in this development cycle, so I'm
>> re-planning to work on that for PG12. I'm not sure we can optimize that
>> insertion using multi-row inserts because tuple routing works row by row, as
>> you know. Anyway, I think these would be beyond the scope of the first
>> version.
>
> My concern is that if we add APIs now that only support single-row
> inserts, we'll have to rework them again in order to support multi-row
> inserts. I'd like to avoid that, if possible.
Yeah, but we would have the same issue for normal inserts into foreign
tables. For the normal case, my idea for supporting multi-row inserts
is to use the DirectModify FDW API. With that API, I think we could
push down an INSERT with multiple VALUES sublists to the remote server,
but I'm not sure we could extend that to the tuple-routing case.
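
To illustrate what "an INSERT with multiple VALUES sublists" means, here
is a toy sketch in Python (chosen for brevity; it is not postgres_fdw
code). The function name build_multirow_insert and the table and column
names are made up for the example, and identifier quoting is left out.

```python
def build_multirow_insert(table, columns, rows):
    """Build one INSERT with a VALUES sublist per row, using $n placeholders.

    Hypothetical helper for illustration only; identifiers are not quoted.
    """
    if not rows:
        raise ValueError("need at least one row")
    ncols = len(columns)
    sublists = []
    params = []
    for i, row in enumerate(rows):
        if len(row) != ncols:
            raise ValueError("row width does not match column list")
        # Placeholders are numbered consecutively across all sublists.
        placeholders = ", ".join(f"${i * ncols + j + 1}" for j in range(ncols))
        sublists.append(f"({placeholders})")
        params.extend(row)
    sql = (
        f"INSERT INTO {table} ({', '.join(columns)}) "
        f"VALUES {', '.join(sublists)}"
    )
    return sql, params
```

One statement then carries all the rows in a single round trip, which
is why it needs the whole row set up front. That is the part that is
hard to square with tuple routing working row by row.
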
> I think for bulk
> inserts we'll need an API that says "here's a row, store it or buffer
> it as you like" and then another API that says "flush any buffered
> rows to the actual table and perform any necessary cleanup". Or maybe
> in postgres_fdw the first API could start a COPY if not already done
> and send the row, and the second one could end the COPY.
That's exactly what I have in mind.
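
A rough model of that two-call shape, again as a toy Python sketch
rather than real FDW code. The class name BufferedForeignInsert, the
send_batch callback and the batch_size parameter are all assumptions
made for the example; nothing here is an existing or proposed API.

```python
class BufferedForeignInsert:
    """Toy model of the two-call API: insert() may buffer, flush() sends the rest."""

    def __init__(self, send_batch, batch_size=100):
        # send_batch is a hypothetical callback that ships a list of rows
        # to the remote side (e.g. as one multi-row INSERT or a COPY chunk).
        self.send_batch = send_batch
        self.batch_size = batch_size
        self.buffer = []

    def insert(self, row):
        # "Here's a row, store it or buffer it as you like."
        self.buffer.append(row)
        if len(self.buffer) >= self.batch_size:
            self._send()

    def flush(self):
        # "Flush any buffered rows to the actual table and clean up."
        if self.buffer:
            self._send()

    def _send(self):
        # Hand over a copy so the caller never sees the buffer being reused.
        self.send_batch(list(self.buffer))
        self.buffer.clear()
```

With this split, tuple routing can keep handing over one row at a time
while the FDW decides how to batch them. The COPY variant would differ
only in that the first insert() starts the COPY and flush() ends it.
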
Best regards,
Etsuro Fujita