Re: POC: postgres_fdw insert batching
От | Ashutosh Bapat |
---|---|
Тема | Re: POC: postgres_fdw insert batching |
Дата | |
Msg-id | CAG-ACPX4a=bfVS4Tc8NPCL9fu1sQNotsAVXdq-x+FD9davJHpQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: POC: postgres_fdw insert batching (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Список | pgsql-hackers |
On Tue, 30 Jun 2020 at 22:23, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
>I didn't get this. We are executing an INSERT on the foreign server,
>so we get the number of rows INSERTed from that server. We should just
>add those up across batches. If there's a failure, it would abort the
>transaction, local as well as remote.
>
True, but it's not the FDW code doing the counting - it's the caller,
depending on whether the ExecForeignInsert returns a valid slot or NULL.
So it's not quite possible to just return a number of inserted tuples,
as returned by the remote server.
Hmm yes, now I remember that bit. So for every row buffered, we return a valid slot without knowing whether that row was inserted on the remote server or not. I think we have that problem even now where a single INSERT might result in multiple INSERTs on the remote server (rare but not completely impossible).
>In your patch, I see that an INSERT statement with batch is
>constructed as INSERT INTO ... VALUES (...), (...) as many values as
>the batch size. That won't work as is for UPDATE/DELETE since we can't
>pass multiple pairs of ctids and columns to be updated for each ctid
>in one statement. Maybe we could build as many UPDATE/DELETE
>statements as the size of a batch, but that would be ugly. What we
>need is a feature like a batch prepared statement in libpq similar to
>what JDBC supports
>((https://mkyong.com/jdbc/jdbc-preparedstatement-example-batch-update/).
>This will allow a single prepared statement to be executed with a
>batch of parameters, each batch corresponding to one foreign DML
>statement.
>
I'm pretty sure we could make it work with some array/unnest tricks to
build a relation, and use that as a source of data.
That sounds great. The solution will be limited to postgres_fdw only.
I don't see why not support both, the use cases are somewhat different I
think.
+1, if we can do both.
Best Wishes,
Ashutosh
В списке pgsql-hackers по дате отправления: