Thread: Fdw batch insert error out when set batch_size > 65535

Fdw batch insert error out when set batch_size > 65535

From:
"houzj.fnst@fujitsu.com"
Date:
Hi,

When reading the code, I noticed a possible issue with fdw batch insert.
When I set batch_size > 65535 and insert more than 65535 rows into a foreign table,
it throws an error:

For example:

------------------
CREATE FOREIGN TABLE vzdalena_tabulka2(a int, b varchar)
  SERVER omega_db
  OPTIONS (table_name 'tabulka', batch_size '65536');

INSERT INTO vzdalena_tabulka2 SELECT i, 'AHOJ' || i FROM generate_series(1,65536) g(i);

ERROR:  number of parameters must be between 0 and 65535
CONTEXT:  remote SQL command: INSERT INTO public.tabulka(a, b) VALUES ($1, $2), ($3, $4)...
------------------

Actually, I think if (number of columns) * (number of rows) > 65535, then we can
get this error. But since almost no one will set such a large value, I think adjusting the
batch_size automatically when doing an INSERT seems like an acceptable solution.

In postgresGetForeignModifyBatchSize(), if we find that (num of params) * batch_size
is greater than the limit (65535), then we set it to 65535 / (num of params).

Thoughts?

Best regards,
houzj



Attachments

Re: Fdw batch insert error out when set batch_size > 65535

From:
Bharath Rupireddy
Date:
On Fri, May 21, 2021 at 8:18 AM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
>
> Hi,
>
> When reading the code, I noticed some possible issue about fdw batch insert.
> When I set batch_size > 65535 and insert more than 65535 rows into foreign table,
> It will throw an error:
>
> For example:
>
> ------------------
> CREATE FOREIGN TABLE vzdalena_tabulka2(a int, b varchar)
>   SERVER omega_db
>   OPTIONS (table_name 'tabulka', batch_size '65536');
>
> INSERT INTO vzdalena_tabulka2 SELECT i, 'AHOJ' || i FROM generate_series(1,65536) g(i);
>
> ERROR:  number of parameters must be between 0 and 65535
> CONTEXT:  remote SQL command: INSERT INTO public.tabulka(a, b) VALUES ($1, $2), ($3, $4)...
> ------------------
>
> Actually, I think if the (number of columns) * (number of rows) > 65535, then we can
> get this error. But, I think almost no one will set such a large value, so I think adjust the
> batch_size automatically when doing INSERT seems an acceptable solution.
>
> In the postgresGetForeignModifyBatchSize(), if we found the (num of param) * batch_size
> Is greater than the limit(65535), then we set it to 65535 / (num of param).
>
> Thoughts ?

+1 to limit batch_size for postgres_fdw and it's a good idea to have a
macro for the max params.

Few comments:
1) How about using the macro in the error message, something like below?
        appendPQExpBuffer(errorMessage,
                          libpq_gettext("number of parameters must be between 0 and %d\n"),
                          PQ_MAX_PARAM_NUMBER);
2) I think we can choose not to mention the 65535 because it's hard to
maintain if that's changed in the libpq protocol sometime in the future. How
about
"The final number of rows postgres_fdw inserts in a batch actually
depends on the number of columns and the provided batch_size value.
This is because of the limit the libpq protocol (which postgres_fdw
uses to connect to a remote server) has on the number of query
parameters that can be specified per query. For instance, if the
number of columns * batch_size is more than the limit, then the libpq
emits an error. But postgres_fdw adjusts the batch_size to avoid this
error."
instead of
+       overrides an option specified for the server. Note if the batch size
+       exceed the protocol limit (number of columns * batch_size > 65535),
+       then the actual batch size will be less than the specified batch_size.
3) I think "postgres_fdw should insert in each insert operation"
doesn't hold after this patch. We can reword it to "postgres_fdw
inserts in each insert operation".
       This option specifies the number of rows
<filename>postgres_fdw</filename>
       should insert in each insert operation. It can be specified for a
4) How about naming the macro as PQ_QUERY_PARAM_MAX_LIMIT?
5) We can use macro in code comments as well.
+     * 65535, so set the batch_size to not exceed limit in a batch insert.
6) I think both code and docs patches can be combined into a single patch.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



RE: Fdw batch insert error out when set batch_size > 65535

From:
"houzj.fnst@fujitsu.com"
Date:
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Sent: Friday, May 21, 2021 1:42 PM
> On Fri, May 21, 2021 at 8:18 AM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> > Actually, I think if the (number of columns) * (number of rows) >
> > 65535, then we can get this error. But, I think almost no one will set
> > such a large value, so I think adjust the batch_size automatically when doing
> INSERT seems an acceptable solution.
> >
> > In the postgresGetForeignModifyBatchSize(), if we found the (num of
> > param) * batch_size Is greater than the limit(65535), then we set it to 65535 /
> (num of param).
> >
> > Thoughts ?
> 
> +1 to limit batch_size for postgres_fdw and it's a good idea to have a
> macro for the max params.
>
> Few comments:

Thanks for the comments. 

> 1) How about using macro in the error message, something like below?
>         appendPQExpBuffer(errorMessage,
>                           libpq_gettext("number of parameters must be
> between 0 and %d\n"),
>                           PQ_MAX_PARAM_NUMBER);

Changed.

> 2) I think we can choose not mention the 65535 because it's hard to maintain if
> that's changed in libpq protocol sometime in future. How about "The final
> number of rows postgres_fdw inserts in a batch actually depends on the
> number of columns and the provided batch_size value.
> This is because of the limit the libpq protocol (which postgres_fdw uses to
> connect to a remote server) has on the number of query parameters that can
> be specified per query. For instance, if the number of columns * batch_size is
> more than the limit, then the libpq emits an error. But postgres_fdw adjusts the
> batch_size to avoid this error."
> instead of
> +       overrides an option specified for the server. Note if the batch size
> +       exceed the protocol limit (number of columns * batch_size > 65535),
> +       then the actual batch size will be less than the specified batch_size.

Thanks, your description looks better. Changed.

> 3) I think "postgres_fdw should insert in each insert operation"
> doesn't hold after this  patch. We can reword it to "postgres_fdw inserts in
> each insert operation".
>        This option specifies the number of rows
> <filename>postgres_fdw</filename>
>        should insert in each insert operation. It can be specified for a

Changed.

> 4) How about naming the macro as PQ_QUERY_PARAM_MAX_LIMIT?

Changed.

> 5) We can use macro in code comments as well.
> +     * 65535, so set the batch_size to not exceed limit in a batch insert.

Thanks, I reorganized the code comments.

> 6) I think both code and docs patches can be combined into a single patch.

Combined.

Attaching V2 patch. Please consider it for further review.

Best regards,
houzj

Attachments

Re: Fdw batch insert error out when set batch_size > 65535

From:
Bharath Rupireddy
Date:
On Fri, May 21, 2021 at 1:19 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
> Attaching V2 patch. Please consider it for further review.

Thanks for the patch. Some more comments:

1) Can fmstate->p_nums ever be 0 in postgresGetForeignModifyBatchSize?
If it can, then instead of an assert, I think we can have
something like below, since we use it in the division.
    if (fmstate->p_nums > 0 &&
       (batch_size * fmstate->p_nums > PQ_QUERY_PARAM_MAX_LIMIT))
    {
        batch_size = PQ_QUERY_PARAM_MAX_LIMIT / fmstate->p_nums;
    }
Also, please remove the { and } from the above if condition;
for one-line statements we don't normally use { and }.
2) An empty line after the macro definition will be good.
+#define PQ_QUERY_PARAM_MAX_LIMIT 65535
 extern int    PQsendQuery(PGconn *conn, const char *query);
3) I think we use:
<filename>postgres_fdw</filename> not postgres_fdw
<literal>batch_size</literal> not batch_size
the number of columns * <literal>batch_size</literal> not the number
of columns * batch_size
+       overrides an option specified for the server. Note the final number
+       of rows postgres_fdw inserts in a batch actually depends on the
+       number of columns and the provided batch_size value. This is because
+       of the limit the libpq protocol (which postgres_fdw uses to connect
+       to a remote server) has on the number of query parameters that can
+       be specified per query. For instance, if the number of columns
* batch_size
+       is more than the limit, then the libpq emits an error. But postgres_fdw
+       adjusts the batch_size to avoid this error.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Fdw batch insert error out when set batch_size > 65535

From:
Andrew Dunstan
Date:
On 5/21/21 5:03 AM, Bharath Rupireddy wrote:
> On Fri, May 21, 2021 at 1:19 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
>> Attaching V2 patch. Please consider it for further review.
> Thanks for the patch. Some more comments:
>
> 1) Can fmstate->p_nums ever be 0 in postgresGetForeignModifyBatchSize?
> By any chance, if it can, I think instead of an assert, we can have
> something like below, since we are using it in the division.
>     if (fmstate->p_nums > 0 &&
>        (batch_size * fmstate->p_nums > PQ_QUERY_PARAM_MAX_LIMIT))
>     {
>         batch_size = PQ_QUERY_PARAM_MAX_LIMIT / fmstate->p_nums;
>     }
> Also, please remove the { and } for the above if condition, because
> for 1 line statements we don't normally use { and }



We used to filter that out in pgindent IIRC but we don't any more.
IMNSHO there are cases when it makes the code more readable, especially
if (as here) the condition spans more than one line. I also personally
dislike having one branch of an "if" statement with braces and another
without - it looks far better to my eyes to have all or none with
braces. But I realize that's a matter of taste, and there are plenty of
examples in the code running counter to my taste here.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com




RE: Fdw batch insert error out when set batch_size > 65535

From:
"houzj.fnst@fujitsu.com"
Date:
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Sent: Friday, May 21, 2021 5:03 PM
> On Fri, May 21, 2021 at 1:19 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> > Attaching V2 patch. Please consider it for further review.
> 
> Thanks for the patch. Some more comments:
> 
> 1) Can fmstate->p_nums ever be 0 in postgresGetForeignModifyBatchSize?
> By any chance, if it can, I think instead of an assert, we can have something like
> below, since we are using it in the division.
>     if (fmstate->p_nums > 0 &&
>        (batch_size * fmstate->p_nums > PQ_QUERY_PARAM_MAX_LIMIT))
>     {
>         batch_size = PQ_QUERY_PARAM_MAX_LIMIT / fmstate->p_nums;
>     }
> Also, please remove the { and } for the above if condition, because for 1 line
> statements we don't normally use { and }
> 2) An empty line after the macro definition will be good.
> +#define PQ_QUERY_PARAM_MAX_LIMIT 65535
>  extern int    PQsendQuery(PGconn *conn, const char *query);
> 3) I think we use:
> <filename>postgres_fdw</filename> not postgres_fdw
> <literal>batch_size</literal> not batch_size the number of columns *
> <literal>batch_size</literal> not the number of columns * batch_size
> +       overrides an option specified for the server. Note the final number
> +       of rows postgres_fdw inserts in a batch actually depends on the
> +       number of columns and the provided batch_size value. This is because
> +       of the limit the libpq protocol (which postgres_fdw uses to connect
> +       to a remote server) has on the number of query parameters that can
> +       be specified per query. For instance, if the number of columns
> * batch_size
> +       is more than the limit, then the libpq emits an error. But postgres_fdw
> +       adjusts the batch_size to avoid this error.

Thanks for the comments. I have addressed all of them in the v3 patch.
BTW, is the batch_size issue here an Open Item for PG14?

Best regards,
houzj

Attachments

Re: Fdw batch insert error out when set batch_size > 65535

From:
Bharath Rupireddy
Date:
On Tue, May 25, 2021 at 1:08 PM houzj.fnst@fujitsu.com
<houzj.fnst@fujitsu.com> wrote:
> Thanks for the comments. I have addressed all comments to the v3 patch.

Thanks! The patch basically looks good to me except that it is missing
a commit message. I think it can be added now.

> BTW, Is the batch_size issue here an Open Item of PG14 ?

IMO, the issue you found when setting batch_size to a too-high value
is extreme-case testing of the feature added by commit b663a4136,
which introduced the batch_size parameter. So, it's a bug to me. I
think you can add it as a bug in the commitfest and let the committers
take the call.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Fdw batch insert error out when set batch_size > 65535

From:
Bharath Rupireddy
Date:
On Tue, May 25, 2021 at 2:47 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
>
> On Tue, May 25, 2021 at 1:08 PM houzj.fnst@fujitsu.com
> <houzj.fnst@fujitsu.com> wrote:
> > Thanks for the comments. I have addressed all comments to the v3 patch.
>
> Thanks! The patch basically looks good to me except that it is missing
> a commit message. I think it can be added now.

With the v3 patch, I observed a failure in postgres_fdw test cases with
an insert query in prepared statements. The root cause is that in
postgresGetForeignModifyBatchSize, fmstate can be NULL (see the
existing code that handles the fmstate NULL case). I fixed this, and
added a commit message. PSA v4 patch.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachments

Re: Fdw batch insert error out when set batch_size > 65535

From:
Tomas Vondra
Date:
On 5/26/21 8:57 AM, Bharath Rupireddy wrote:
> On Tue, May 25, 2021 at 2:47 PM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
>>
>> On Tue, May 25, 2021 at 1:08 PM houzj.fnst@fujitsu.com
>> <houzj.fnst@fujitsu.com> wrote:
>>> Thanks for the comments. I have addressed all comments to the v3 patch.
>>
>> Thanks! The patch basically looks good to me except that it is missing
>> a commit message. I think it can be added now.
> 
> With v3 patch, I observed failure in postgres_fdw test cases with
> insert query in prepared statements. Root cause is that in
> postgresGetForeignModifyBatchSize, fmstate can be null (see the
> existing code which handles for fmstate null cases). I fixed this, and
> added a commit message. PSA v4 patch.
> 

Thanks. In what situation is the fmstate NULL? If it is NULL, the 
current code simply skips the line adjusting it. Doesn't that mean we 
may not actually fix the bug in that case?

Also, I think it'd be good to keep the existing comment, probably as the first 
line of the new comment block.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Fdw batch insert error out when set batch_size > 65535

From:
Bharath Rupireddy
Date:
On Wed, May 26, 2021 at 6:36 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> On 5/26/21 8:57 AM, Bharath Rupireddy wrote:
> > On Tue, May 25, 2021 at 2:47 PM Bharath Rupireddy
> > <bharath.rupireddyforpostgres@gmail.com> wrote:
> >>
> >> On Tue, May 25, 2021 at 1:08 PM houzj.fnst@fujitsu.com
> >> <houzj.fnst@fujitsu.com> wrote:
> >>> Thanks for the comments. I have addressed all comments to the v3 patch.
> >>
> >> Thanks! The patch basically looks good to me except that it is missing
> >> a commit message. I think it can be added now.
> >
> > With v3 patch, I observed failure in postgres_fdw test cases with
> > insert query in prepared statements. Root cause is that in
> > postgresGetForeignModifyBatchSize, fmstate can be null (see the
> > existing code which handles for fmstate null cases). I fixed this, and
> > added a commit message. PSA v4 patch.
> >
>
> Thanks. In what situation is the fmstate NULL? If it is NULL, the
> current code simply skips the line adjusting it. Doesn't that mean we
> may not actually fix the bug in that case?

fmstate, i.e. resultRelInfo->ri_FdwState, is NULL for EXPLAIN without
ANALYZE; the comment below says so, and we can't hit the bug there
because we don't actually execute the insert statement. The bug occurs
on the remote server when the insert query with that many query
parameters is submitted to it. I'm not sure if there are any other
cases where it can be NULL.
    /*
     * In EXPLAIN without ANALYZE, ri_fdwstate is NULL, so we have to lookup
     * the option directly in server/table options. Otherwise just use the
     * value we determined earlier.
     */
    if (fmstate)
        batch_size = fmstate->batch_size;
    else
        batch_size = get_batch_size_option(resultRelInfo->ri_RelationDesc);

> Also, I think it'd be keep the existing comment, probably as the first
> line of the new comment block.

Do you mean we need to retain "/* Otherwise use the batch size
specified for server/table. */"? If so, PSA v5.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachments

RE: Fdw batch insert error out when set batch_size > 65535

From:
"houzj.fnst@fujitsu.com"
Date:
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Sent: Wednesday, May 26, 2021 9:56 PM
> On Wed, May 26, 2021 at 6:36 PM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
> >
> > On 5/26/21 8:57 AM, Bharath Rupireddy wrote:
> > > On Tue, May 25, 2021 at 2:47 PM Bharath Rupireddy
> > > <bharath.rupireddyforpostgres@gmail.com> wrote:
> > >>
> > >> On Tue, May 25, 2021 at 1:08 PM houzj.fnst@fujitsu.com
> > >> <houzj.fnst@fujitsu.com> wrote:
> > >>> Thanks for the comments. I have addressed all comments to the v3
> patch.
> > >>
> > >> Thanks! The patch basically looks good to me except that it is
> > >> missing a commit message. I think it can be added now.
> > >
> > > With v3 patch, I observed failure in postgres_fdw test cases with
> > > insert query in prepared statements. Root cause is that in
> > > postgresGetForeignModifyBatchSize, fmstate can be null (see the
> > > existing code which handles for fmstate null cases). I fixed this,
> > > and added a commit message. PSA v4 patch.
> > >
> >
> > Thanks. In what situation is the fmstate NULL? If it is NULL, the
> > current code simply skips the line adjusting it. Doesn't that mean we
> > may not actually fix the bug in that case?
> 
> fmstate i.e. resultRelInfo->ri_FdwState is NULL for EXPLAIN without ANALYZE
> cases, below comment says it and we can't get the bug because we don't
> actually execute the insert statement. The bug occurs on the remote server
> when the insert query with those many query parameters is submitted to the
> remote server.

Agreed.
ri_FdwState is initialized in postgresBeginForeignInsert or postgresBeginForeignModify,
and I think those functions are always invoked before getting the batch_size.

Only in EXPLAIN mode is ri_FdwState left uninitialized.

    /*
     * Do nothing in EXPLAIN (no ANALYZE) case.  resultRelInfo->ri_FdwState
     * stays NULL.
     */
    if (eflags & EXEC_FLAG_EXPLAIN_ONLY)
        return;

Best regards,
houzj
 


Re: Fdw batch insert error out when set batch_size > 65535

From:
Tomas Vondra
Date:
Hi,

I took a look at this patch today. I did some minor changes, mostly:

1) change the code limiting batch_size from

    if (fmstate->p_nums > 0 &&
       (batch_size * fmstate->p_nums > PQ_QUERY_PARAM_MAX_LIMIT))
    {
        batch_size = PQ_QUERY_PARAM_MAX_LIMIT / fmstate->p_nums;
    }

to

    if (fmstate && fmstate->p_nums > 0)
        batch_size = Min(batch_size,
                         PQ_QUERY_PARAM_MAX_LIMIT / fmstate->p_nums);

which I think is a somewhat clearer / more common pattern.


2) I've reworded the docs a bit, splitting the single para into two. I
think this makes it clearer.


Attached is a patch doing this. Please check the commit message etc.
Barring objections I'll get it committed in a couple days.



regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments

Re: Fdw batch insert error out when set batch_size > 65535

From:
Bharath Rupireddy
Date:
On Mon, May 31, 2021 at 1:21 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
>
> Hi,
>
> I took at this patch today. I did some minor changes, mostly:
>
> 1) change the code limiting batch_size from
>
>     if (fmstate->p_nums > 0 &&
>        (batch_size * fmstate->p_nums > PQ_QUERY_PARAM_MAX_LIMIT))
>     {
>         batch_size = PQ_QUERY_PARAM_MAX_LIMIT / fmstate->p_nums;
>     }
>
> to
>
>     if (fmstate && fmstate->p_nums > 0)
>         batch_size = Min(batch_size,
>                          PQ_QUERY_PARAM_MAX_LIMIT / fmstate->p_nums);
>
> which I think is somewhat clearer / more common patter.

Agree, that looks pretty good.

> 2) I've reworded the docs a bit, splitting the single para into two. I
> think this makes it clearer.

LGTM, except one thing: the batch_size description says "should
insert in", but the value entered for batch_size is not always
honoured, right? Because this patch might adjust it.

       This option specifies the number of rows
<filename>postgres_fdw</filename>
       should insert in each insert operation. It can be specified for a

So, I suggest removing "should" and changing it to:

       This option specifies the number of rows
<filename>postgres_fdw</filename>
       inserts in each insert operation. It can be specified for a

> Attached is a patch doing this. Please check the commit message etc.
> Barring objections I'll get it committed in a couple days.

One minor comment:
In the commit message, Int16 is used
The FE/BE protocol identifies parameters with an Int16 index, which
limits the maximum number of parameters per query to 65535. With

and in the code comments uint16 is used.
+     * parameters in a batch is limited to 64k (uint16), so make sure we don't

Isn't it uint16 in the commit message too? Also, can we use 64k in the
commit message instead of 65535?

With Regards,
Bharath Rupireddy.



Re: Fdw batch insert error out when set batch_size > 65535

From:
Tomas Vondra
Date:

On 5/31/21 6:01 AM, Bharath Rupireddy wrote:
> On Mon, May 31, 2021 at 1:21 AM Tomas Vondra
> <tomas.vondra@enterprisedb.com> wrote:
>>
>> Hi,
>>
>> I took at this patch today. I did some minor changes, mostly:
>>
>> 1) change the code limiting batch_size from
>>
>>     if (fmstate->p_nums > 0 &&
>>        (batch_size * fmstate->p_nums > PQ_QUERY_PARAM_MAX_LIMIT))
>>     {
>>         batch_size = PQ_QUERY_PARAM_MAX_LIMIT / fmstate->p_nums;
>>     }
>>
>> to
>>
>>     if (fmstate && fmstate->p_nums > 0)
>>         batch_size = Min(batch_size,
>>                          PQ_QUERY_PARAM_MAX_LIMIT / fmstate->p_nums);
>>
>> which I think is somewhat clearer / more common patter.
> 
> Agree, that looks pretty good.
> 
>> 2) I've reworded the docs a bit, splitting the single para into two. I
>> think this makes it clearer.
> 
> LGTM, except one thing that the batch_size description says "should
> insert in", but it's not that the value entered for batch_size is
> always honoured right? Because this patch might it.
> 
>        This option specifies the number of rows
> <filename>postgres_fdw</filename>
>        should insert in each insert operation. It can be specified for a
> 
> So, I suggest to remove "should" and change it to:
> 
>        This option specifies the number of rows
> <filename>postgres_fdw</filename>
>        inserts in each insert operation. It can be specified for a
> 

I think the "should" indicates exactly that postgres_fdw may adjust the
batch size. Without it, it sounds much more definitive, so I kept it.

>> Attached is a patch doing this. Please check the commit message etc.
>> Barring objections I'll get it committed in a couple days.
> 
> One minor comment:
> In the commit message, Int16 is used
> The FE/BE protocol identifies parameters with an Int16 index, which
> limits the maximum number of parameters per query to 65535. With
> 
> and in the code comments uint16 is used.
> +     * parameters in a batch is limited to 64k (uint16), so make sure we don't
> 
> Isn't it uint16 in the commit message too? Also, can we use 64k in the
> commit message instead of 65535?
> 

No, the "Int16" refers to the FE/BE documentation, where we use Int16.
But in the C code we interpret it as uint16.

I've added a simple regression test to postgres_fdw, testing that batch
sizes > 65535 work fine, and pushed the fix.


I've considered checking the value in postgres_fdw_validator and just
rejecting anything over 65535, but I've decided against that. We'd still
need to adjust it depending on the number of columns anyway.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Fdw batch insert error out when set batch_size > 65535

From:
Bharath Rupireddy
Date:
On Wed, Jun 9, 2021 at 12:04 AM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> No, the "Int16" refers to the FE/BE documentation, where we use Int16.
> But in the C code we interpret it as uint16.

Hm. I see that in protocol.sgml Int16 is being used.

> I've added a simple regression test to postgres_fdw, testing that batch
> sizes > 65535 work fine, and pushed the fix.

I was earlier thinking of adding one, but stopped because it might
increase the regression test execution time. It looks like that's true:
with and without the test case it takes 17 sec and 4 sec respectively
on my dev system (which is 4X slower). I'm not sure if this is okay.

With Regards,
Bharath Rupireddy.



Re: Fdw batch insert error out when set batch_size > 65535

From:
Tom Lane
Date:
Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
>> I've added a simple regression test to postgres_fdw, testing that batch
>> sizes > 65535 work fine, and pushed the fix.

> I was earlier thinking of adding one, but stopped because it might
> increase the regression test execution time. It looks like that's true
> - with and without the test case it takes 17 sec and 4 sec
> respectively on my dev system which is 4X slower. I'm not sure if this
> is okay.

The cost, versus the odds of ever detecting a problem, doesn't
seem like a good tradeoff.

            regards, tom lane



Re: Fdw batch insert error out when set batch_size > 65535

From:
Tom Lane
Date:
I wrote:
> Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
>>> I've added a simple regression test to postgres_fdw, testing that batch
>>> sizes > 65535 work fine, and pushed the fix.

>> I was earlier thinking of adding one, but stopped because it might
>> increase the regression test execution time. It looks like that's true
>> - with and without the test case it takes 17 sec and 4 sec
>> respectively on my dev system which is 4X slower. I'm not sure if this
>> is okay.

> The cost, versus the odds of ever detecting a problem, doesn't
> seem like a good tradeoff.

I took a quick look and noted that on buildfarm member longfin
(to take a random example that's sitting a few feet from me),
the time for contrib-install-check went from 34 seconds before
this patch to 40 seconds after.  I find that completely
unacceptable compared to the likely value of this test case.

            regards, tom lane



Re: Fdw batch insert error out when set batch_size > 65535

From:
Tomas Vondra
Date:

On 6/9/21 8:28 AM, Tom Lane wrote:
> I wrote:
>> Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
>>>> I've added a simple regression test to postgres_fdw, testing that batch
>>>> sizes > 65535 work fine, and pushed the fix.
> 
>>> I was earlier thinking of adding one, but stopped because it might
>>> increase the regression test execution time. It looks like that's true
>>> - with and without the test case it takes 17 sec and 4 sec
>>> respectively on my dev system which is 4X slower. I'm not sure if this
>>> is okay.
> 
>> The cost, versus the odds of ever detecting a problem, doesn't
>> seem like a good tradeoff.
> 
> I took a quick look and noted that on buildfarm member longfin
> (to take a random example that's sitting a few feet from me),
> the time for contrib-install-check went from 34 seconds before
> this patch to 40 seconds after.  I find that completely
> unacceptable compared to the likely value of this test case.
> 

Note that the problem here is [1] - we're creating a lot of slots 
referencing the same tuple descriptor, which inflates the duration. 
There's a fix in the other thread, which eliminates ~99% of the 
overhead. I plan to push that fix soon (a day or two).


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Fdw batch insert error out when set batch_size > 65535

From:
Tomas Vondra
Date:

On 6/9/21 12:22 PM, Tomas Vondra wrote:
> 
> 
> On 6/9/21 8:28 AM, Tom Lane wrote:
>> I wrote:
>>> Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes:
>>>>> I've added a simple regression test to postgres_fdw, testing that 
>>>>> batch
>>>>> sizes > 65535 work fine, and pushed the fix.
>>
>>>> I was earlier thinking of adding one, but stopped because it might
>>>> increase the regression test execution time. It looks like that's true
>>>> - with and without the test case it takes 17 sec and 4 sec
>>>> respectively on my dev system which is 4X slower. I'm not sure if this
>>>> is okay.
>>
>>> The cost, versus the odds of ever detecting a problem, doesn't
>>> seem like a good tradeoff.
>>
>> I took a quick look and noted that on buildfarm member longfin
>> (to take a random example that's sitting a few feet from me),
>> the time for contrib-install-check went from 34 seconds before
>> this patch to 40 seconds after.  I find that completely
>> unacceptable compared to the likely value of this test case.
>>
> 
> Note that the problem here is [1] - we're creating a lot of slots 
> referencing the same tuple descriptor, which inflates the duration. 
> There's a fix in the other thread, which eliminates ~99% of the 
> overhead. I plan to push that fix soon (a day or two).
> 

Forgot to add the link:

[1] 
https://www.postgresql.org/message-id/ebbbcc7d-4286-8c28-0272-61b4753af761%40enterprisedb.com


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Fdw batch insert error out when set batch_size > 65535

From:
Tom Lane
Date:
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> Note that the problem here is [1] - we're creating a lot of slots 
> referencing the same tuple descriptor, which inflates the duration. 
> There's a fix in the other thread, which eliminates ~99% of the 
> overhead. I plan to push that fix soon (a day or two).

Oh, okay, as long as there's a plan to bring the time back down.

            regards, tom lane



Re: Fdw batch insert error out when set batch_size > 65535

From:
Tomas Vondra
Date:
On 6/9/21 3:28 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
>> Note that the problem here is [1] - we're creating a lot of slots
>> referencing the same tuple descriptor, which inflates the duration.
>> There's a fix in the other thread, which eliminates ~99% of the
>> overhead. I plan to push that fix soon (a day or two).
> 
> Oh, okay, as long as there's a plan to bring the time back down.
> 

Yeah. Sorry for not mentioning this in the original message about the 
new regression test.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Fdw batch insert error out when set batch_size > 65535

From
Tomas Vondra
Date

On 6/9/21 4:05 PM, Tomas Vondra wrote:
> On 6/9/21 3:28 PM, Tom Lane wrote:
>> Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
>>> Note that the problem here is [1] - we're creating a lot of slots
>>> referencing the same tuple descriptor, which inflates the duration.
>>> There's a fix in the other thread, which eliminates ~99% of the
>>> overhead. I plan to push that fix soon (a day or two).
>>
>> Oh, okay, as long as there's a plan to bring the time back down.
>>
> 
> Yeah. Sorry for not mentioning this in the original message about the
> new regression test.
> 

I've pushed a fix addressing the performance issue.

There's one caveat, though - for regular builds the slowdown is pretty
much eliminated. But with valgrind it's still considerably slower. For
postgres_fdw the "make check" used to take ~5 minutes for me, now it
takes >1h. And yes, this is entirely due to the new test case which is
generating / inserting 70k rows. So maybe the test case is not worth it
after all, and we should get rid of it.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Fdw batch insert error out when set batch_size > 65535

From
Tom Lane
Date
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> There's one caveat, though - for regular builds the slowdown is pretty
> much eliminated. But with valgrind it's still considerably slower. For
> postgres_fdw the "make check" used to take ~5 minutes for me, now it
> takes >1h. And yes, this is entirely due to the new test case which is
> generating / inserting 70k rows. So maybe the test case is not worth it
> after all, and we should get rid of it.

I bet the CLOBBER_CACHE animals won't like it much either.

I suggest what we do is leave it in place for long enough to get
a round of reports from those slow animals, and then (assuming
those reports are positive) drop the test.

            regards, tom lane



Re: Fdw batch insert error out when set batch_size > 65535

From
Alvaro Herrera
Date
On 2021-Jun-12, Tomas Vondra wrote:

> There's one caveat, though - for regular builds the slowdown is pretty
> much eliminated. But with valgrind it's still considerably slower. For
> postgres_fdw the "make check" used to take ~5 minutes for me, now it
> takes >1h. And yes, this is entirely due to the new test case which is
> generating / inserting 70k rows. So maybe the test case is not worth it
> after all, and we should get rid of it.

Hmm, what if the table is made 1600 columns wide -- would inserting 41
rows be sufficient to trigger the problem case?  If it does, maybe it
would reduce the runtime for valgrind/cache-clobber animals enough that
it's no longer a concern.
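To spell out the arithmetic behind this suggestion (a hypothetical Python sketch; the clamping rule is the one the fix applies in postgresGetForeignModifyBatchSize, but the function here is illustrative, not the actual C code):

```python
# Sketch of the parameter-count arithmetic discussed in this thread.
# PQ_QUERY_PARAM_MAX_LIMIT matches libpq's cap on bind parameters per
# query; the clamping mirrors the batch_size adjustment in postgres_fdw.

PQ_QUERY_PARAM_MAX_LIMIT = 65535  # protocol limit on bind parameters

def adjusted_batch_size(batch_size: int, params_per_row: int) -> int:
    """Clamp batch_size so params_per_row * batch_size fits the limit."""
    if params_per_row > 0 and batch_size * params_per_row > PQ_QUERY_PARAM_MAX_LIMIT:
        return PQ_QUERY_PARAM_MAX_LIMIT // params_per_row
    return batch_size

# With 1600 columns, 41 rows already need 65600 parameters - over the
# limit - so a 1600-column table keeps the failing test case tiny:
assert 41 * 1600 > PQ_QUERY_PARAM_MAX_LIMIT
# ...and the clamped batch size at that width is 40 rows per INSERT:
assert adjusted_batch_size(65536, 1600) == 40
```

So at that width 41 rows is the smallest batch that would have tripped the unclamped error, which is what makes this shape attractive for a cheap regression test.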

-- 
Álvaro Herrera       Valdivia, Chile
"At least to kernel hackers, who really are human, despite occasional
rumors to the contrary" (LWN.net)



Re: Fdw batch insert error out when set batch_size > 65535

From
Tomas Vondra
Date

On 6/13/21 2:40 AM, Alvaro Herrera wrote:
> On 2021-Jun-12, Tomas Vondra wrote:
> 
>> There's one caveat, though - for regular builds the slowdown is pretty
>> much eliminated. But with valgrind it's still considerably slower. For
>> postgres_fdw the "make check" used to take ~5 minutes for me, now it
>> takes >1h. And yes, this is entirely due to the new test case which is
>> generating / inserting 70k rows. So maybe the test case is not worth it
>> after all, and we should get rid of it.
> 
> Hmm, what if the table is made 1600 columns wide -- would inserting 41
> rows be sufficient to trigger the problem case?  If it does, maybe it
> would reduce the runtime for valgrind/cache-clobber animals enough that
> it's no longer a concern.
> 

Good idea. I gave that a try, creating a table with 1500 columns and
inserting 50 rows (so 75k parameters). See the attached patch.

While this cuts the runtime about in half (to ~30 minutes on my laptop),
that's probably not enough - it's still about ~6x longer than it used to
take. All these timings are with valgrind.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments

Re: Fdw batch insert error out when set batch_size > 65535

From
Bharath Rupireddy
Date
On Sun, Jun 13, 2021 at 6:10 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> On 2021-Jun-12, Tomas Vondra wrote:
>
> > There's one caveat, though - for regular builds the slowdown is pretty
> > much eliminated. But with valgrind it's still considerably slower. For
> > postgres_fdw the "make check" used to take ~5 minutes for me, now it
> > takes >1h. And yes, this is entirely due to the new test case which is
> > generating / inserting 70k rows. So maybe the test case is not worth it
> > after all, and we should get rid of it.
>
> Hmm, what if the table is made 1600 columns wide -- would inserting 41
> rows be sufficient to trigger the problem case?  If it does, maybe it
> would reduce the runtime for valgrind/cache-clobber animals enough that
> it's no longer a concern.

Yeah, that's a good idea. PSA patch that creates a table of 1600
columns and inserts 41 rows into the foreign table. If the batch_size
adjustment fix isn't there, we hit the error. On my dev system, the
postgres_fdw contrib regression tests take 4.5 sec and 5.7 sec with
and without the attached patch, respectively.

On Sun, Jun 13, 2021 at 7:25 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> Good idea. I gave that a try, creating a table with 1500 columns and
> inserting 50 rows (so 75k parameters). See the attached patch.

Thanks for the patch. I also prepared a patch, just sharing. I'm okay
if it's ignored.

With Regards,
Bharath Rupireddy.

Attachments

Re: Fdw batch insert error out when set batch_size > 65535

From
Tomas Vondra
Date

On 6/13/21 5:25 PM, Bharath Rupireddy wrote:
> On Sun, Jun 13, 2021 at 6:10 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>>
>> On 2021-Jun-12, Tomas Vondra wrote:
>>
>>> There's one caveat, though - for regular builds the slowdown is pretty
>>> much eliminated. But with valgrind it's still considerably slower. For
>>> postgres_fdw the "make check" used to take ~5 minutes for me, now it
>>> takes >1h. And yes, this is entirely due to the new test case which is
>>> generating / inserting 70k rows. So maybe the test case is not worth it
>>> after all, and we should get rid of it.
>>
>> Hmm, what if the table is made 1600 columns wide -- would inserting 41
>> rows be sufficient to trigger the problem case?  If it does, maybe it
>> would reduce the runtime for valgrind/cache-clobber animals enough that
>> it's no longer a concern.
> 
> Yeah, that's a good idea. PSA patch that creates the table of 1600
> columns and inserts 41 rows into the foreign table. If the batch_size
> adjustment fix isn't there, we will hit the error. On my dev system,
> postgres_fdw contrib regression tests execution time: with and without
> the attached patch 4.5 sec and 5.7 sec respectively.
> 

But we're discussing cases with valgrind and/or CLOBBER_CACHE_ALWAYS.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Fdw batch insert error out when set batch_size > 65535

From
Bharath Rupireddy
Date
On Sun, Jun 13, 2021 at 9:28 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> On 6/13/21 5:25 PM, Bharath Rupireddy wrote:
> > On Sun, Jun 13, 2021 at 6:10 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> >>
> >> On 2021-Jun-12, Tomas Vondra wrote:
> >>
> >>> There's one caveat, though - for regular builds the slowdown is pretty
> >>> much eliminated. But with valgrind it's still considerably slower. For
> >>> postgres_fdw the "make check" used to take ~5 minutes for me, now it
> >>> takes >1h. And yes, this is entirely due to the new test case which is
> >>> generating / inserting 70k rows. So maybe the test case is not worth it
> >>> after all, and we should get rid of it.
> >>
> >> Hmm, what if the table is made 1600 columns wide -- would inserting 41
> >> rows be sufficient to trigger the problem case?  If it does, maybe it
> >> would reduce the runtime for valgrind/cache-clobber animals enough that
> >> it's no longer a concern.
> >
> > Yeah, that's a good idea. PSA patch that creates the table of 1600
> > columns and inserts 41 rows into the foreign table. If the batch_size
> > adjustment fix isn't there, we will hit the error. On my dev system,
> > postgres_fdw contrib regression tests execution time: with and without
> > the attached patch 4.5 sec and 5.7 sec respectively.
> >
>
> But we're discussing cases with valgrind and/or CLOBBER_CACHE_ALWAYS.

Okay. Here are the readings on my dev system:
1) on master, with the existing test case inserting 70K rows:
4263200 ms (71.05 min)
2) with Tomas's patch, with the test case modified to use 1500 table
columns and 50 rows (the majority of the time, ~30 min, was spent in
the SELECT create_batch_tables(1500); statement; I measured this
manually from the statement's start and end times): 6649312 ms (110.8
min)
3) with my patch, with the test case modified to use 1600 table columns
and 41 rows: 4003007 ms (66.71 min)
4) on master, without the test case at all: 3770722 ms (62.84 min)

With Regards,
Bharath Rupireddy.



Re: Fdw batch insert error out when set batch_size > 65535

From
Bharath Rupireddy
Date
On Mon, Jun 14, 2021, 5:33 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
> Okay. Here are the readings on my dev system:
> 1) on master with the existing test case with inserting 70K rows:
> 4263200 ms (71.05 min)
> 2) with Tomas's patch with the test case modified with 1500 table
> columns and 50 rows, (majority of the time ~30min it took in SELECT
> create_batch_tables(1500); statement. I measured this time manually
> looking at the start and end time of the statement - 6649312 ms (110.8
> min)
> 3) with my patch with test case modified with 1600 table columns and
> 41 rows: 4003007 ms  (66.71 min)
> 4) on master without the test case at all: 3770722 ms (62.84 min)

I forgot to mention one thing: I ran the above tests with CLOBBER_CACHE_ALWAYS.

Regards,
Bharath Rupireddy.

Re: Fdw batch insert error out when set batch_size > 65535

From
Andres Freund
Date
Hi,

On 2021-06-11 18:44:28 -0400, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> > There's one caveat, though - for regular builds the slowdown is pretty
> > much eliminated. But with valgrind it's still considerably slower. For
> > postgres_fdw the "make check" used to take ~5 minutes for me, now it
> > takes >1h. And yes, this is entirely due to the new test case which is
> > generating / inserting 70k rows. So maybe the test case is not worth it
> > after all, and we should get rid of it.
> 
> I bet the CLOBBER_CACHE animals won't like it much either.
> 
> I suggest what we do is leave it in place for long enough to get
> a round of reports from those slow animals, and then (assuming
> those reports are positive) drop the test.

I just encountered this test because it doesn't succeed on a 32-bit system with
address sanitizer enabled - it runs out of memory. At that point there are
"just" 29895 parameters parsed...

It's also the slowest step on skink (valgrind animal), taking nearly an hour.

I think two years later is long enough to have some confidence in this being
fixed?

Greetings,

Andres Freund



Re: Fdw batch insert error out when set batch_size > 65535

From
Tom Lane
Date
Andres Freund <andres@anarazel.de> writes:
> On 2021-06-11 18:44:28 -0400, Tom Lane wrote:
>> I suggest what we do is leave it in place for long enough to get
>> a round of reports from those slow animals, and then (assuming
>> those reports are positive) drop the test.

> I think two years later is long enough to have some confidence in this being
> fixed?

+1, time to drop it (in the back branches too).

            regards, tom lane



Re: Fdw batch insert error out when set batch_size > 65535

From
Tomas Vondra
Date
On 7/2/23 15:23, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2021-06-11 18:44:28 -0400, Tom Lane wrote:
>>> I suggest what we do is leave it in place for long enough to get
>>> a round of reports from those slow animals, and then (assuming
>>> those reports are positive) drop the test.
> 
>> I think two years later is long enough to have some confidence in this being
>> fixed?
> 
> +1, time to drop it (in the back branches too).
> 

OK, will do (unless someone else wants to handle this) on Monday.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Fdw batch insert error out when set batch_size > 65535

From
Tomas Vondra
Date
On 7/2/23 15:50, Tomas Vondra wrote:
> On 7/2/23 15:23, Tom Lane wrote:
>> Andres Freund <andres@anarazel.de> writes:
>>> On 2021-06-11 18:44:28 -0400, Tom Lane wrote:
>>>> I suggest what we do is leave it in place for long enough to get
>>>> a round of reports from those slow animals, and then (assuming
>>>> those reports are positive) drop the test.
>>
>>> I think two years later is long enough to have some confidence in this being
>>> fixed?
>>
>> +1, time to drop it (in the back branches too).
>>
> 
> OK, will do (unless someone else wants to handle this) on Monday.
> 

FWIW I've removed the test from all branches where it was present.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company