Re: Partial aggregates pushdown

Поиск
Список
Период
Сортировка
От Bruce Momjian
Тема Re: Partial aggregates pushdown
Дата
Msg-id ZJD2BWVqRlJgz3n6@momjian.us
обсуждение исходный текст
Ответ на RE: Partial aggregates pushdown  ("Fujii.Yuki@df.MitsubishiElectric.co.jp" <Fujii.Yuki@df.MitsubishiElectric.co.jp>)
Ответы Re: Partial aggregates pushdown  (Alexander Pyhalov <a.pyhalov@postgrespro.ru>)
Список pgsql-hackers
On Tue, Jun 13, 2023 at 02:18:15AM +0000, Fujii.Yuki@df.MitsubishiElectric.co.jp wrote:
> Hi Mr.Momjian.
> 
> Thank you for advises.
> 
> > From: Bruce Momjian <bruce@momjian.us>
> > Sent: Monday, June 12, 2023 10:38 PM
> > > I understand what the problem is. I will put a mechanism maintaining compatibility into the patch.
> > > I believe there are three approaches.
> > > Approach 1-1 is preferable because it does not require additional options for postgres_fdw.
> > > I will revise the patch according to Approach 1-1, unless otherwise commented.
> > >
> > > Approach1:
> > > I ensure that postgres_fdw retrieves the version of each remote server
> > > and does not partial aggregate pushd down if the server version is less than 17.
> > > There are two approaches to obtaining remote server versions.
> > > Approach1-1: postgres_fdw connects a remote server and use PQserverVersion().
> > > Approach1-2: Adding a postgres_fdw option about a remote server version (like "server_version").
> > >
> > > Approach2:
> > > Adding a postgres_fdw option for partial aggregate pushdown is enable
> > > or not (like enable_partial_aggregate_pushdown).
> > 
> > These are good questions.  Adding a postgres_fdw option called enable_partial_aggregate_pushdown helps make the
> > purpose of the option clear, but remote_version can be used for future breakage as well.
> > 
> > I think remote_version is the best idea, and in the documentation for the option, let's explicitly say it is useful
todisable
 
> > partial aggregates pushdown on pre-PG 17 servers.  If we need to use the option for other cases, we can just update
the
> > documentation.  When the option is blank, the default, everything is pushed down.
> > 
> > I see remote_version a logical addition to match our "extensions" option that controls what extension functions can
be
> > pushed down.
> 
> Thank you for your perspective.
> So, of the approaches I have presented, you think that approach 1-2 is
> preferable and that the option name remote_server is preferable?
> Indeed, the option of a remote version may have other uses.
> However, this information can be obtained by connecting to a remote server, 
> I'm concerned that some people may find this option redundant.
> 
> Is the problem with approach 1-1 because the user cannot decide whether to include the compatibility check in the
decisionto do partial aggregate pushdown or not?
 
> # If Approach 1-1 is taken, the problem is that this feature cannot be used for all bait-in aggregate functions
> # when the remote server is older than PG17.
> If so, Approache1-3 below seem more desirable.
> Would it be possible for us to hear your thoughts?
> 
> Approache1-3:We add a postgres_fdw option about a compatibility check for partial aggregate pushdown
> (like "enable_aggpartialfunc_compatibility_check"). This option is false, the default.
> When this option is true, postgres_fdw obtains the remote server version by connecting the remote server and using
PQserverVersion().
 
> And if the remote server version is older than PG17, then the partial aggregate pushdown feature is enable for all
buit-inaggregate functions.
 
> Otherwise the partial aggregate pushdown feature is disable for all buit-in aggregate functions.

Apologies for the delay in my reply to this email.  I looked into the
existing code and I found three things:

1)  PQserverVersion() just pulls the conn->sversion value from the
existing connection because pqSaveParameterStatus() pulls the
server_version sent by the backend;  no need to issue SELECT version().

2)  postgres_fdw already has nine calls to GetConnection(), and only
opens a connection if it already doesn't have one.  Here is an example:

    /* Get the remote estimate */
    conn = GetConnection(fpinfo->user, false, NULL);
    get_remote_estimate(sql.data, conn, &rows, &width,
                &startup_cost, &total_cost);
    ReleaseConnection(conn);

Therefore, it seems like it would be near-zero cost to just call conn =
GetConnection() and then PQserverVersion(conn), and ReleaseConnection().
You can then use the return value of PQserverVersion() to determine if
you can push down partial aggregates.

3)  Looking at postgresAcquireSampleRowsFunc(), I see this exact method
used:

    conn = GetConnection(user, false, NULL);

    /* We'll need server version, so fetch it now. */
    server_version_num = PQserverVersion(conn);

    ...

    if ((server_version_num < 95000) &&
    (method == ANALYZE_SAMPLE_SYSTEM ||
     method == ANALYZE_SAMPLE_BERNOULLI))
    ereport(ERROR,
        (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
         errmsg("remote server does not support TABLESAMPLE feature")));

I am sorry if you already knew all this, but I didn't.

-- 
  Bruce Momjian  <bruce@momjian.us>        https://momjian.us
  EDB                                      https://enterprisedb.com

  Only you can decide what is important to you.



В списке pgsql-hackers по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: Do we want a hashset type?
Следующее
От: Bruce Momjian
Дата:
Сообщение: Re: Document that server will start even if it's unable to open some TCP/IP ports