Обсуждение: ParallelFinish-hook of FDW/CSP (Re: Steps inside ExecEndGather)

Поиск
Список
Период
Сортировка

ParallelFinish-hook of FDW/CSP (Re: Steps inside ExecEndGather)

От
Kouhei Kaigai
Дата:
Hello,

The attached patch implements the suggestion by Amit before.

What I'm motivated is to collect extra run-time statistics specific
to a particular ForeignScan/CustomScan, not only the standard
Instrumentation; like DMA transfer rate or execution time of GPU
kernels in my case.

Per-node DSM toc is one of the best way to return run-time statistics
to the master backend, because FDW/CSP can assign arbitrary length of
the region according to its needs. It is quite easy to require.
However, one problem is, the per-node DSM toc is already released when
ExecEndNode() is called on the child node of Gather.

This patch allows extensions to get control on the master backend's
context when all the worker node gets finished but prior to release
of the DSM segment. If FDW/CSP has its special statistics on the
segment, it can move to the private memory area for EXPLAIN output
or something other purpose.

One design consideration is whether the hook shall be called from
ExecParallelRetrieveInstrumentation() or ExecParallelFinish().
The former is a function to retrieve the standard Instrumentation
information, thus, it is valid only if EXPLAIN ANALYZE.
On the other hands, if we put entrypoint at ExecParallelFinish(),
extension can get control regardless of EXPLAIN ANALYZE, however,
it also needs an extra planstate_tree_walker().

Right now, we don't assume anything onto the requirement by FDW/CSP.
It may want run-time statistics regardless of EXPLAIN ANALYZE, thus,
hook shall be invoked always when Gather node confirmed termination
of the worker processes.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai@ak.jp.nec.com>


> -----Original Message-----
> From: Amit Kapila [mailto:amit.kapila16@gmail.com]
> Sent: Monday, October 17, 2016 11:22 AM
> To: Kaigai Kouhei(海外 浩平)
> Cc: Robert Haas; pgsql-hackers
> Subject: ##freemail## Re: [HACKERS] Steps inside ExecEndGather
> 
> On Mon, Oct 17, 2016 at 6:22 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > Hello,
> >
> > I'm now trying to carry extra performance statistics on CustomScan
> > (like DMA transfer rate, execution time of GPU kernels, etc...)
> > from parallel workers to the leader process using the DSM segment
> > attached by the parallel-context.
> > We can require an arbitrary length of DSM using ExecCustomScanEstimate
> > hook by extension, then it looks leader/worker can share the DSM area.
> > However, we have a problem on this design.
> >
> > Below is the implementation of ExecEndGather().
> >
> >   void
> >   ExecEndGather(GatherState *node)
> >   {
> >       ExecShutdownGather(node);
> >       ExecFreeExprContext(&node->ps);
> >       ExecClearTuple(node->ps.ps_ResultTupleSlot);
> >       ExecEndNode(outerPlanState(node));
> >   }
> >
> > It calls ExecShutdownGather() prior to the recursive call of ExecEndNode().
> > The DSM segment shall be released on this call, so child node cannot
> > reference the DSM at the time of ExecEndNode().
> >
> 
> Before releasing DSM, we do collect all the statistics or
> instrumentation information of each node.  Refer
> ExecParallelFinish()->ExecParallelRetrieveInstrumentation(), so I am
> wondering why can't you collect the additional information in the same
> way?
> 
> 
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com

Вложения

Re: ParallelFinish-hook of FDW/CSP (Re: Steps inside ExecEndGather)

От
Haribabu Kommi
Дата:


On Tue, Nov 1, 2016 at 1:33 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
Hello,

The attached patch implements the suggestion by Amit before.

What I'm motivated is to collect extra run-time statistics specific
to a particular ForeignScan/CustomScan, not only the standard
Instrumentation; like DMA transfer rate or execution time of GPU
kernels in my case.

Per-node DSM toc is one of the best way to return run-time statistics
to the master backend, because FDW/CSP can assign arbitrary length of
the region according to its needs. It is quite easy to require.
However, one problem is, the per-node DSM toc is already released when
ExecEndNode() is called on the child node of Gather.

This patch allows extensions to get control on the master backend's
context when all the worker node gets finished but prior to release
of the DSM segment. If FDW/CSP has its special statistics on the
segment, it can move to the private memory area for EXPLAIN output
or something other purpose.

One design consideration is whether the hook shall be called from
ExecParallelRetrieveInstrumentation() or ExecParallelFinish().
The former is a function to retrieve the standard Instrumentation
information, thus, it is valid only if EXPLAIN ANALYZE.
On the other hands, if we put entrypoint at ExecParallelFinish(),
extension can get control regardless of EXPLAIN ANALYZE, however,
it also needs an extra planstate_tree_walker().

Right now, we don't assume anything onto the requirement by FDW/CSP.
It may want run-time statistics regardless of EXPLAIN ANALYZE, thus,
hook shall be invoked always when Gather node confirmed termination
of the worker processes.


Moved to next CF with "needs review" status.

Regards,
Hari Babu
Fujitsu Australia

Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather)

От
Michael Paquier
Дата:
On Fri, Dec 2, 2016 at 9:20 PM, Haribabu Kommi <kommi.haribabu@gmail.com> wrote:
> Moved to next CF with "needs review" status.

There has not been much interest in this patch, moved again, this time
to CF 2017-03.
-- 
Michael



Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather)

От
Claudio Freire
Дата:
On Mon, Oct 31, 2016 at 11:33 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> Hello,
>
> The attached patch implements the suggestion by Amit before.
>
> What I'm motivated is to collect extra run-time statistics specific
> to a particular ForeignScan/CustomScan, not only the standard
> Instrumentation; like DMA transfer rate or execution time of GPU
> kernels in my case.
>
> Per-node DSM toc is one of the best way to return run-time statistics
> to the master backend, because FDW/CSP can assign arbitrary length of
> the region according to its needs. It is quite easy to require.
> However, one problem is, the per-node DSM toc is already released when
> ExecEndNode() is called on the child node of Gather.
>
> This patch allows extensions to get control on the master backend's
> context when all the worker node gets finished but prior to release
> of the DSM segment. If FDW/CSP has its special statistics on the
> segment, it can move to the private memory area for EXPLAIN output
> or something other purpose.
>
> One design consideration is whether the hook shall be called from
> ExecParallelRetrieveInstrumentation() or ExecParallelFinish().
> The former is a function to retrieve the standard Instrumentation
> information, thus, it is valid only if EXPLAIN ANALYZE.
> On the other hands, if we put entrypoint at ExecParallelFinish(),
> extension can get control regardless of EXPLAIN ANALYZE, however,
> it also needs an extra planstate_tree_walker().

If the use case for this is to gather instrumentation, I'd suggest calling the
hook in RetrieveInstrumentation, and calling it appropriately. It would
make the intended use far clearer than it is now.

And if it saves some work, all the better.

Until there's a use case for a non-instrumentation hook in that place,
I wouldn't add it. This level of generality sounds like a solution waiting
for a problem to solve.



Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps insideExecEndGather)

От
Kouhei Kaigai
Дата:
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Claudio Freire
> Sent: Saturday, February 04, 2017 8:47 AM
> To: Kaigai Kouhei(海外 浩平) <kaigai@ak.jp.nec.com>
> Cc: Amit Kapila <amit.kapila16@gmail.com>; Robert Haas
> <robertmhaas@gmail.com>; pgsql-hackers <pgsql-hackers@postgresql.org>
> Subject: Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside
> ExecEndGather)
> 
> On Mon, Oct 31, 2016 at 11:33 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com>
> wrote:
> > Hello,
> >
> > The attached patch implements the suggestion by Amit before.
> >
> > What I'm motivated is to collect extra run-time statistics specific to
> > a particular ForeignScan/CustomScan, not only the standard
> > Instrumentation; like DMA transfer rate or execution time of GPU
> > kernels in my case.
> >
> > Per-node DSM toc is one of the best way to return run-time statistics
> > to the master backend, because FDW/CSP can assign arbitrary length of
> > the region according to its needs. It is quite easy to require.
> > However, one problem is, the per-node DSM toc is already released when
> > ExecEndNode() is called on the child node of Gather.
> >
> > This patch allows extensions to get control on the master backend's
> > context when all the worker node gets finished but prior to release of
> > the DSM segment. If FDW/CSP has its special statistics on the segment,
> > it can move to the private memory area for EXPLAIN output or something
> > other purpose.
> >
> > One design consideration is whether the hook shall be called from
> > ExecParallelRetrieveInstrumentation() or ExecParallelFinish().
> > The former is a function to retrieve the standard Instrumentation
> > information, thus, it is valid only if EXPLAIN ANALYZE.
> > On the other hands, if we put entrypoint at ExecParallelFinish(),
> > extension can get control regardless of EXPLAIN ANALYZE, however, it
> > also needs an extra planstate_tree_walker().
> 
> If the use case for this is to gather instrumentation, I'd suggest calling
> the hook in RetrieveInstrumentation, and calling it appropriately. It would
> make the intended use far clearer than it is now.
> 
> And if it saves some work, all the better.
> 
> Until there's a use case for a non-instrumentation hook in that place, I
> wouldn't add it. This level of generality sounds like a solution waiting
> for a problem to solve.
>
The use cases I'd like to add are extension specific but significant for
performance analytics. These statistics are not included in Instrumentation.
For example, my problems are GPU execution time, data transfer ratio over
DMA, synchronization time for GPU task completion, and so on.
Only extension can know which attributes are collected during the execution,
and its data format. I don't think Instrumentation fits these requirements.
This is a problem I faced on the v9.6 based interface design, so I could
notice it.

Thanks,
----
PG-Strom Project / NEC OSS Promotion Center
KaiGai Kohei <kaigai@ak.jp.nec.com>


Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather)

От
Claudio Freire
Дата:
On Sun, Feb 5, 2017 at 9:19 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
>> If the use case for this is to gather instrumentation, I'd suggest calling
>> the hook in RetrieveInstrumentation, and calling it appropriately. It would
>> make the intended use far clearer than it is now.
>>
>> And if it saves some work, all the better.
>>
>> Until there's a use case for a non-instrumentation hook in that place, I
>> wouldn't add it. This level of generality sounds like a solution waiting
>> for a problem to solve.
>>
> The use cases I'd like to add are extension specific but significant for
> performance analytics. These statistics are not included in Instrumentation.
> For example, my problems are GPU execution time, data transfer ratio over
> DMA, synchronization time for GPU task completion, and so on.
> Only extension can know which attributes are collected during the execution,
> and its data format. I don't think Instrumentation fits these requirements.
> This is a problem I faced on the v9.6 based interface design, so I could
> notice it.


What RetrieveInstrumentation currently does may not cover the
extension's specific instrumentation, but what I'm suggesting is
adding a hook, like the one you're proposing for ParallelFinish, that
would collect extension-specific instrumentation. Couple that with
extension-specific explain output and you'll get your use cases
covered, I think.



Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather)

От
Claudio Freire
Дата:
On Mon, Feb 6, 2017 at 1:00 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Sun, Feb 5, 2017 at 9:19 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
>>> If the use case for this is to gather instrumentation, I'd suggest calling
>>> the hook in RetrieveInstrumentation, and calling it appropriately. It would
>>> make the intended use far clearer than it is now.
>>>
>>> And if it saves some work, all the better.
>>>
>>> Until there's a use case for a non-instrumentation hook in that place, I
>>> wouldn't add it. This level of generality sounds like a solution waiting
>>> for a problem to solve.
>>>
>> The use cases I'd like to add are extension specific but significant for
>> performance analytics. These statistics are not included in Instrumentation.
>> For example, my problems are GPU execution time, data transfer ratio over
>> DMA, synchronization time for GPU task completion, and so on.
>> Only extension can know which attributes are collected during the execution,
>> and its data format. I don't think Instrumentation fits these requirements.
>> This is a problem I faced on the v9.6 based interface design, so I could
>> notice it.
>
>
> What RetrieveInstrumentation currently does may not cover the
> extension's specific instrumentation, but what I'm suggesting is
> adding a hook, like the one you're proposing for ParallelFinish, that
> would collect extension-specific instrumentation. Couple that with
> extension-specific explain output and you'll get your use cases
> covered, I think.


In essence, you added a ParallelFinish that is called after
RetrieveInstrumentation. That's overreaching, for your use case.

If instrumentation is what you're collecting, it's far more correct,
and more readable, to have a hook called from inside
RetrieveInstrumentation that will retrieve extension-specific
instrumentation.

RetrieveInstrumentation itself doesn't need to parse that information,
that will be loaded into the custom scan nodes, and those are the ones
that will parse it when generating explain output.

So, in essence it's almost what you did with ParallelFinish, more
clearly aimed at collecting runtime statistics.



Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps insideExecEndGather)

От
Kouhei Kaigai
Дата:
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Claudio Freire
> Sent: Monday, February 06, 2017 1:05 PM
> To: Kaigai Kouhei(海外 浩平) <kaigai@ak.jp.nec.com>
> Cc: Amit Kapila <amit.kapila16@gmail.com>; Robert Haas
> <robertmhaas@gmail.com>; pgsql-hackers <pgsql-hackers@postgresql.org>
> Subject: Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside
> ExecEndGather)
> 
> On Mon, Feb 6, 2017 at 1:00 AM, Claudio Freire <klaussfreire@gmail.com>
> wrote:
> > On Sun, Feb 5, 2017 at 9:19 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> >>> If the use case for this is to gather instrumentation, I'd suggest
> >>> calling the hook in RetrieveInstrumentation, and calling it
> >>> appropriately. It would make the intended use far clearer than it is
> now.
> >>>
> >>> And if it saves some work, all the better.
> >>>
> >>> Until there's a use case for a non-instrumentation hook in that
> >>> place, I wouldn't add it. This level of generality sounds like a
> >>> solution waiting for a problem to solve.
> >>>
> >> The use cases I'd like to add are extension specific but significant
> >> for performance analytics. These statistics are not included in
> Instrumentation.
> >> For example, my problems are GPU execution time, data transfer ratio
> >> over DMA, synchronization time for GPU task completion, and so on.
> >> Only extension can know which attributes are collected during the
> >> execution, and its data format. I don't think Instrumentation fits these
> requirements.
> >> This is a problem I faced on the v9.6 based interface design, so I
> >> could notice it.
> >
> >
> > What RetrieveInstrumentation currently does may not cover the
> > extension's specific instrumentation, but what I'm suggesting is
> > adding a hook, like the one you're proposing for ParallelFinish, that
> > would collect extension-specific instrumentation. Couple that with
> > extension-specific explain output and you'll get your use cases
> > covered, I think.
> 
> 
> In essence, you added a ParallelFinish that is called after
> RetrieveInstrumentation. That's overreaching, for your use case.
> 
> If instrumentation is what you're collecting, it's far more correct, and
> more readable, to have a hook called from inside RetrieveInstrumentation
> that will retrieve extension-specific instrumentation.
> 
> RetrieveInstrumentation itself doesn't need to parse that information, that
> will be loaded into the custom scan nodes, and those are the ones that will
> parse it when generating explain output.
> 
> So, in essence it's almost what you did with ParallelFinish, more clearly
> aimed at collecting runtime statistics.
> 
I also had thought an idea to have extra space to Instrumentation structure,
however, it needs to make Instrumentation flexible-length structure according
to the custom format by CSP/FDW. Likely, it is not a good design.
As long as extension can retrieve its custom statistics on DSM area required
by ExecParallelEstimate(), I have no preference on the hook location.

One thing we may pay attention is, some extension (not mine) may want to
collect worker's statistics regardless of Instrumentation (in other words,
even if plan is not under EXPLAIN ANALYZE).
It is the reason why I didn't put a hook under the ExecParallelRetrieveInstrumentation().

Thanks,
----
PG-Strom Project / NEC OSS Promotion Center
KaiGai Kohei <kaigai@ak.jp.nec.com>


Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather)

От
Claudio Freire
Дата:
On Mon, Feb 6, 2017 at 1:42 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> I also had thought an idea to have extra space to Instrumentation structure,
> however, it needs to make Instrumentation flexible-length structure according
> to the custom format by CSP/FDW. Likely, it is not a good design.
> As long as extension can retrieve its custom statistics on DSM area required
> by ExecParallelEstimate(), I have no preference on the hook location.

That's what I had in mind: the hook happens there, but the extension
retrieves the information from some extension-specific DSM area, just
as it would on the ParallelFinish hook.

> One thing we may pay attention is, some extension (not mine) may want to
> collect worker's statistics regardless of Instrumentation (in other words,
> even if plan is not under EXPLAIN ANALYZE).
> It is the reason why I didn't put a hook under the ExecParallelRetrieveInstrumentation().

I don't think you should worry about that as long as it's a hypothetical case.

If/when some extension actually needs to do that, the design can be
discussed with a real use case at hand, and not a hypothetical one.



Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps insideExecEndGather)

От
Kouhei Kaigai
Дата:
Hello,

The attached patch is revised one.

Invocation of Exec(Foreign|Custom)ParallelFinish was moved to
ExecParallelRetrieveInstrumentation() not to walk on the plan-
state tree twice.
One (hypothetical) downside is, FDW/CSP can retrieve its own
run-time statistics only when query is executed under EXPLAIN
ANALYZE.

This enhancement allows FDW/CSP to collect its specific run-
time statistics more than Instrumentation, then show them as
output of EXPLAIN. My expected examples are GPU's kernel execution
time, DMA transfer ratio and so on. These statistics will never
appear in the Instrumentation structure, however, can be a hot-
point of performance bottleneck if CustomScan works on background
workers.

Thanks,
----
PG-Strom Project / NEC OSS Promotion Center
KaiGai Kohei <kaigai@ak.jp.nec.com>


> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Claudio Freire
> Sent: Monday, February 06, 2017 3:37 PM
> To: Kaigai Kouhei(海外 浩平) <kaigai@ak.jp.nec.com>
> Cc: Amit Kapila <amit.kapila16@gmail.com>; Robert Haas
> <robertmhaas@gmail.com>; pgsql-hackers <pgsql-hackers@postgresql.org>
> Subject: Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside
> ExecEndGather)
> 
> On Mon, Feb 6, 2017 at 1:42 AM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> > I also had thought an idea to have extra space to Instrumentation
> > structure, however, it needs to make Instrumentation flexible-length
> > structure according to the custom format by CSP/FDW. Likely, it is not
> a good design.
> > As long as extension can retrieve its custom statistics on DSM area
> > required by ExecParallelEstimate(), I have no preference on the hook
> location.
> 
> That's what I had in mind: the hook happens there, but the extension
> retrieves the information from some extension-specific DSM area, just as
> it would on the ParallelFinish hook.
> 
> > One thing we may pay attention is, some extension (not mine) may want
> > to collect worker's statistics regardless of Instrumentation (in other
> > words, even if plan is not under EXPLAIN ANALYZE).
> > It is the reason why I didn't put a hook under the
> ExecParallelRetrieveInstrumentation().
> 
> I don't think you should worry about that as long as it's a hypothetical
> case.
> 
> If/when some extension actually needs to do that, the design can be discussed
> with a real use case at hand, and not a hypothetical one.
> 
> 
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make
> changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather)

От
Robert Haas
Дата:
On Fri, Feb 17, 2017 at 12:46 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com> wrote:
> The attached patch is revised one.
>
> Invocation of Exec(Foreign|Custom)ParallelFinish was moved to
> ExecParallelRetrieveInstrumentation() not to walk on the plan-
> state tree twice.
> One (hypothetical) downside is, FDW/CSP can retrieve its own
> run-time statistics only when query is executed under EXPLAIN
> ANALYZE.
>
> This enhancement allows FDW/CSP to collect its specific run-
> time statistics more than Instrumentation, then show them as
> output of EXPLAIN. My expected examples are GPU's kernel execution
> time, DMA transfer ratio and so on. These statistics will never
> appear in the Instrumentation structure, however, can be a hot-
> point of performance bottleneck if CustomScan works on background
> workers.

Would gather_shutdown_children_first.patch from
https://www.postgresql.org/message-id/CAFiTN-s5KuRuDrQCEpiHHzmVf7JTtbnb8eb10c-6AywJDxbWrA@mail.gmail.com
help with this problem also?  Suppose we did that, and then also added
an ExecShutdownCustom method.  Then you'd definitely be able to get
control before the DSM went away, either from ExecEndNode() or
ExecShutdownNode().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps insideExecEndGather)

От
Kouhei Kaigai
Дата:
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org
> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Robert Haas
> Sent: Monday, February 20, 2017 2:20 AM
> To: Kaigai Kouhei(海外 浩平) <kaigai@ak.jp.nec.com>
> Cc: Claudio Freire <klaussfreire@gmail.com>; Amit Kapila
> <amit.kapila16@gmail.com>; pgsql-hackers <pgsql-hackers@postgresql.org>
> Subject: Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside
> ExecEndGather)
> 
> On Fri, Feb 17, 2017 at 12:46 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com>
> wrote:
> > The attached patch is revised one.
> >
> > Invocation of Exec(Foreign|Custom)ParallelFinish was moved to
> > ExecParallelRetrieveInstrumentation() not to walk on the plan- state
> > tree twice.
> > One (hypothetical) downside is, FDW/CSP can retrieve its own run-time
> > statistics only when query is executed under EXPLAIN ANALYZE.
> >
> > This enhancement allows FDW/CSP to collect its specific run- time
> > statistics more than Instrumentation, then show them as output of
> > EXPLAIN. My expected examples are GPU's kernel execution time, DMA
> > transfer ratio and so on. These statistics will never appear in the
> > Instrumentation structure, however, can be a hot- point of performance
> > bottleneck if CustomScan works on background workers.
> 
> Would gather_shutdown_children_first.patch from
> https://www.postgresql.org/message-id/CAFiTN-s5KuRuDrQCEpiHHzmVf7JTtbn
> b8eb10c-6AywJDxbWrA@mail.gmail.com
> help with this problem also?  Suppose we did that, and then also added an
> ExecShutdownCustom method.  Then you'd definitely be able to get control
> before the DSM went away, either from ExecEndNode() or ExecShutdownNode().
> 
Ah, yes, I couldn't find any problem around the above approach.
ExecShutdownGather() can be called by either ExecShutdownNode() or
ExecEndGather(). This patch allows to have an entrypoint for CSP/FDW
prior to release of the DSM.

Thanks,
----
PG-Strom Project / NEC OSS Promotion Center
KaiGai Kohei <kaigai@ak.jp.nec.com>


Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather)

От
Kohei KaiGai
Дата:
Hello,

This attached patch re-designed the previous v2 according to Robert's comment.
All I had to do is putting entrypoint for ForeignScan/CustomScan at
ExecShutdownNode,
because it is already modified to call child node first, earlier than
ExecShutdownGather
which releases DSM segment.

Thanks,

2017-02-20 9:25 GMT+09:00 Kouhei Kaigai <kaigai@ak.jp.nec.com>:
>> -----Original Message-----
>> From: pgsql-hackers-owner@postgresql.org
>> [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Robert Haas
>> Sent: Monday, February 20, 2017 2:20 AM
>> To: Kaigai Kouhei(海外 浩平) <kaigai@ak.jp.nec.com>
>> Cc: Claudio Freire <klaussfreire@gmail.com>; Amit Kapila
>> <amit.kapila16@gmail.com>; pgsql-hackers <pgsql-hackers@postgresql.org>
>> Subject: Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside
>> ExecEndGather)
>>
>> On Fri, Feb 17, 2017 at 12:46 PM, Kouhei Kaigai <kaigai@ak.jp.nec.com>
>> wrote:
>> > The attached patch is revised one.
>> >
>> > Invocation of Exec(Foreign|Custom)ParallelFinish was moved to
>> > ExecParallelRetrieveInstrumentation() not to walk on the plan- state
>> > tree twice.
>> > One (hypothetical) downside is, FDW/CSP can retrieve its own run-time
>> > statistics only when query is executed under EXPLAIN ANALYZE.
>> >
>> > This enhancement allows FDW/CSP to collect its specific run- time
>> > statistics more than Instrumentation, then show them as output of
>> > EXPLAIN. My expected examples are GPU's kernel execution time, DMA
>> > transfer ratio and so on. These statistics will never appear in the
>> > Instrumentation structure, however, can be a hot- point of performance
>> > bottleneck if CustomScan works on background workers.
>>
>> Would gather_shutdown_children_first.patch from
>> https://www.postgresql.org/message-id/CAFiTN-s5KuRuDrQCEpiHHzmVf7JTtbn
>> b8eb10c-6AywJDxbWrA@mail.gmail.com
>> help with this problem also?  Suppose we did that, and then also added an
>> ExecShutdownCustom method.  Then you'd definitely be able to get control
>> before the DSM went away, either from ExecEndNode() or ExecShutdownNode().
>>
> Ah, yes, I couldn't find any problem around the above approach.
> ExecShutdownGather() can be called by either ExecShutdownNode() or
> ExecEndGather(). This patch allows to have an entrypoint for CSP/FDW
> prior to release of the DSM.
>
> Thanks,
> ----
> PG-Strom Project / NEC OSS Promotion Center
> KaiGai Kohei <kaigai@ak.jp.nec.com>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers



--
KaiGai Kohei <kaigai@kaigai.gr.jp>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather)

От
Robert Haas
Дата:
On Fri, Feb 24, 2017 at 7:29 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
> This attached patch re-designed the previous v2 according to Robert's comment.
> All I had to do is putting entrypoint for ForeignScan/CustomScan at
> ExecShutdownNode,
> because it is already modified to call child node first, earlier than
> ExecShutdownGather
> which releases DSM segment.

Aside from the documentation, which needs some work, this looks fine
to me on a quick read.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather)

От
Claudio Freire
Дата:
On Fri, Feb 24, 2017 at 1:24 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Feb 24, 2017 at 7:29 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>> This attached patch re-designed the previous v2 according to Robert's comment.
>> All I had to do is putting entrypoint for ForeignScan/CustomScan at
>> ExecShutdownNode,
>> because it is already modified to call child node first, earlier than
>> ExecShutdownGather
>> which releases DSM segment.
>
> Aside from the documentation, which needs some work, this looks fine
> to me on a quick read.

LGTM too.

You may want to clarify in the docs when the hook is called in
relation to other hooks too.



Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather)

От
Kohei KaiGai
Дата:
2017-02-25 1:40 GMT+09:00 Claudio Freire <klaussfreire@gmail.com>:
> On Fri, Feb 24, 2017 at 1:24 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Fri, Feb 24, 2017 at 7:29 PM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
>>> This attached patch re-designed the previous v2 according to Robert's comment.
>>> All I had to do is putting entrypoint for ForeignScan/CustomScan at
>>> ExecShutdownNode,
>>> because it is already modified to call child node first, earlier than
>>> ExecShutdownGather
>>> which releases DSM segment.
>>
>> Aside from the documentation, which needs some work, this looks fine
>> to me on a quick read.
>
> LGTM too.
>
> You may want to clarify in the docs when the hook is called in
> relation to other hooks too.
>
I tried to add a description how custom-scan callbacks performs under the
executor, and when invoked roughly.
However, it is fundamentally not easy for most of people because it assumes
FDW/CSP author understand the overall behavior of optimizer / executor, not
only APIs specifications.

Thanks,
-- 
KaiGai Kohei <kaigai@kaigai.gr.jp>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Вложения

Re: ParallelFinish-hook of FDW/CSP (Re: [HACKERS] Steps inside ExecEndGather)

От
Robert Haas
Дата:
On Sat, Feb 25, 2017 at 10:00 AM, Kohei KaiGai <kaigai@kaigai.gr.jp> wrote:
> I tried to add a description how custom-scan callbacks performs under the
> executor, and when invoked roughly.
> However, it is fundamentally not easy for most of people because it assumes
> FDW/CSP author understand the overall behavior of optimizer / executor, not
> only APIs specifications.

I think the statements about Shutdown only being useful in parallel
mode aren't quite striking the right tone.  In theory a node can hold
any kind of resources and find it worthwhile to release them at
shutdown time.  I think we'll eventually find it useful to run
shutdown callbacks in other cases as well - e.g. when a Limit has been
filled.  It's probably true that the undeniable need for this today is
in the parallel query case, but I'd like to have this text read in a
way that doesn't assert that this is the only possible use case,
because I don't think it is.

I rewrote the documentation along those lines a bit and committed this.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company