Обсуждение: Rethinking behavior of force_parallel_mode = regress

Поиск
Список
Период
Сортировка

Rethinking behavior of force_parallel_mode = regress

От
Tom Lane
Дата:
As of HEAD it is possible to get through all of our regression tests
with these settings:

alter system set force_parallel_mode = regress;
alter system set max_parallel_workers_per_gather = 2;
alter system set parallel_tuple_cost = 0;
alter system set parallel_setup_cost = 0;
alter system set min_parallel_relation_size = 0;

although there are quite a number of cosmetic differences in the outputs
for the core regression tests.  (Curiously, contrib, pl, and isolation
seem to pass without any diffs.)  In view of the number of bugs we've been
able to identify with this setup, it would be nice to reduce the volume of
the cosmetic differences to make it easier to review the diffs by hand.
I'm not sure there's much that can be done about the row-ordering diffs;
some randomness in the output order from a parallel seqscan seems
inevitable.  But we could tamp down the EXPLAIN output differences, which
are much harder to review anyway.

With that thought in mind, I propose that the behavior of
force_parallel_mode = regress is ill-designed so far as EXPLAIN is
concerned.  What it ought to do is suppress *all* Gathers from the output,
not just ones that were added in response to force_parallel_mode itself.

I experimented with the attached prototype patch and found that it indeed
greatly reduces the volume of EXPLAIN differences, though it doesn't
remove them all.  I did not for example try to hide the effects of partial
aggregation.

If we were to do this, we could remove the Gather.invisible plan node
field altogether, which would be good cleanup in my book.

Even if we don't do this, my code review showed that there's a bug in
what ExplainPrintPlan is doing right now for the case: it neglects to
run InstrEndLoop on the topmost node, which at the very least risks
confusing auto_explain.

Thoughts?

            regards, tom lane

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 379fc5c..b4b2705 100644
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
*************** void
*** 574,580 ****
  ExplainPrintPlan(ExplainState *es, QueryDesc *queryDesc)
  {
      Bitmapset  *rels_used = NULL;
-     PlanState  *ps;

      Assert(queryDesc->plannedstmt != NULL);
      es->pstmt = queryDesc->plannedstmt;
--- 574,579 ----
*************** ExplainPrintPlan(ExplainState *es, Query
*** 583,599 ****
      es->rtable_names = select_rtable_names_for_explain(es->rtable, rels_used);
      es->deparse_cxt = deparse_context_for_plan_rtable(es->rtable,
                                                        es->rtable_names);
!
!     /*
!      * Sometimes we mark a Gather node as "invisible", which means that it's
!      * not displayed in EXPLAIN output.  The purpose of this is to allow
!      * running regression tests with force_parallel_mode=regress to get the
!      * same results as running the same tests with force_parallel_mode=off.
!      */
!     ps = queryDesc->planstate;
!     if (IsA(ps, GatherState) &&((Gather *) ps->plan)->invisible)
!         ps = outerPlanState(ps);
!     ExplainNode(ps, NIL, NULL, NULL, es);
  }

  /*
--- 582,588 ----
      es->rtable_names = select_rtable_names_for_explain(es->rtable, rels_used);
      es->deparse_cxt = deparse_context_for_plan_rtable(es->rtable,
                                                        es->rtable_names);
!     ExplainNode(queryDesc->planstate, NIL, NULL, NULL, es);
  }

  /*
*************** ExplainNode(PlanState *planstate, List *
*** 812,817 ****
--- 801,831 ----
      int            save_indent = es->indent;
      bool        haschildren;

+     /*
+      * In force_parallel_mode = regress mode, we want to hide Gather nodes,
+      * and just show their children.  But don't do that in EXPLAIN ANALYZE,
+      * nor if any initplans or subplans got attached to the Gather, as
+      * omitting the Gather would produce inconsistent results then.
+      */
+     if (force_parallel_mode == FORCE_PARALLEL_REGRESS &&
+         !es->analyze &&
+         IsA(plan, Gather) &&
+         planstate->initPlan == NULL &&
+         planstate->subPlan == NULL)
+     {
+         /* keep contrib/auto_explain happy, per comments below */
+         if (planstate->instrument)
+             InstrEndLoop(planstate->instrument);
+         /* adjust ancestor list properly for recursion */
+         ancestors = lcons(planstate, ancestors);
+         /* recurse, passing down same relationship/plan_name */
+         ExplainNode(outerPlanState(planstate), ancestors,
+                     relationship, plan_name, es);
+         /* undo destructive change to ancestor list */
+         ancestors = list_delete_first(ancestors);
+         return;
+     }
+
      switch (nodeTag(plan))
      {
          case T_Result:
*************** ExplainNode(PlanState *planstate, List *
*** 1032,1038 ****
              appendStringInfoString(es->str, "->  ");
              es->indent += 2;
          }
!         if (plan->parallel_aware)
              appendStringInfoString(es->str, "Parallel ");
          appendStringInfoString(es->str, pname);
          es->indent++;
--- 1046,1053 ----
              appendStringInfoString(es->str, "->  ");
              es->indent += 2;
          }
!         if (plan->parallel_aware &&
!             force_parallel_mode != FORCE_PARALLEL_REGRESS)
              appendStringInfoString(es->str, "Parallel ");
          appendStringInfoString(es->str, pname);
          es->indent++;

Re: Rethinking behavior of force_parallel_mode = regress

От
Robert Haas
Дата:
On Sat, Jun 18, 2016 at 4:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> As of HEAD it is possible to get through all of our regression tests
> with these settings:
>
> alter system set force_parallel_mode = regress;
> alter system set max_parallel_workers_per_gather = 2;
> alter system set parallel_tuple_cost = 0;
> alter system set parallel_setup_cost = 0;
> alter system set min_parallel_relation_size = 0;
>
> although there are quite a number of cosmetic differences in the outputs
> for the core regression tests.  (Curiously, contrib, pl, and isolation
> seem to pass without any diffs.)  In view of the number of bugs we've been
> able to identify with this setup, it would be nice to reduce the volume of
> the cosmetic differences to make it easier to review the diffs by hand.
> I'm not sure there's much that can be done about the row-ordering diffs;
> some randomness in the output order from a parallel seqscan seems
> inevitable.  But we could tamp down the EXPLAIN output differences, which
> are much harder to review anyway.
>
> With that thought in mind, I propose that the behavior of
> force_parallel_mode = regress is ill-designed so far as EXPLAIN is
> concerned.  What it ought to do is suppress *all* Gathers from the output,
> not just ones that were added in response to force_parallel_mode itself.

No, that doesn't sound like a very good idea.  If you do that, then
you have no hope of the differences being *zero*, because any place
that the regression tests are intended to produce a parallel plan is
going to look different.  The charter of force_parallel_mode=regress
is that any regression test that passes normally should still pass
with that setting.  This change would clearly break that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Rethinking behavior of force_parallel_mode = regress

От
Tom Lane
Дата:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sat, Jun 18, 2016 at 4:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> With that thought in mind, I propose that the behavior of
>> force_parallel_mode = regress is ill-designed so far as EXPLAIN is
>> concerned.  What it ought to do is suppress *all* Gathers from the output,
>> not just ones that were added in response to force_parallel_mode itself.

> No, that doesn't sound like a very good idea.  If you do that, then
> you have no hope of the differences being *zero*, because any place
> that the regression tests are intended to produce a parallel plan is
> going to look different.

Well, sure, but in those areas you just set force_parallel_mode to on.

> The charter of force_parallel_mode=regress
> is that any regression test that passes normally should still pass
> with that setting.

I would like that charter to include scenarios with other nondefault GUC
settings, to the extent we can reasonably make it work, because setting
*only* force_parallel_mode is pretty sad in terms of test coverage.
Or hadn't you noticed all the bugs we flushed from cover as soon as we
tried changing the cost values?
        regards, tom lane



Re: Rethinking behavior of force_parallel_mode = regress

От
Robert Haas
Дата:
On Tue, Jun 21, 2016 at 4:41 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sat, Jun 18, 2016 at 4:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> With that thought in mind, I propose that the behavior of
>>> force_parallel_mode = regress is ill-designed so far as EXPLAIN is
>>> concerned.  What it ought to do is suppress *all* Gathers from the output,
>>> not just ones that were added in response to force_parallel_mode itself.
>
>> No, that doesn't sound like a very good idea.  If you do that, then
>> you have no hope of the differences being *zero*, because any place
>> that the regression tests are intended to produce a parallel plan is
>> going to look different.
>
> Well, sure, but in those areas you just set force_parallel_mode to on.

Well, I don't see how that gets you anywhere.  Now every regression
test that generates a parallel plan needs a decoration to set
force_parallel_mode=on temporarily and then change it back to regress
afterwards.  And once you've done that, you no longer get any benefit
out of having changed the behavior of force_parallel_mode=regress.
Either I need more caffeine, or this is a bad plan.  Possibly both,
because I definitely need more caffeine.

>> The charter of force_parallel_mode=regress
>> is that any regression test that passes normally should still pass
>> with that setting.
>
> I would like that charter to include scenarios with other nondefault GUC
> settings, to the extent we can reasonably make it work, because setting
> *only* force_parallel_mode is pretty sad in terms of test coverage.
> Or hadn't you noticed all the bugs we flushed from cover as soon as we
> tried changing the cost values?

Well, I did send a WIP patch to set consider_parallel correctly for
upper rels, which helps a lot, but you seem not to have looked at it.
I think we should fix the bugs in the current approach before deciding
that it doesn't work.  That having been said, I can't disagree with
the principle you're setting forth here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company