Обсуждение: make add_paths_to_append_rel aware of startup cost

Поиск
Список
Период
Сортировка

make add_paths_to_append_rel aware of startup cost

От
Andy Fan
Дата:
Hi:

This thread is a refactor of thread [1] for easier communication.

Currently add_paths_to_append_rel overlooked the startup cost for creating
append path, so it may have lost some optimization chances.  After a glance, 
the following 4 identifiers can be impacted.

Identifier 1:

SELECT .. FROM v1
UNION ALL
SELECT .. FROM v2
LIMIT 3;

Identifier 2:

SELECT * FROM p .. LIMIT 3;  p is a partitioned table.

Identifier 3:
SELECT * FROM p JOIN q using (partkey) LIMIT 3;

If we did the partition-wise-join, then we lost the chances for a better
plan.

Identifier 4:  -- EXISTS implies LIMIT 1;
SELECT * FROM foo
WHERE EXISTS
(SELECT 1 FROM append_rel_v_not_pullup_able WHERE xxx);

However, after I completed my patch and wanted to build some real 
queries to prove my idea,  I just find it is hard to build the case for 
Identifier 2/3/4.  But the Improvement for Identifier 1 is easy and 
my real user case in work is Identifier 1 as well.

So a patch is attached for this case, it will use fractional costs
rather than total costs if needed. The following items needs more
attention during development. 

- We shouldn't do the optimization if there are still more tables to join,
  the reason is similar to has_multiple_baserels(root) in
  set_subquery_pathlist. But the trouble here is we may inherit multiple
  levels to build an appendrel, so I have to keep the 'top_relids' all the
  time and compare it with PlannerInfo.all_baserels. If they are the same,
  then it is the case we want to optimize.

- add_partial_path doesn't consider the startup costs by design, I didn't
  rethink too much about this, but the idea of "choose a path which
  let each worker produces the top-N tuples as fast as possible" looks
  reasonable, and even though add_partial_path doesn't consider startup
  cost, it is still possible that childrel keeps more than 1 partial paths
  due to any reasons except startup_cost, for example PathKey. then we
  still have chances to choose the cheapest fractional path among
  them. The same strategy also applies to mixed partial and non-partial
  append paths.

- Due to the complexity of add_paths_to_append_rel, 3 arguments have
 to be added to get_cheapest_fractional_path...

  Path *
get_cheapest_fractional_path(RelOptInfo *rel, double tuple_fraction,
bool allow_parameterized, bool look_partial,
bool must_parallel_safe)


Cases can be improved.

Here is the simplest test case,  but it will not be hard to provide more
cases for Identifier 1. 

 (select * from tenk1 order by hundred)
 UNION ALL
 (select * from tenk1 order by hundred)
 limit 3;

master: 8.096ms.
patched: 0.204ms.

The below user case should be more reasonable for real business. 

with a as (select * from t1 join t2..),
b as (select * from t1 join t3 ..)
select * from a union all select * from b
limit 3;

The patch would also have impacts on identifier 2/3/4, even though I can't
make a demo sql which can get benefits from this patch,  I also  added
some test cases for code coverage purposes.
Вложения

Re: make add_paths_to_append_rel aware of startup cost

От
Andy Fan
Дата:
 
- We shouldn't do the optimization if there are still more tables to join,
  the reason is similar to has_multiple_baserels(root) in
  set_subquery_pathlist. 

After some internal discussion, we have 2 different choices here. Let's
call the current choice as method-a, and the new choice as method-b.
Method-b will just ignore the "no more tables to join "limitation
and build the append path with both cheapest startup cost and cheapest
total cost, this is pretty like the method we joins a plain relation with
another relation. The uneasy part is it is the cheapest start up cost
rather than the cheapest fractional cost.

method-a is pretty same as what set_subquery_pathlist is doing, which has
a limitation on "no more tables to join" and has no the "cheapest startup
cost" part.

Ideally we can apply both strategies if we don't consider the effort. If
there are no more tables to join, we use method-a. otherwise use 
method-b. With this thinking, we can even apply the same strategy to plain
relations as well.

However, I am not sure if the "cheapest startup cost" is a real problem. 
If it is not, we can apply method-b directly and modify 
set_subquery_pathlist to do the same for consistency.  


--
Best Regards
Andy Fan

Re: make add_paths_to_append_rel aware of startup cost

От
David Rowley
Дата:
On Thu, 7 Sept 2023 at 04:37, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> Currently add_paths_to_append_rel overlooked the startup cost for creating
> append path, so it may have lost some optimization chances.  After a glance,
> the following 4 identifiers can be impacted.

> - We shouldn't do the optimization if there are still more tables to join,
>   the reason is similar to has_multiple_baserels(root) in
>   set_subquery_pathlist. But the trouble here is we may inherit multiple
>   levels to build an appendrel, so I have to keep the 'top_relids' all the
>   time and compare it with PlannerInfo.all_baserels. If they are the same,
>   then it is the case we want to optimize.

I think you've likely gone to the trouble of trying to determine if
there are any joins pending because you're considering using a cheap
startup path *instead* of the cheapest total path and you don't want
to do that when some join will cause all the rows to be read thus
making the plan more expensive if a cheap startup path was picked.

Instead of doing that, why don't you just create a completely new
AppendPath containing all the cheapest_startup_paths and add that to
the append rel. You can optimise this and only do it when
rel->consider_startup is true.

Does the attached do anything less than what your patch does?

David

Вложения

Re: make add_paths_to_append_rel aware of startup cost

От
Andy Fan
Дата:
Hi David,
  Thanks for taking a look at this!

On Fri, Sep 15, 2023 at 3:15 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Thu, 7 Sept 2023 at 04:37, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> Currently add_paths_to_append_rel overlooked the startup cost for creating
> append path, so it may have lost some optimization chances.  After a glance,
> the following 4 identifiers can be impacted.

> - We shouldn't do the optimization if there are still more tables to join,
>   the reason is similar to has_multiple_baserels(root) in
>   set_subquery_pathlist. But the trouble here is we may inherit multiple
>   levels to build an appendrel, so I have to keep the 'top_relids' all the
>   time and compare it with PlannerInfo.all_baserels. If they are the same,
>   then it is the case we want to optimize.

I think you've likely gone to the trouble of trying to determine if
there are any joins pending because you're considering using a cheap
startup path *instead* of the cheapest total path and you don't want
to do that when some join will cause all the rows to be read thus
making the plan more expensive if a cheap startup path was picked.

Yes, that's true.  However it is not something we can't resolve, one
of the solutions is just like what I did in the patch.  but currently the
main stuff which confuses me is if it is the right thing to give up the
optimization if it has more tables to join (just like set_subquery_pathlist
did). 
 
Instead of doing that, why don't you just create a completely new
AppendPath containing all the cheapest_startup_paths and add that to
the append rel. You can optimise this and only do it when
rel->consider_startup is true.

Does the attached do anything less than what your patch does?

We can work like this,  but  there is one difference from what
my current patch does. It is cheapest_startup_path vs cheapest
fraction path.  For example if we have the following 3 paths with
all of the estimated rows is 100 and the tuple_fraction is 10. 

Path 1:  startup_cost = 60,  total_cost = 80  -- cheapest total cost. 
Path 2:  startup_cost = 10,  total_cost = 1000  -- cheapest startup cost
Path 3:  startup_cost = 20,  total_cost = 90  -- cheapest fractional cost 

So with the patch you propose,  Path 1 & Path 3 are chosen to build
append path.  but with my current patch,  Only path 3 is kept.  IIUC, 
path 3 should be the best one in this case.

We might also argue why Path 3 is kept in the first place (the children
level), I think pathkey might be one option.  and even path 3 is 
discarded somehow, I think only if it is the best one, we should
keep it ideally. 

Another tiny factor of this is your propose isn't consistent with
what set_subquery_pathlist which uses cheapest fractional cost
and my proposal isn't consistent plain rel which uses cheapest
startup cost.  We can't say which one is better, though. 

If my above analysis is correct,  I think the best way to handle this
is if there is no more tables to join, we use cheapest fraction cost
for all the kinds of relations, including plain relation, append rel,
subquery and so on.  If we have more tables to join,  we use
cheapest startup cost.  On the implementation side, I want to use
RelOptInfo.tuple_fraction instead of RelOptInfo.consider_startup. 
tuple_fraction = -1 means startup cost should not be considered.
tuple_fraction = 0 means cheapest startup cost should be used. 
tuple_franction > 0 means cheapest fraction cost should be used.

I still don't pay enough attention to consider_param_startup in
RelOptInfo,  I'm feeling the above strategy will not generate
too much overhead to the planner for now while it can provides
a better plan sometimes. 

--
Best Regards
Andy Fan

Re: make add_paths_to_append_rel aware of startup cost

От
David Rowley
Дата:
On Mon, 18 Sept 2023 at 01:42, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> On Fri, Sep 15, 2023 at 3:15 PM David Rowley <dgrowleyml@gmail.com> wrote:
>> Instead of doing that, why don't you just create a completely new
>> AppendPath containing all the cheapest_startup_paths and add that to
>> the append rel. You can optimise this and only do it when
>> rel->consider_startup is true.
>>
>> Does the attached do anything less than what your patch does?
>
>
> We can work like this,  but  there is one difference from what
> my current patch does. It is cheapest_startup_path vs cheapest
> fraction path.  For example if we have the following 3 paths with
> all of the estimated rows is 100 and the tuple_fraction is 10.

Yeah, it's true that the version I wrote didn't consider the
fractional part, but I didn't really see it as worse than what you
did.  It looks like you're assuming that every append child will have
the same number of tuples read from it, but it seems to me that it
would only be valid to use the fractional part for the first child.
The path chosen for subsequent child paths would, if done correctly,
need to account for the estimated rows from the previous child paths.
It's not valid here to copy the code in generate_orderedappend_paths()
as MergeAppend won't necessarily exhaust the first child subpath first
like Append will.

> Path 1:  startup_cost = 60,  total_cost = 80  -- cheapest total cost.
> Path 2:  startup_cost = 10,  total_cost = 1000  -- cheapest startup cost
> Path 3:  startup_cost = 20,  total_cost = 90  -- cheapest fractional cost
>
> So with the patch you propose,  Path 1 & Path 3 are chosen to build
> append path.  but with my current patch,  Only path 3 is kept.  IIUC,
> path 3 should be the best one in this case.

I assume you mean mine would build AppendPaths for 1+2, not 1+3.

You mentioned:

> I just find it is hard to build the case for Identifier 2/3/4.

I wonder if this is because generate_orderedappend_paths() handles
startup paths and most cases will that need a cheap startup plan will
require some sort of pathkeys.

The example you mentioned of:

(select * from tenk1 order by hundred)
 UNION ALL
 (select * from tenk1 order by hundred)
 limit 3;

I don't find this to be a compellingly real-world case.  The planner
is under no obligation to output rows from the 1st branch of the UNION
ALL before the 2nd one. If the user cared about that then they'd have
instead added a top-level ORDER BY, in which case the planner seems
happy to use the index scan:

regression=# explain (costs off) (select * from tenk1) UNION ALL
(select * from tenk1) order by hundred limit 3;
                         QUERY PLAN
-------------------------------------------------------------
 Limit
   ->  Merge Append
         Sort Key: tenk1.hundred
         ->  Index Scan using tenk1_hundred on tenk1
         ->  Index Scan using tenk1_hundred on tenk1 tenk1_1

It would be good if you could provide a bit more detail on the cases
you want to improve here.  For example, if your #4 case, you have
"WHERE xxx". I don't know if "xxx" is just a base qual or if there's a
correlation to the outer query in there.

Another concern I have with your patch is that it seems to be under
the impression that there being further joins to evaluate at this
query level is the only reason that we would have to pull more than
the tuple fraction number of rows from the query. What gives you the
confidence that's the only reason we may want to pull more than the
tuple fraction of tuples from the append child?

David



Re: make add_paths_to_append_rel aware of startup cost

От
Andy Fan
Дата:


On Mon, Sep 18, 2023 at 11:58 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Mon, 18 Sept 2023 at 01:42, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> On Fri, Sep 15, 2023 at 3:15 PM David Rowley <dgrowleyml@gmail.com> wrote:
>> Instead of doing that, why don't you just create a completely new
>> AppendPath containing all the cheapest_startup_paths and add that to
>> the append rel. You can optimise this and only do it when
>> rel->consider_startup is true.
>>
>> Does the attached do anything less than what your patch does?
>
>
> We can work like this,  but  there is one difference from what
> my current patch does. It is cheapest_startup_path vs cheapest
> fraction path.  For example if we have the following 3 paths with
> all of the estimated rows is 100 and the tuple_fraction is 10.

Yeah, it's true that the version I wrote didn't consider the
fractional part, but I didn't really see it as worse than what you
did. It looks like you're assuming that every append child will have
the same number of tuples read from it, but it seems to me that it
would only be valid to use the fractional part for the first child. 
The path chosen for subsequent child paths would, if done correctly,
need to account for the estimated rows from the previous child paths.

Actually this is  consistent with what generate_union_paths does now.

 /*
* If plain UNION, tell children to fetch all tuples.
*
* Note: in UNION ALL, we pass the top-level tuple_fraction unmodified to
* each arm of the UNION ALL.  One could make a case for reducing the
* tuple fraction for later arms (discounting by the expected size of the
* earlier arms' results) but it seems not worth the trouble. The normal
* case where tuple_fraction isn't already zero is a LIMIT at top level,
* and passing it down as-is is usually enough to get the desired result
* of preferring fast-start plans.
*/
if (!op->all)
root->tuple_fraction = 0.0;

UNION ALL is pretty like append rel. 


It's not valid here to copy the code in generate_orderedappend_paths()
as MergeAppend won't necessarily exhaust the first child subpath first
like Append will.

Not sure which code you are referring to,  but the code I refer to
much is generate_union_paths and set_subquery_pathlist. 
 
> Path 1:  startup_cost = 60,  total_cost = 80  -- cheapest total cost.
> Path 2:  startup_cost = 10,  total_cost = 1000  -- cheapest startup cost
> Path 3:  startup_cost = 20,  total_cost = 90  -- cheapest fractional cost
>
> So with the patch you propose,  Path 1 & Path 3 are chosen to build
> append path.  but with my current patch,  Only path 3 is kept.  IIUC,
> path 3 should be the best one in this case.

I assume you mean mine would build AppendPaths for 1+2, not 1+3.

Yes, it should be 1+2.  
 
You mentioned:

> I just find it is hard to build the case for Identifier 2/3/4.

I wonder if this is because generate_orderedappend_paths() handles
startup paths and most cases will that need a cheap startup plan will
require some sort of pathkeys.

Probably yes.  
 
The example you mentioned of:

(select * from tenk1 order by hundred)
 UNION ALL
 (select * from tenk1 order by hundred)
 limit 3;

I don't find this to be a compellingly real-world case.  The planner
is under no obligation to output rows from the 1st branch of the UNION
ALL before the 2nd one. If the user cared about that then they'd have
instead added a top-level ORDER BY, in which case the planner seems
happy to use the index scan:

Sorry about the test case,  here is the one with more compelling 
real-world.

with s1 as (select * from tenk1 join tenk2 using (hundred)), 
s2 as (select * from tenk1 join tenk2 using (hundred)) 
select * from s1
union all 
select * from s2 
limit 3;

It would be good if you could provide a bit more detail on the cases
you want to improve here.  For example, if your #4 case, you have
"WHERE xxx". I don't know if "xxx" is just a base qual or if there's a
correlation to the outer query in there.

for the #4, the quickest test case is

select * from tenk1 where exists
(
with s1 as (select * from tenk1 join tenk2 using (hundred)), 
s2 as (select * from tenk1 join tenk2 using (hundred)) 
select * from s1
union all 
select * from s2 
where random() > 0.4);

random() is used to make it can't be pull-up.   and exists implies 
LIMIT 1; 
 

Another concern I have with your patch is that it seems to be under
the impression that there being further joins to evaluate at this
query level is the only reason that we would have to pull more than
the tuple fraction number of rows from the query. What gives you the
confidence that's the only reason we may want to pull more than the
tuple fraction of tuples from the append child?

I think you are talking about something like ORDER BY, GROUP BY 
clause, I  do overlook it.  but if we admit cheapest fractional cost
is a right factor to consider,  this issue is not unresolvable since parse
is at hand. 

At last, you ignore the part of set_subquery_pathlist. I always use it
to prove the value of the cheapest fractional cost.  Am I missing something?

        /*
         * We can safely pass the outer tuple_fraction down to the subquery if the
         * outer level has no joining, aggregation, or sorting to do. Otherwise
         * we'd better tell the subquery to plan for full retrieval. (XXX This
         * could probably be made more intelligent ...)
         */
        if (parse->hasAggs ||
                parse->groupClause ||
                parse->groupingSets ||
                root->hasHavingQual ||
                parse->distinctClause ||
                parse->sortClause ||
                has_multiple_baserels(root))
                tuple_fraction = 0.0;   /* default case */
        else
                tuple_fraction = root->tuple_fraction;

What do you think about this in my last reply?  "If my above 
analysis is correct,  I think the best way to  handle this is if there 
is no more tables to join, we use cheapest  fraction cost for all 
the kinds of relations,  including plain relation,  append rel, 
subquery and so on, If we have more tables to join,  we use
cheapest startup cost.".  This is what is in my mind now. 

--
Best Regards
Andy Fan

Re: make add_paths_to_append_rel aware of startup cost

От
Andy Fan
Дата:
Hi,
 
What do you think about this in my last reply?  "If my above 
analysis is correct,  I think the best way to  handle this is if there 
is no more tables to join, we use cheapest  fraction cost for all 
the kinds of relations,  including plain relation,  append rel, 
subquery and so on, If we have more tables to join,  we use
cheapest startup cost.".  This is what is in my mind now. 

 
Here is an updated version to show what's in my mind. 

--
Best Regards
Andy Fan
Вложения

Re: make add_paths_to_append_rel aware of startup cost

От
David Rowley
Дата:
On Mon, 18 Sept 2023 at 22:55, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> Here is an updated version to show what's in my mind.

My current thoughts on this are that the fractional cost part adds
quite a bit more complexity than the minimalistic approach of just
also considering the cheapest startup path.

There's also quite a bit I don't like about the extra code you've added:

1. RelOptInfo.tuple_fraction is not given a default value in locations
where we do makeNode(RelOptInfo);

2. This is very poorly documented and badly named. Also seems to have
a typo "stopper"

+ /* Like order by, group by, distinct and so. */
+ bool has_stoper_op;

With that, nobody has a hope of knowing if some new operation should
set that value to true or false.

I think it needs to define the meaning, which I think (very roughly)
is "does the query require any additional upper-planner operations
which could require having to read more tuples from the final join
relation than the number of tuples which are read from the final upper
rel."

3. get_fractional_path_cost() goes to the trouble of setting
total_rows then does not use it.

4. I don't see why it's ok to take the total_rows from the first Path
in the list in get_cheapest_fractional_path_ext(). What if another
Path has some other value?

But overall, I'm more inclined to just go with the more simple "add a
cheap unordered startup append path if considering cheap startup
plans" version. I see your latest patch does both. So, I'd suggest two
patches as I do see the merit in keeping this simple and cheap.  If we
can get the first part in and you still find cases where you're not
getting the most appropriate startup plan based on the tuple fraction,
then we can reconsider what extra complexity we should endure in the
code based on the example query where we've demonstrated the planner
is not choosing the best startup path appropriate to the given tuple
fraction.

David



Re: make add_paths_to_append_rel aware of startup cost

От
Andy Fan
Дата:
 Hi David, 

But overall, I'm more inclined to just go with the more simple "add a
cheap unordered startup append path if considering cheap startup
plans" version. I see your latest patch does both. So, I'd suggest two
patches as I do see the merit in keeping this simple and cheap.  If we
can get the first part in and you still find cases where you're not
getting the most appropriate startup plan based on the tuple fraction,
then we can reconsider what extra complexity we should endure in the
code based on the example query where we've demonstrated the planner
is not choosing the best startup path appropriate to the given tuple
fraction.

I think this is a fair point,  I agree that your first part is good enough to be 
committed first.   Actually I tried a lot to make a test case which can  prove
the value of cheapest fractional cost but no gain so far:(  

--
Best Regards
Andy Fan

Re: make add_paths_to_append_rel aware of startup cost

От
David Rowley
Дата:
On Sun, 1 Oct 2023 at 21:26, Andy Fan <zhihui.fan1213@gmail.com> wrote:
>> But overall, I'm more inclined to just go with the more simple "add a
>> cheap unordered startup append path if considering cheap startup
>> plans" version. I see your latest patch does both. So, I'd suggest two
>> patches as I do see the merit in keeping this simple and cheap.  If we
>> can get the first part in and you still find cases where you're not
>> getting the most appropriate startup plan based on the tuple fraction,
>> then we can reconsider what extra complexity we should endure in the
>> code based on the example query where we've demonstrated the planner
>> is not choosing the best startup path appropriate to the given tuple
>> fraction.
>
> I think this is a fair point,  I agree that your first part is good enough to be
> committed first.   Actually I tried a lot to make a test case which can  prove
> the value of cheapest fractional cost but no gain so far:(

I've attached a patch with the same code as the previous patch but
this time including a regression test.

I see no reason to not commit this so if anyone feels differently
please let me know.

David

Вложения

Re: make add_paths_to_append_rel aware of startup cost

От
Andy Fan
Дата:


On Wed, Oct 4, 2023 at 8:41 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Sun, 1 Oct 2023 at 21:26, Andy Fan <zhihui.fan1213@gmail.com> wrote:
>> But overall, I'm more inclined to just go with the more simple "add a
>> cheap unordered startup append path if considering cheap startup
>> plans" version. I see your latest patch does both. So, I'd suggest two
>> patches as I do see the merit in keeping this simple and cheap.  If we
>> can get the first part in and you still find cases where you're not
>> getting the most appropriate startup plan based on the tuple fraction,
>> then we can reconsider what extra complexity we should endure in the
>> code based on the example query where we've demonstrated the planner
>> is not choosing the best startup path appropriate to the given tuple
>> fraction.
>
> I think this is a fair point,  I agree that your first part is good enough to be
> committed first.   Actually I tried a lot to make a test case which can  prove
> the value of cheapest fractional cost but no gain so far:(

I've attached a patch with the same code as the previous patch but
this time including a regression test.

I see no reason to not commit this so if anyone feels differently
please let me know.

David

Patch LGTM.

--
Best Regards
Andy Fan

Re: make add_paths_to_append_rel aware of startup cost

От
David Rowley
Дата:
On Thu, 5 Oct 2023 at 14:11, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> Patch LGTM.

Thanks. Pushed.

David



Re: make add_paths_to_append_rel aware of startup cost

От
Thomas Munro
Дата:
On Thu, Oct 5, 2023 at 9:07 PM David Rowley <dgrowleyml@gmail.com> wrote:
> Thanks. Pushed.

FYI somehow this plan from a8a968a8212e flipped in this run:

=== dumping /home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/regression.diffs
===
diff -U3 /home/bf/bf-build/mylodon/HEAD/pgsql/src/test/regress/expected/union.out
/home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/union.out
--- /home/bf/bf-build/mylodon/HEAD/pgsql/src/test/regress/expected/union.out
2024-01-15 00:31:13.947555940 +0000
+++ /home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/union.out
2024-02-14 00:06:17.075584839 +0000
@@ -1447,9 +1447,9 @@
    ->  Append
          ->  Nested Loop
                Join Filter: (t1.tenthous = t2.tenthous)
-               ->  Seq Scan on tenk1 t1
+               ->  Seq Scan on tenk2 t2
                ->  Materialize
-                     ->  Seq Scan on tenk2 t2
+                     ->  Seq Scan on tenk1 t1
          ->  Result
 (8 rows)

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon&dt=2024-02-14%2000%3A01%3A03



Re: make add_paths_to_append_rel aware of startup cost

От
Andy Fan
Дата:
Thomas Munro <thomas.munro@gmail.com> writes:

> On Thu, Oct 5, 2023 at 9:07 PM David Rowley <dgrowleyml@gmail.com> wrote:
>> Thanks. Pushed.
>
> FYI somehow this plan from a8a968a8212e flipped in this run:
>
> === dumping /home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/regression.diffs
> ===
> diff -U3 /home/bf/bf-build/mylodon/HEAD/pgsql/src/test/regress/expected/union.out
> /home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/union.out
> --- /home/bf/bf-build/mylodon/HEAD/pgsql/src/test/regress/expected/union.out
> 2024-01-15 00:31:13.947555940 +0000
> +++ /home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/union.out
> 2024-02-14 00:06:17.075584839 +0000
> @@ -1447,9 +1447,9 @@
>     ->  Append
>           ->  Nested Loop
>                 Join Filter: (t1.tenthous = t2.tenthous)
> -               ->  Seq Scan on tenk1 t1
> +               ->  Seq Scan on tenk2 t2
>                 ->  Materialize
> -                     ->  Seq Scan on tenk2 t2
> +                     ->  Seq Scan on tenk1 t1
>           ->  Result
>  (8 rows)
>
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon&dt=2024-02-14%2000%3A01%3A03

Thanks for this information! I will take a look at this.

--
Best Regards
Andy Fan




Re: make add_paths_to_append_rel aware of startup cost

От
Andy Fan
Дата:
Andy Fan <zhihuifan1213@163.com> writes:

> Thomas Munro <thomas.munro@gmail.com> writes:
>
>> On Thu, Oct 5, 2023 at 9:07 PM David Rowley <dgrowleyml@gmail.com> wrote:
>>> Thanks. Pushed.
>>
>> FYI somehow this plan from a8a968a8212e flipped in this run:
>>
>> === dumping /home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/regression.diffs
>> ===
>> diff -U3 /home/bf/bf-build/mylodon/HEAD/pgsql/src/test/regress/expected/union.out
>> /home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/union.out
>> --- /home/bf/bf-build/mylodon/HEAD/pgsql/src/test/regress/expected/union.out
>> 2024-01-15 00:31:13.947555940 +0000
>> +++ /home/bf/bf-build/mylodon/HEAD/pgsql.build/testrun/recovery/027_stream_regress/data/results/union.out
>> 2024-02-14 00:06:17.075584839 +0000
>> @@ -1447,9 +1447,9 @@
>>     ->  Append
>>           ->  Nested Loop
>>                 Join Filter: (t1.tenthous = t2.tenthous)
>> -               ->  Seq Scan on tenk1 t1
>> +               ->  Seq Scan on tenk2 t2
>>                 ->  Materialize
>> -                     ->  Seq Scan on tenk2 t2
>> +                     ->  Seq Scan on tenk1 t1
>>           ->  Result
>>  (8 rows)
>>
>> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon&dt=2024-02-14%2000%3A01%3A03
>
> Thanks for this information! I will take a look at this.

I found the both plans have the same cost, I can't get the accurate
cause of this after some hours research, but it is pretty similar with
7516056c584e3, so I uses a similar strategy to stable it. is it
acceptable?

--
Best Regards
Andy Fan


Вложения

Re: make add_paths_to_append_rel aware of startup cost

От
David Rowley
Дата:
On Thu, 15 Feb 2024 at 21:42, Andy Fan <zhihuifan1213@163.com> wrote:
> I found the both plans have the same cost, I can't get the accurate
> cause of this after some hours research, but it is pretty similar with
> 7516056c584e3, so I uses a similar strategy to stable it. is it
> acceptable?

It's pretty hard to say.  I can only guess why this test would be
flapping like this. I see it's happened before on mylodon, so probably
not a cosmic ray.  It's not like add_path() chooses a random path when
the costs are the same, so I wondered if something similar is going on
here that was going on that led to f03a9ca4. In particular, see [1].

On master, I set a breakpoint in try_nestloop_path() to break on
"outerrel->relid==1 && innerrel->relid==2". I see the total Nested
Loop cost comes out the same with the join order reversed.

Which is:

 ->  Nested Loop  (cost=0.00..1500915.00 rows=10000 width=4)

Doing the same with your patch applied, I get:

->  Nested Loop  (cost=0.00..600925.00 rows=4000 width=4)

and forcing the join order to swap with the debugger, I see:

->  Nested Loop  (cost=0.00..600940.00 rows=4000 width=4)

So there's a difference now, but it's quite small. If it was a problem
like we had on [1], then since tenk1 and tenk2 have 345 pages (on my
machine), if relpages is down 1 or 2 pages, we'll likely get more of a
costing difference than 600925 vs 600940.

If I use:

explain
select t1.unique1 from tenk1 t1
inner join tenk2 t2 on t1.tenthous = t2.tenthous and t2.thousand = 0
union all
(values(1)) limit 1;

I get:

->  Nested Loop  (cost=0.00..2415.03 rows=10 width=4)

and with the join order reversed, I get:

 ->  Nested Loop  (cost=0.00..2440.00 rows=10 width=4)

I'd be more happy using this one as percentage-wise, the cost
difference is much larger.  I don't quite have the will to go through
proving what the actual problem is here. I think [1] already proved
the relpages problem can (or could) happen.

I checked that the t2.thounsand = 0 query still tests the cheap
startup paths in add_paths_to_append_rel() and it does. If I flip
startup_subpaths_valid to false in the debugger, the plan flips to:

                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Limit  (cost=470.12..514.00 rows=1 width=4)
   ->  Append  (cost=470.12..952.79 rows=11 width=4)
         ->  Hash Join  (cost=470.12..952.73 rows=10 width=4)
               Hash Cond: (t1.tenthous = t2.tenthous)
               ->  Seq Scan on tenk1 t1  (cost=0.00..445.00 rows=10000 width=8)
               ->  Hash  (cost=470.00..470.00 rows=10 width=4)
                     ->  Seq Scan on tenk2 t2  (cost=0.00..470.00
rows=10 width=4)
                           Filter: (thousand = 0)
         ->  Result  (cost=0.00..0.01 rows=1 width=4)

So, if nobody has any better ideas, I'm just going to push the " and
t2.thousand = 0" adjustment.

David

[1] https://www.postgresql.org/message-id/4174.1563239552%40sss.pgh.pa.us



Re: make add_paths_to_append_rel aware of startup cost

От
Andy Fan
Дата:
David Rowley <dgrowleyml@gmail.com> writes:

> I'd be more happy using this one as percentage-wise, the cost
> difference is much larger.

+1 for the percentage-wise.
>
> I checked that the t2.thounsand = 0 query still tests the cheap
> startup paths in add_paths_to_append_rel().

I get the same conclusion here. 
>
> So, if nobody has any better ideas, I'm just going to push the " and
> t2.thousand = 0" adjustment.

LGTM.

-- 
Best Regards
Andy Fan




Re: make add_paths_to_append_rel aware of startup cost

От
David Rowley
Дата:
On Fri, 16 Feb 2024 at 01:09, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Thu, 15 Feb 2024 at 21:42, Andy Fan <zhihuifan1213@163.com> wrote:
> > I found the both plans have the same cost, I can't get the accurate
> > cause of this after some hours research, but it is pretty similar with
> > 7516056c584e3, so I uses a similar strategy to stable it. is it
> > acceptable?
>
> It's pretty hard to say.  I can only guess why this test would be
> flapping like this. I see it's happened before on mylodon, so probably
> not a cosmic ray.  It's not like add_path() chooses a random path when
> the costs are the same, so I wondered if something similar is going on
> here that was going on that led to f03a9ca4. In particular, see [1].

While it's not conclusive proof, the following demonstrates that
relpages dropping by just 1 page causes the join order to change.

regression=# explain
regression-# select t1.unique1 from tenk1 t1
regression-# inner join tenk2 t2 on t1.tenthous = t2.tenthous
regression-# union all
regression-# (values(1)) limit 1;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Limit  (cost=0.00..150.08 rows=1 width=4)
   ->  Append  (cost=0.00..1500965.01 rows=10001 width=4)
         ->  Nested Loop  (cost=0.00..1500915.00 rows=10000 width=4)
               Join Filter: (t1.tenthous = t2.tenthous)
               ->  Seq Scan on tenk1 t1  (cost=0.00..445.00 rows=10000 width=8)
               ->  Materialize  (cost=0.00..495.00 rows=10000 width=4)
                     ->  Seq Scan on tenk2 t2  (cost=0.00..445.00
rows=10000 width=4)
         ->  Result  (cost=0.00..0.01 rows=1 width=4)

regression=# update pg_class set relpages=relpages - 1 where relname = 'tenk2';
UPDATE 1
regression=# explain
regression-# select t1.unique1 from tenk1 t1
regression-# inner join tenk2 t2 on t1.tenthous = t2.tenthous
regression-# union all
regression-# (values(1)) limit 1;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Limit  (cost=0.00..150.52 rows=1 width=4)
   ->  Append  (cost=0.00..1505315.30 rows=10001 width=4)
         ->  Nested Loop  (cost=0.00..1505265.29 rows=10000 width=4)
               Join Filter: (t1.tenthous = t2.tenthous)
               ->  Seq Scan on tenk2 t2  (cost=0.00..445.29 rows=10029 width=4)
               ->  Materialize  (cost=0.00..495.00 rows=10000 width=8)
                     ->  Seq Scan on tenk1 t1  (cost=0.00..445.00
rows=10000 width=8)
         ->  Result  (cost=0.00..0.01 rows=1 width=4)

I tried this with the proposed changes to the test and the plan did not change.

I've pushed the change now.

David