Thread: Improve choose_custom_plan for initial partition prune case


Improve choose_custom_plan for initial partition prune case

From: Andy Fan
Date:
Given the plan example: 

CREATE TABLE measurement (
    city_id         int not null,
    logdate         date not null,
    peaktemp        int,
    unitsales       int
) PARTITION BY RANGE (logdate);

CREATE TABLE measurement_y2006m02 PARTITION OF measurement
    FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');

CREATE TABLE measurement_y2006m03 PARTITION OF measurement
    FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');

prepare s as select * from measurement where logdate = $1;
execute s('2006-02-01');

The generic plan will probably not be chosen because its cost still includes the
partitions that are only removed at initial_prune time, while the custom plan removes
that cost at planning time, which makes the cost comparison unfair.  I'm wondering
whether we can estimate the cost reduction that initial_prune would give the generic
plan, based on the partitions pruned at plan time for the custom plan, and then
subtract that cost from the generic plan.  I have only gone through the related code
so far and haven't written anything yet; I'd like to know whether this is a sensible
direction to take.
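
For concreteness, a minimal reproduction sketch of the behaviour I mean, using the
statement s prepared above (it assumes the default plan_cache_mode = auto, and the
exact outcome of course depends on the data and statistics):

-- The plan cache builds custom plans for the first several executions
-- (five, in current releases) while it accumulates an average custom-plan cost.
execute s('2006-02-01');
execute s('2006-02-01');
execute s('2006-02-01');
execute s('2006-02-01');
execute s('2006-02-01');

-- From here on the generic plan is also considered, but its cost still includes
-- the partition that only initial pruning would remove; if that keeps it above
-- the average custom-plan cost, EXPLAIN keeps showing a single-partition custom plan.
explain execute s('2006-02-01');

-- Forcing the generic plan shows the Append whose extra subplan is only removed
-- at executor startup (reported as "Subplans Removed: 1").
set plan_cache_mode to force_generic_plan;
explain execute s('2006-02-01');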

--
Best Regards
Andy Fan

Re: Improve choose_custom_plan for initial partition prune case

From: Amit Langote
Date:
Hi Andy,

On Fri, Oct 2, 2020 at 1:04 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>
> Given the plan example:
>
> CREATE TABLE measurement (
>     city_id         int not null,
>     logdate         date not null,
>     peaktemp        int,
>     unitsales       int
> ) PARTITION BY RANGE (logdate);
>
> CREATE TABLE measurement_y2006m02 PARTITION OF measurement
>     FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
>
> CREATE TABLE measurement_y2006m03 PARTITION OF measurement
>     FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
>
> prepare s as select * from measurement where logdate = $1;
> execute s('2006-02-01').
>
> The generic plan will probably not be chosen because it doesn't reduce the cost
> which can be reduced at initial_prune while the custom plan reduces such cost
> at  planning time. which makes the cost comparison not fair.

I agree that there is something to be done here.  Actually, I think we
should try to find a solution that will allow us to consider not just
"initial" pruning, but also "execution-time" pruning.   The latter
will allow a nested loop join whose inner side scans a partitioned
table using a parameterized scan on the partition key to be favored
over other join plans, because that parameterized scan can use
execution-time pruning which can make the inner scan very cheap.
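
To make that concrete, here is a hypothetical query shape (the events table is
invented purely for illustration) in which the inner side of a nested loop scans the
partitioned table with a join clause on the partition key, so execution-time pruning
can skip all but one partition for every outer row:

-- Hypothetical driving table, for illustration only.
create table events (event_id int, logdate date not null);

-- If this is planned as a nested loop with measurement on the inner side,
-- m.logdate = e.logdate acts as a parameterized condition on the partition
-- key, and run-time pruning can discard the other partitions on each rescan
-- of the inner side.
select e.event_id, m.peaktemp
  from events e
  join measurement m on m.logdate = e.logdate
 where e.event_id = 42;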

>  I'm thinking if we can
> get an estimated cost reduction of initial_prunne for generic plan based on the
> partition pruned at plan time from custom plan and then reducing
> such costs from the generic plan.  I just went through the related code but
> didn't write anything now.  I'd like to see if this is a correct direction to go.

That's an interesting idea, that is, to try to do this totally outside
the planner.  When I was thinking about this a little while ago, I was
trying to find a way to adjust the cost of the plan in the planner
itself by looking at the runtime pruning info in the nodes that
support it, that is, Append, MergeAppend.  Actually, such an approach
had also come up in the original run-time pruning discussion [1].

--
Amit Langote
EDB: http://www.enterprisedb.com

[1] https://www.postgresql.org/message-id/CA%2BTgmoZHYoAL4HYwnGO25B8CxCB%2BvNMdf%2B7rbUzYykR4sU9yUA%40mail.gmail.com



Re: Improve choose_custom_plan for initial partition prune case

From: Andy Fan
Date:
Hi Amit:

  Very glad to see your comment!

On Fri, Oct 2, 2020 at 4:21 PM Amit Langote <amitlangote09@gmail.com> wrote:
Hi Andy,

On Fri, Oct 2, 2020 at 1:04 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>
> Given the plan example:
>
> CREATE TABLE measurement (
>     city_id         int not null,
>     logdate         date not null,
>     peaktemp        int,
>     unitsales       int
> ) PARTITION BY RANGE (logdate);
>
> CREATE TABLE measurement_y2006m02 PARTITION OF measurement
>     FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
>
> CREATE TABLE measurement_y2006m03 PARTITION OF measurement
>     FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
>
> prepare s as select * from measurement where logdate = $1;
> execute s('2006-02-01').
>
> The generic plan will probably not be chosen because it doesn't reduce the cost
> which can be reduced at initial_prune while the custom plan reduces such cost
> at  planning time. which makes the cost comparison not fair.

I agree that there is something to be done here.  Actually, I think we
should try to find a solution that will allow us to consider not just
"initial" pruning, but also "execution-time" pruning.   The latter
will allow a nested loop join whose inner side scans a partitioned
table using a parameterized scan on the partition key to be favored
over other join plans, because that parameterized scan can use
execution-time pruning which can make the inner scan very cheap.


That sounds like it would also address another important partition pruning issue,
one that arises entirely at planning time (no generic or custom plan involved):
for example, choosing between a nested-loop plan that can use
run-time partition pruning and a hash join that can't.  I am restating your idea
just to make sure I understand you correctly.
 
>  I'm thinking if we can
> get an estimated cost reduction of initial_prunne for generic plan based on the
> partition pruned at plan time from custom plan and then reducing
> such costs from the generic plan.  I just went through the related code but
> didn't write anything now.  I'd like to see if this is a correct direction to go.

That's an interesting idea, that is, to try to do this totally outside
the planner.  When I was thinking about this a little while ago, I was
trying to find a way to adjust the cost of the plan in the planner
itself by looking at the runtime pruning info in the nodes that
support it, that is, Append, MergeAppend.  Actually, such an approach
had also come up in the original run-time pruning discussion [1].

 
Thank you for your comments.  It looks like your approach could help with
both cases, and I thought about it a bit as well.  However, it looks
complex (to me), and I would prefer to estimate how many partitions can be
pruned from real data, even if that data comes from past executions (I assume
that will not make much difference in practice).

I'm not sure whether I should treat Robert's comments [1] as opposition to this idea,
but I think there are some small differences.  I'd like to implement my idea
soon, and I'm glad to see opposing ideas at any time, of course the sooner
the better :)


--
Best Regards
Andy Fan

Re: Improve choose_custom_plan for initial partition prune case

From: Ashutosh Bapat
Date:
On Thu, Oct 1, 2020 at 9:34 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>
> Given the plan example:
>
> CREATE TABLE measurement (
>     city_id         int not null,
>     logdate         date not null,
>     peaktemp        int,
>     unitsales       int
> ) PARTITION BY RANGE (logdate);
>
> CREATE TABLE measurement_y2006m02 PARTITION OF measurement
>     FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
>
> CREATE TABLE measurement_y2006m03 PARTITION OF measurement
>     FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
>
> prepare s as select * from measurement where logdate = $1;
> execute s('2006-02-01').
>
> The generic plan will probably not be chosen because it doesn't reduce the cost
> which can be reduced at initial_prune while the custom plan reduces such cost
> at  planning time. which makes the cost comparison not fair.  I'm thinking if we can
> get an estimated cost reduction of initial_prunne for generic plan based on the
> partition pruned at plan time from custom plan and then reducing
> such costs from the generic plan.  I just went through the related code but
> didn't write anything now.  I'd like to see if this is a correct direction to go.

I think the end result will depend upon the values passed in the first
few executions of the prepared statement.  If they happen to produce estimates
similar to or higher than the generic plan's, the generic plan will be
chosen later; but if they are much lower than the generic plan's estimates,
a custom plan might keep being chosen.  If we moderate the generic plan's cost
based on the custom plans, what happens when we later execute the statement
with values whose estimates are similar to the generic plan's?

If the partition key is well distributed, which also results in a good
distribution of data across partitions, the generic plan's cost will be
similar to the custom plans' costs; if not, that's something we want to fix.
But if there's large data skew, letting the custom plan always win is
probably better.  [1] talks about the generic plan not being chosen just
because it has a higher cost even though its shape is similar to the custom
plan's.  Leveraging that idea might be a good option: if the custom plans
being produced have shapes similar to the generic plan, stop producing them.

[1] https://www.postgresql.org/message-id/CA%2BTgmoZHYoAL4HYwnGO25B8CxCB%2BvNMdf%2B7rbUzYykR4sU9yUA%40mail.gmail.com

-- 
Best Wishes,
Ashutosh Bapat



Re: Improve choose_custom_plan for initial partition prune case

From: Andy Fan
Date:
Hi Ashutosh:

Thanks for coming. 

On Mon, Oct 5, 2020 at 9:27 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
On Thu, Oct 1, 2020 at 9:34 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>
> Given the plan example:
>
> CREATE TABLE measurement (
>     city_id         int not null,
>     logdate         date not null,
>     peaktemp        int,
>     unitsales       int
> ) PARTITION BY RANGE (logdate);
>
> CREATE TABLE measurement_y2006m02 PARTITION OF measurement
>     FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
>
> CREATE TABLE measurement_y2006m03 PARTITION OF measurement
>     FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
>
> prepare s as select * from measurement where logdate = $1;
> execute s('2006-02-01').
>
> The generic plan will probably not be chosen because it doesn't reduce the cost
> which can be reduced at initial_prune while the custom plan reduces such cost
> at  planning time. which makes the cost comparison not fair.  I'm thinking if we can
> get an estimated cost reduction of initial_prunne for generic plan based on the
> partition pruned at plan time from custom plan and then reducing
> such costs from the generic plan.  I just went through the related code but
> didn't write anything now.  I'd like to see if this is a correct direction to go.

What happens when we
execute plans with values that have estimates similar to the generic
plan later when we moderate generic plan costs based on the custom
plans?


In the example at the beginning of this thread, I used exactly the same value
every time, and the custom plan is chosen every time, which is bad.
The main reason is that the custom plan sees the exact value in the Execute
message, so it does plan-time partition pruning and its total cost is low.  For
the generic plan, the pruning only happens at the run-time initial-pruning
stage, so the cost of the partitions that can be pruned at that stage is still
included in the total cost, and the generic plan can never be chosen.  That is
what I want to fix.

If the table has good distribution of a partition key, which also
results in good distribution of data across partitions, generic plan
cost will be similar to the custom plan costs. If not that's something
we want to fix.
 
[1] talks about generic plan being
not chosen just because it has higher cost even though its shape is
similar to a custom plan. Leveraging that idea might be a good option.
If the custom plans produced have shapes similar to the generic plan,
stop producing those.

If I understand you correctly,  the issue I want to fix here matches this goal. 

--
Best Regards
Andy Fan

Re: Improve choose_custom_plan for initial partition prune case

From: Andy Fan
Date:


On Sat, Oct 3, 2020 at 10:05 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
Hi Amit:

  Very glad to see your comment!

On Fri, Oct 2, 2020 at 4:21 PM Amit Langote <amitlangote09@gmail.com> wrote:
Hi Andy,

On Fri, Oct 2, 2020 at 1:04 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>
> Given the plan example:
>
> CREATE TABLE measurement (
>     city_id         int not null,
>     logdate         date not null,
>     peaktemp        int,
>     unitsales       int
> ) PARTITION BY RANGE (logdate);
>
> CREATE TABLE measurement_y2006m02 PARTITION OF measurement
>     FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
>
> CREATE TABLE measurement_y2006m03 PARTITION OF measurement
>     FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
>
> prepare s as select * from measurement where logdate = $1;
> execute s('2006-02-01').
>
> The generic plan will probably not be chosen because it doesn't reduce the cost
> which can be reduced at initial_prune while the custom plan reduces such cost
> at  planning time. which makes the cost comparison not fair.

I agree that there is something to be done here.  Actually, I think we
should try to find a solution that will allow us to consider not just
"initial" pruning, but also "execution-time" pruning.   The latter
will allow a nested loop join whose inner side scans a partitioned
table using a parameterized scan on the partition key to be favored
over other join plans, because that parameterized scan can use
execution-time pruning which can make the inner scan very cheap.


This looks like to resolve another important issue of partition prune, which
may happen at planning time totally (no generic plan or custom plan involved). 
for example between choosing a Nest Loop plan which can use
some run-time partition prune and hash join which can't.  I "repeat" your idea
just to  make sure I understand you correctly. 
 
>  I'm thinking if we can
> get an estimated cost reduction of initial_prunne for generic plan based on the
> partition pruned at plan time from custom plan and then reducing
> such costs from the generic plan.  I just went through the related code but
> didn't write anything now.  I'd like to see if this is a correct direction to go.

That's an interesting idea, that is, to try to do this totally outside
the planner.  When I was thinking about this a little while ago, I was
trying to find a way to adjust the cost of the plan in the planner
itself by looking at the runtime pruning info in the nodes that
support it, that is, Append, MergeAppend.  Actually, such an approach
had also come up in the original run-time pruning discussion [1].

 
Thank you for your comments. Looks like your approach can be helpful
for the both cases, and I did think a bit for that as well, However, that looks
complex (for me) AND I am prefer to guess how many partitions can be
pruned with real data even it is the real data in the past (I assume that 
will not cause too much difference in practice). 

I'm not sure if I should treat Robert's comments as an opposed idea[1] , 
but I think there are some little differences.  I'd like to implement my idea
soon, and I'm glad to see any opposed idea at any time, of course the sooner 
the better:) 

After a few days of coding in this direction, I have run into an upsetting problem.

The main idea here is that if N partitions survive plan-time partition pruning in the
custom plan, then I assume a similar set of partitions will survive initial partition
pruning in the generic plan, so we should subtract the corresponding costs from the
generic plan.

However, the hard part of the implementation is that we can't associate an Append
node in the generic plan with the surviving-partition information from the custom
plan, even though the Append node has apprelids in it.  Two ways to match them:

1. Associate them by rtindex.  The final rtindex is affected by glob->finalrtable,
but glob->finalrtable includes the child partitions, and the custom plan ends up
with a smaller finalrtable, so the rtindexes can't be matched.

2. Associate them by relation OID, and record that information in the
Append node as well.  The downside is that the same relation OID may appear multiple
times in a query, for example: SELECT .. FROM p p1, p p2 WHERE p1.partkey1 = $1
AND p2.partkey2 = $2;  (a concrete sketch of this case follows)
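
For concreteness, here is a hypothetical instance of that ambiguity using the
measurement table from upthread; the two references expand to two separate Append
nodes whose surviving partitions can differ, so a mapping keyed only on relation OID
cannot tell which pruning result belongs to which Append:

-- Illustrative only: both m1 and m2 expand to Appends over the same
-- measurement_y2006m* children, so their OIDs collide even though each
-- Append may keep a different partition after pruning.
prepare s2 as
  select *
    from measurement m1
    join measurement m2 on m1.city_id = m2.city_id
   where m1.logdate = $1
     and m2.logdate = $2;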

So any hint on this would be appreciated.
 
--
Best Regards
Andy Fan

Re: Improve choose_custom_plan for initial partition prune case

From: Andy Fan
Date:

2. Associate them with RelationOid, and we can record such information in the
 Append node as well. The bad part is the same relation Oid may appear multiple
 times in a query. for example: SELECT .. FROM p p1,  p p2 where p1.partkey1 = $1
 AND p2.partkey2 = $2;


I just came up with a new idea.  Since this situation should be rare, we can simply fall back
to the original method (ignore the cost reduction entirely) or use an average number.  Fixing
99% of the cases would already be a big win, IMO.

So any hint on this will be appreciated..
 
--
Best Regards
Andy Fan


--
Best Regards
Andy Fan

Re: Improve choose_custom_plan for initial partition prune case

From: Andy Fan
Date:


On Wed, Oct 7, 2020 at 2:43 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:

2. Associate them with RelationOid, and we can record such information in the
 Append node as well. The bad part is the same relation Oid may appear multiple
 times in a query. for example: SELECT .. FROM p p1,  p p2 where p1.partkey1 = $1
 AND p2.partkey2 = $2;


I just came up with a new idea.  Since this situation should be rare, we can just come back
to our original method (totally ignore the cost reduction) or use the average number.  Fixing
the 99% cases would be a big winner as well IMO.


I just uploaded a runnable patch for this idea, but it looks like the design is wrong
from the start.  For example, given a plan shape like:

Nest Loop
    ->  Append
          ->  p_1
          ->  p_2
    ->  inner

the patch only reduces the cost of the Append node, but since the loop count
of the inner plan is reduced as well, that cost should be reduced too.  And even if
we could also reduce the join cost, that still would not be a smart solution, because
the shape of the generic plan itself might already be wrong.
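
For illustration, here is a hypothetical query of roughly that shape (p is the
partitioned table, t an invented ordinary table): pruning p down to one partition
also reduces how many times the inner side is rescanned, which an Append-only cost
adjustment misses.

-- Hypothetical shape only.
prepare j as
  select *
    from p
    join t on t.id = p.ref_id
   where p.partkey = $1;
-- If initial pruning removes p_2, the outer (Append) row count drops, so the
-- inner side of the nested loop is executed fewer times; reducing only the
-- Append's cost does not account for that saving.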

--
Best Regards
Andy Fan
Attachments

Re: Improve choose_custom_plan for initial partition prune case

From: Ashutosh Bapat
Date:
On Wed, Oct 7, 2020 at 11:20 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>
> Hi Ashutosh:
>
> Thanks for coming.
>
> On Mon, Oct 5, 2020 at 9:27 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
>>
>> On Thu, Oct 1, 2020 at 9:34 PM Andy Fan <zhihui.fan1213@gmail.com> wrote:
>> >
>> > Given the plan example:
>> >
>> > CREATE TABLE measurement (
>> >     city_id         int not null,
>> >     logdate         date not null,
>> >     peaktemp        int,
>> >     unitsales       int
>> > ) PARTITION BY RANGE (logdate);
>> >
>> > CREATE TABLE measurement_y2006m02 PARTITION OF measurement
>> >     FOR VALUES FROM ('2006-02-01') TO ('2006-03-01');
>> >
>> > CREATE TABLE measurement_y2006m03 PARTITION OF measurement
>> >     FOR VALUES FROM ('2006-03-01') TO ('2006-04-01');
>> >
>> > prepare s as select * from measurement where logdate = $1;
>> > execute s('2006-02-01').
>> >
>> > The generic plan will probably not be chosen because it doesn't reduce the cost
>> > which can be reduced at initial_prune while the custom plan reduces such cost
>> > at  planning time. which makes the cost comparison not fair.  I'm thinking if we can
>> > get an estimated cost reduction of initial_prunne for generic plan based on the
>> > partition pruned at plan time from custom plan and then reducing
>> > such costs from the generic plan.  I just went through the related code but
>> > didn't write anything now.  I'd like to see if this is a correct direction to go.
>>
>> What happens when we
>> execute plans with values that have estimates similar to the generic
>> plan later when we moderate generic plan costs based on the custom
>> plans?
>>
>
> The example at the beginning of this thread,  I used the exact same values
> every time, the custom plan will be chosen all the time, which is bad,
> The main reason is the custom plan knows the exact value in Execute
> message, so it run plan time partition prune, then the total cost is low, however
> for the generic plan the partition prune happens at Runtime initial_partition prune
> stage,  so the cost of the partitions which can be pruned at that stage is still
> included the total cost,  so generic plans can't be chosen. that would be the
> thing I want to fix.

Something is wrong in the generic plan costing then.  IIUC, the
row estimate should come out >= 1 for only a single partition;
for all the other partitions it should be 1, and only because we
clamp the row counts.  So the final costs for generic and custom plans
shouldn't be far off unless there's a large deviation in the selectivity
of the partition key.  I am assuming that there's an equality condition
on the partition key.  That's what I meant by the paragraph below.

>
>> If the table has good distribution of a partition key, which also
>> results in good distribution of data across partitions, generic plan
>> cost will be similar to the custom plan costs. If not that's something
>> we want to fix.

Can you please investigate on these lines?

-- 
Best Wishes,
Ashutosh Bapat



Re: Improve choose_custom_plan for initial partition prune case

From: Amit Langote
Date:
Hi Ashutosh,

On Wed, Oct 7, 2020 at 8:40 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
> On Wed, Oct 7, 2020 at 11:20 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
> > On Mon, Oct 5, 2020 at 9:27 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
> >> What happens when we
> >> execute plans with values that have estimates similar to the generic
> >> plan later when we moderate generic plan costs based on the custom
> >> plans?
> >>
> >
> > The example at the beginning of this thread,  I used the exact same values
> > every time, the custom plan will be chosen all the time, which is bad,
> > The main reason is the custom plan knows the exact value in Execute
> > message, so it run plan time partition prune, then the total cost is low, however
> > for the generic plan the partition prune happens at Runtime initial_partition prune
> > stage,  so the cost of the partitions which can be pruned at that stage is still
> > included the total cost,  so generic plans can't be chosen. that would be the
> > thing I want to fix.
>
> Something is wrong in the generic plan costing then.

That's right.

> IIUC, the
> selectivity estimate for only a single partition should come out >= 1.
> For all the other partitions, it should be 1 and that too because we
> clamp the row counts. So the final costs for generic and custom plans
> shouldn't be far off unless there's large deviation in the selectivity
> of a partition key. I am assuming that there's an equality condition
> on a partition key. That's what I meant by the paragraph below.
> >
> >> If the table has good distribution of a partition key, which also
> >> results in good distribution of data across partitions, generic plan
> >> cost will be similar to the custom plan costs. If not that's something
> >> we want to fix.
>
> Can you please investigate on these lines?

The planner currently costs a generic plan assuming that *all*
partitions will be scanned, which is not the case, because run-time
pruning will ensure that scan nodes of all but one partition will be
discarded, assuming a simple query like `select * from parted_table
where a = $1`.  With N partitions, a generic plan for such a query is
N times as expensive as a custom plan, so it will never win.

For example:

create table bar (a int, b int, c int) partition by range (a);
create table bar_1 partition of bar for values from (1) to (100001);
create table bar_2 partition of bar for values from (100001) to (200001);
create table bar_3 partition of bar for values from (200001) to (300001);
create table bar_4 partition of bar for values from (300001) to (400001);
create table bar_5 partition of bar for values from (400001) to (500001);
create table bar_6 partition of bar for values from (500001) to (600001);
create table bar_7 partition of bar for values from (600001) to (700001);
create table bar_8 partition of bar for values from (700001) to (800001);
create table bar_9 partition of bar for values from (800001) to (900001);
create table bar_10 partition of bar for values from (900001) to (1000001);
insert into bar select generate_series(1, 1000000);
create index on bar (a);
analyze bar;

set plan_cache_mode to force_custom_plan ;
prepare q as select * from bar where a = $1;
explain execute q (1);
                                  QUERY PLAN
------------------------------------------------------------------------------
 Index Scan using bar_1_a_idx on bar_1 bar  (cost=0.29..8.31 rows=1 width=12)
   Index Cond: (a = 1)
(2 rows)

deallocate q;
set plan_cache_mode to force_generic_plan ;
prepare q as select * from bar where a = $1;
explain execute q (1);
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Append  (cost=0.29..83.15 rows=10 width=12)
   Subplans Removed: 9
   ->  Index Scan using bar_1_a_idx on bar_1  (cost=0.29..8.31 rows=1 width=12)
         Index Cond: (a = $1)
(4 rows)

As you can see, the cost of the generic plan is 83.15, which is roughly 8.31 *
10, even though only one partition is actually scanned, so the actual
cost is about the same as the custom plan's.  The planner fails to consider that
those 9 subplans will be removed during execution.

-- 
Amit Langote
EDB: http://www.enterprisedb.com



Re: Improve choose_custom_plan for initial partition prune case

From: Andy Fan
Date:
Thank you Amit and Ashutosh for your replies!

On Wed, Oct 7, 2020 at 8:41 PM Amit Langote <amitlangote09@gmail.com> wrote:
Hi Ashutosh,

On Wed, Oct 7, 2020 at 8:40 PM Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
> On Wed, Oct 7, 2020 at 11:20 AM Andy Fan <zhihui.fan1213@gmail.com> wrote:
> > On Mon, Oct 5, 2020 at 9:27 PM Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> wrote:
> >> What happens when we
> >> execute plans with values that have estimates similar to the generic
> >> plan later when we moderate generic plan costs based on the custom
> >> plans?
> >>
> >
> > The example at the beginning of this thread,  I used the exact same values
> > every time, the custom plan will be chosen all the time, which is bad,
> > The main reason is the custom plan knows the exact value in Execute
> > message, so it run plan time partition prune, then the total cost is low, however
> > for the generic plan the partition prune happens at Runtime initial_partition prune
> > stage,  so the cost of the partitions which can be pruned at that stage is still
> > included the total cost,  so generic plans can't be chosen. that would be the
> > thing I want to fix.
>
> Something is wrong in the generic plan costing then.

That's right.

> IIUC, the
> selectivity estimate for only a single partition should come out >= 1.
> For all the other partitions, it should be 1 and that too because we
> clamp the row counts.

If the planner knows the exact parameter value, it can estimate the other
partitions at 1 row (actually 0 rows, since they are pruned at plan time).
If it doesn't know the value, the estimate per partition is rows / ndistinct
for the "partkey = $1" case.
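
A quick sketch of that arithmetic against the bar example below (assuming the usual
1/ndistinct selectivity for an equality against an unknown parameter):

-- Each bar_N partition holds about 100000 rows with ~100000 distinct values of a,
-- so the estimate for "a = $1" is 100000 * (1/100000) = 1 row per partition, and
-- the generic Append is costed as ten one-row index scans.
select relname, reltuples from pg_class where relname = 'bar_1';
select tablename, attname, n_distinct
  from pg_stats
 where tablename = 'bar_1' and attname = 'a';   -- -1, i.e. all values distinct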
 
So the final costs for generic and custom plans
> shouldn't be far off unless there's large deviation in the selectivity
> of a partition key. I am assuming that there's an equality condition
> on a partition key. That's what I meant by the paragraph below.
> >
> >> If the table has good distribution of a partition key, which also
> >> results in good distribution of data across partitions, generic plan
> >> cost will be similar to the custom plan costs. If not that's something
> >> we want to fix.
>
> Can you please investigate on these lines?

The planner currently costs a generic plan assuming that *all*
partitions will be scanned, which is not the case, because run-time
pruning will ensure that scan nodes of all but one partition will be
discarded, assuming a simple query like `select * from parted_table
where a = $1`.  With N partitions, a generic plan for such a query is
N times as expensive as a custom plan, so will never win.

For example:

create table bar (a int, b int, c int) partition by range (a);
create table bar_1 partition of bar for values from (1) to (100001);
create table bar_2 partition of bar for values from (100001) to (200001);
create table bar_3 partition of bar for values from (200001) to (300001);
create table bar_4 partition of bar for values from (300001) to (400001);
create table bar_5 partition of bar for values from (400001) to (500001);
create table bar_6 partition of bar for values from (500001) to (600001);
create table bar_7 partition of bar for values from (600001) to (700001);
create table bar_8 partition of bar for values from (700001) to (800001);
create table bar_9 partition of bar for values from (800001) to (900001);
create table bar_10 partition of bar for values from (900001) to (1000001);
insert into bar select generate_series(1, 1000000);
create index on bar (a);
analyze bar;

set plan_cache_mode to force_custom_plan ;
prepare q as select * from bar where a = $1;
explain execute q (1);
                                  QUERY PLAN
------------------------------------------------------------------------------
 Index Scan using bar_1_a_idx on bar_1 bar  (cost=0.29..8.31 rows=1 width=12)
   Index Cond: (a = 1)
(2 rows)

deallocate q;
set plan_cache_mode to force_generic_plan ;
prepare q as select * from bar where a = $1;
explain execute q (1);
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Append  (cost=0.29..83.15 rows=10 width=12)
   Subplans Removed: 9
   ->  Index Scan using bar_1_a_idx on bar_1  (cost=0.29..8.31 rows=1 width=12)
         Index Cond: (a = $1)
(4 rows)

As you can see, the cost of the generic plan is 83.15, which is roughly 8.31 *
10, even though only one partition is actually scanned, so the actual
cost is about the same as the custom plan's.  The planner fails to consider that
those 9 subplans will be removed during execution.

--
Amit Langote
EDB: http://www.enterprisedb.com

I also posted some replies on the original run-time partition prune thread [1].
I am very sorry for splitting this topic across two threads.  I have some new
understanding of this (which might be old ideas for others), so I'd like to
continue the discussion in that thread and add you to the cc list there.

Thanks!