Обсуждение: BUG #19007: Planner fails to choose partial index with spurious 'not null'

Поиск
Список
Период
Сортировка

BUG #19007: Planner fails to choose partial index with spurious 'not null'

От
PG Bug reporting form
Дата:
The following bug has been logged on the website:

Bug reference:      19007
Logged by:          Bryan Fox
Email address:      bryfox@gmail.com
PostgreSQL version: 17.5
Operating system:   Linux; macOS
Description:

I'm not sure if this is a bug, but the behavior was unexpected to me and
changed since v16. Documentation doesn't mention this as far as I can see.

This example has a partial index over one column where another column is not
null. The latter column is in fact 'not null' in the schema.

Prior to v17, this index would be used; in v17, the planner will choose a
sequential scan instead.

Of course, this setup is a little silly and easy to remedy. In reality, we
had a more complicated index and the column was nullable; later, someone
made a column 'not null'; later, we upgraded to v17. `last_idx_scan` did
make this easier to spot, though.


-- setup
create table example (id int, value float not null, flag bool not null);

insert into example
    select generate_series(1, 100_000) id, random() value, true flag;

create index new_idx on example using btree (value) where flag is not null;

-- query
explain analyze
select * from example where value < 0.1 and flag is not null;


v17 plan:
                                                 QUERY PLAN
------------------------------------------------------------------------------------------------------------
 Seq Scan on example  (cost=0.00..1887.00 rows=33333 width=13) (actual
time=0.010..5.816 rows=9951 loops=1)
   Filter: (value < '0.1'::double precision)
   Rows Removed by Filter: 90049


v16 plan:
                                                       QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on example  (cost=625.34..1676.92 rows=33167 width=13)
(actual time=1.023..2.710 rows=9979 loops=1)
   Recheck Cond: ((value < '0.1'::double precision) AND (flag IS NOT NULL))
   Heap Blocks: exact=637
   ->  Bitmap Index Scan on new_idx  (cost=0.00..617.04 rows=33167 width=0)
(actual time=0.936..0.937 rows=9979 loops=1)
         Index Cond: (value < '0.1'::double precision)


Re: BUG #19007: Planner fails to choose partial index with spurious 'not null'

От
Tom Lane
Дата:
PG Bug reporting form <noreply@postgresql.org> writes:
> I'm not sure if this is a bug, but the behavior was unexpected to me and
> changed since v16. Documentation doesn't mention this as far as I can see.
> This example has a partial index over one column where another column is not
> null. The latter column is in fact 'not null' in the schema.
> Prior to v17, this index would be used; in v17, the planner will choose a
> sequential scan instead.

Yeah.  What is happening is that the query's "flag is not null"
condition is thrown away as redundant fairly early in planning,
and then it's not available to match to the index's predicate,
so the index doesn't appear to be applicable.

I had thought that e2debb643 would fix this in HEAD, since it moved
the throw-it-away behavior into eval_const_expressions which is also
applied to index predicates.  Testing shows not, which feels like a
bug.  It is not okay that eval_const_expressions' behavior varies
depending on context; we need it to act the same on index
expressions/predicates as on query expressions.  Maybe we should
give up on trying to pre-apply eval_const_expressions to the
relcache's copies of the index expressions, and just do that part
during plancat's collection of the info?

(I'm also quite displeased that e2debb643 didn't nuke
restriction_is_always_true/false.  But that's a separate discussion.)

            regards, tom lane



Re: BUG #19007: Planner fails to choose partial index with spurious 'not null'

От
David Rowley
Дата:
On Mon, 4 Aug 2025 at 05:19, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I had thought that e2debb643 would fix this in HEAD, since it moved
> the throw-it-away behavior into eval_const_expressions which is also
> applied to index predicates.  Testing shows not, which feels like a
> bug.  It is not okay that eval_const_expressions' behavior varies
> depending on context; we need it to act the same on index
> expressions/predicates as on query expressions.  Maybe we should
> give up on trying to pre-apply eval_const_expressions to the
> relcache's copies of the index expressions, and just do that part
> during plancat's collection of the info?

I guess it doesn't because in RelationGetIndexPredicate(), the call to
eval_const_expressions() passes NULL for the PlannerInfo, resulting in
the NOT NULL information not being available during
eval_const_expressions().

Looks like we keep the indpred varnos as 1 during
RelationGetIndexPredicate() and only change them to the actual varno
of the relation after that call inside get_relation_info(), so it
wouldn't be valid to pass the PlannerInfo down without first rewriting
the varnos and it doesn't seem correct to rewrite the varnos in the
relcache code.

Seems like what it would take to make this work is *another* call to
eval_const_expressions() inside get_relation_info() just after the
varnos have been changed. The attached seems to work. However, it
seems a bit annoying to have to call eval_const_expressions() all over
again for such a rare case. I think it only really covers this rather
silly case, which Bryan seems well aware that the fix is to make the
index non-partial again.

David

Вложения

Re: BUG #19007: Planner fails to choose partial index with spurious 'not null'

От
Richard Guo
Дата:
On Mon, Aug 4, 2025 at 1:29 PM David Rowley <dgrowleyml@gmail.com> wrote:
> I guess it doesn't because in RelationGetIndexPredicate(), the call to
> eval_const_expressions() passes NULL for the PlannerInfo, resulting in
> the NOT NULL information not being available during
> eval_const_expressions().

Exactly.  The comment of eval_const_expressions also notes that when
"root" is NULL, NullTest quals will not be reduced.

 * NOTE: "root" can be passed as NULL if the caller never wants to do any
 * Param substitutions nor receive info about inlined functions nor reduce
 * NullTest for Vars to constant true or constant false.

> Looks like we keep the indpred varnos as 1 during
> RelationGetIndexPredicate() and only change them to the actual varno
> of the relation after that call inside get_relation_info(), so it
> wouldn't be valid to pass the PlannerInfo down without first rewriting
> the varnos and it doesn't seem correct to rewrite the varnos in the
> relcache code.

Yeah.  And in addition, the fact that we cache the pg_index.indpred in
the relcache entry means that we cannot rewrite the varnos within
RelationGetIndexPredicate().

> Seems like what it would take to make this work is *another* call to
> eval_const_expressions() inside get_relation_info() just after the
> varnos have been changed.

I wonder if we could move const-simplification and canonicalization of
index predicates out of RelationGetIndexPredicate() and into its
callers, after the varnos have been updated.  That would require
changing every caller, though.


BTW, I wonder if we should also be concerned about index expressions,
in cases like:

create table t (a int, b int not null);
create index on t ((b is not null));
explain select * from t where b is not null;

Thanks
Richard



Re: BUG #19007: Planner fails to choose partial index with spurious 'not null'

От
Richard Guo
Дата:
On Mon, Aug 4, 2025 at 2:19 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> (I'm also quite displeased that e2debb643 didn't nuke
> restriction_is_always_true/false.  But that's a separate discussion.)

Hmm, I tried to do that in e2debb643, but failed, as the commit
message explains:

    Note that this patch does not get rid of restriction_is_always_true
    and restriction_is_always_false.  Removing them would prevent us from
    reducing some IS [NOT] NULL quals that we were previously able to
    reduce, because (a) the self-join elimination may introduce new IS NOT
    NULL quals after constant folding, and (b) if some outer joins are
    converted to inner joins, previously irreducible NullTest quals may
    become reducible.

Thanks
Richard



Re: BUG #19007: Planner fails to choose partial index with spurious 'not null'

От
Richard Guo
Дата:
I've had some time to look at this issue again.  There are several
cases in plancat.c where specific expressions need to be run through
eval_const_expressions because the planner will be comparing them to
similarly-processed qual clauses.  This includes constraint
expressions, statistics expressions, index expressions and index
predicates -- as highlighted by this bug report.

For a constraint expression, we run it through eval_const_expressions
before fixing its Vars to have the correct varno.  This is problematic
because eval_const_expressions reduces NullTest quals based on varno,
so we must ensure the Vars have the correct varnos before calling it.
I think we can fix it by reordering the calls to eval_const_expressions
and ChangeVarNodes in get_relation_constraints().

For statistics expressions, we have the same issue as with constraint
expressions, plus we don't pass a valid root to eval_const_expressions.
I think we can fix this by adding "root" as a new parameter to
get_relation_statistics() and passing it to eval_const_expressions.

For index expressions and index predicates, however, the situation is
more complex, because we cache the constant-simplified expressions in
the relcache entry.  This means we can't fix the Vars before applying
eval_const_expressions to them.  One possible solution is to run
eval_const_expressions twice: first with "root" set to NULL, before
fixing the Vars, and then again with a valid "root" after the Vars
have been fixed.  Please see the attached draft patch.

(This is basically what David's proposed patch does.  However, I don't
think we should limit ourselves to the case where info->indpred
contains only a single reducible NullTest qual, since there may be
more than one.)

However, running eval_const_expressions twice for index expressions and
index predicates looks ugly, and one could argue that it might have a
performance impact.

BTW, all of these solutions are only possible after e2debb643, which
is only in master.  In v17 and v18, we don't have a way to reduce
NullTest quals for these expressions, since that reduction doesn't
occur during const-simplification.  I don't know what we should do for
v17 and v18.

Thoughts?

Thanks
Richard

Вложения

Re: BUG #19007: Planner fails to choose partial index with spurious 'not null'

От
Richard Guo
Дата:
On Wed, Aug 20, 2025 at 6:37 PM Richard Guo <guofenglinux@gmail.com> wrote:
> I've had some time to look at this issue again.  There are several
> cases in plancat.c where specific expressions need to be run through
> eval_const_expressions because the planner will be comparing them to
> similarly-processed qual clauses.  This includes constraint
> expressions, statistics expressions, index expressions and index
> predicates -- as highlighted by this bug report.

I've split the patch into two.  0001 fixes const-simplification for
constraint expressions and statistics expressions by ensuring that
Vars are updated to have the correct varno before
const-simplification, and that a valid root is passed to
eval_const_expressions.  0002 fixes const-simplification for index
expressions and predicate by running eval_const_expressions a second
time after the Vars have been fixed and with a valid root.

I think 0001 is in pretty good shape.  I'm concerned that in 0002 we
have to run eval_const_expressions twice.  I once considered caching
the untransformed index expressions and predicate, and running
const-simplification outside of relcache.c.  However, that doesn't
seem like a better solution, since we would need to re-run
const-simplification every time we fetch the index expressions and
predicate.  Any suggestions?

Thanks
Richard

Вложения

Re: BUG #19007: Planner fails to choose partial index with spurious 'not null'

От
Richard Guo
Дата:
On Fri, Aug 22, 2025 at 4:25 PM Richard Guo <guofenglinux@gmail.com> wrote:
> I think 0001 is in pretty good shape.  I'm concerned that in 0002 we
> have to run eval_const_expressions twice.  I once considered caching
> the untransformed index expressions and predicate, and running
> const-simplification outside of relcache.c.  However, that doesn't
> seem like a better solution, since we would need to re-run
> const-simplification every time we fetch the index expressions and
> predicate.  Any suggestions?

0001 has been pushed; here is a rebase of 0002.

I think the additional call to eval_const_expressions() is acceptable
performance-wise, because 1) it only runs when index expressions or
predicates are present, and 2) it's relatively cheap when run on small
expression trees, which is typically the case for index expressions
and predicates.  In return, in cases such as the reported, it enables
the planner to match and use partial indexes, which can be a big win
in execution.

- Richard

Вложения

Re: BUG #19007: Planner fails to choose partial index with spurious 'not null'

От
Richard Guo
Дата:
On Mon, Sep 1, 2025 at 3:00 PM Richard Guo <guofenglinux@gmail.com> wrote:
> 0001 has been pushed; here is a rebase of 0002.
>
> I think the additional call to eval_const_expressions() is acceptable
> performance-wise, because 1) it only runs when index expressions or
> predicates are present, and 2) it's relatively cheap when run on small
> expression trees, which is typically the case for index expressions
> and predicates.  In return, in cases such as the reported, it enables
> the planner to match and use partial indexes, which can be a big win
> in execution.

Here is an updated version of the patch.  In the commit message, I've
explained why I think the additional call to eval_const_expressions is
not a performance concern.  I've also added a test case to validate
that NullTest quals in index expressions can now be reduced.

One concern I have is that this fix can only be applied to master;
there's no way to backpatch it to v17 or v18.  This means that the
issue won't exist in v16 or earlier, nor in v19 and later, but it will
persist in v17 and v18.  I have no idea what we should do for v17 and
v18.

- Richard

Вложения