Обсуждение: strange plan with bitmap heap scan and multiple partial indexes

Поиск
Список
Период
Сортировка

strange plan with bitmap heap scan and multiple partial indexes

От
Tomas Vondra
Дата:
Hi,

While working on the "IOS with partial indexes" patch, I've noticed a 
bit strange plan. It's unrelated to that particular patch (reproducible 
on master), so I'm starting a new thread for it.

To reproduce it, all you have to do is this (on a new cluster, all 
settings on default):
  CREATE TABLE t AS SELECT i AS a, i AS b                      FROM generate_series(1,10000000) s(i);
  CREATE INDEX idx001 ON t (a) where b < 100;  CREATE INDEX idx002 ON t (a) where b < 200;  CREATE INDEX idx003 ON t
(a)where b < 300;
 
  ANALYZE t;
  EXPLAIN SELECT a FROM t WHERE b < 100;
                        QUERY PLAN
-------------------------------------------------------------------- Bitmap Heap Scan on t  (cost=9.01..13.02 rows=1000
width=4)  Recheck Cond: ((b < 300) AND (b < 200))   Filter: (b < 100)   ->  BitmapAnd  (cost=9.01..9.01 rows=1 width=0)
       ->  Bitmap Index Scan on idx003                            (cost=0.00..4.13 rows=1000 width=0)         ->
BitmapIndex Scan on idx002                            (cost=0.00..4.13 rows=1000 width=0)
 

Now, that's strange IMHO. There's a perfectly matching partial index, 
with exactly the same predicate (b<100), but we instead choose the two 
other indexes, and combine them using BitmapAnd. That seems a bit 
strange - choosing one of them over the perfectly matching one would be 
strange too, but why use two and combine them?

Another thing is that this gets fixed by a simple VACUUM on the table.
  EXPLAIN SELECT a FROM t WHERE b < 100;
                             QUERY PLAN
-------------------------------------------------------------------- Index Scan using idx001 on t  (cost=0.14..29.14
rows=1000width=4)
 


Any idea what's going on here? FWIW all this was on 51d0fe5d (July 23).

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: strange plan with bitmap heap scan and multiple partial indexes

От
Andres Freund
Дата:
On 2015-07-11 14:31:25 +0200, Tomas Vondra wrote:
> While working on the "IOS with partial indexes" patch, I've noticed a bit
> strange plan. It's unrelated to that particular patch (reproducible on
> master), so I'm starting a new thread for it.
>
> To reproduce it, all you have to do is this (on a new cluster, all settings
> on default):
>
>   CREATE TABLE t AS SELECT i AS a, i AS b
>                       FROM generate_series(1,10000000) s(i);
>
>   CREATE INDEX idx001 ON t (a) where b < 100;
>   CREATE INDEX idx002 ON t (a) where b < 200;
>   CREATE INDEX idx003 ON t (a) where b < 300;
>
>   ANALYZE t;
>
>   EXPLAIN SELECT a FROM t WHERE b < 100;

>                         QUERY PLAN

It's indeed interesting. Running
ANALYZE t;EXPLAIN SELECT a FROM t WHERE b < 100;
repeatedly switches back and forth between the plans.

Note that
Bitmap Heap Scan on t  (cost=9.01..13.02 rows=1000 width=4)
is actually costed significantly cheaper than
Index Scan using idx001 on t  (cost=0.15..30.32 rows=1000 width=4)
which means yet another wierd thing is that it's not consistently
choosing that plan.


When the index scan plan is chosen you interestingly get the bitmapscan
plan, but at a slightly higher cost:
postgres[32046][1]=# EXPLAIN SELECT a FROM t WHERE b < 100;
┌────────────────────────────────────────────────────────────────────┐
│                             QUERY PLAN                             │
├────────────────────────────────────────────────────────────────────┤
│ Index Scan using idx001 on t  (cost=0.15..30.32 rows=1000 width=4) │
└────────────────────────────────────────────────────────────────────┘
(1 row)

Time: 0.460 ms
postgres[32046][1]=# SET enable_indexscan = false;
SET
Time: 0.119 ms
postgres[32046][1]=# EXPLAIN SELECT a FROM t WHERE b < 100;
┌──────────────────────────────────────────────────────────────────────────────
│                                  QUERY PLAN
│
├──────────────────────────────────────────────────────────────────────────────
│ Bitmap Heap Scan on t  (cost=27.05..31.06 rows=1000 width=4)
│   Recheck Cond: ((b < 300) AND (b < 200))
│   Filter: (b < 100)
│   ->  BitmapAnd  (cost=27.05..27.05 rows=1 width=0)
│         ->  Bitmap Index Scan on idx003  (cost=0.00..13.15 rows=1000 width=0)
│         ->  Bitmap Index Scan on idx002  (cost=0.00..13.15 rows=1000 width=0)
└──────────────────────────────────────────────────────────────────────────────


Looking at the stats is interesting:

postgres[32046][1]=# ANALYZE t;SELECT relpages, reltuples, relallvisible FROM pg_class WHERE relname IN ('t', 'idx003',
'idx002','idx001');EXPLAIN SELECT a FROM t WHERE b < 100;
 
ANALYZE
Time: 123.469 ms
┌──────────┬───────────┬───────────────┐
│ relpages │ reltuples │ relallvisible │
├──────────┼───────────┼───────────────┤
│    44248 │     1e+07 │         44248 │
│        2 │         0 │             0 │
│        2 │       667 │             0 │
│        2 │       667 │             0 │
└──────────┴───────────┴───────────────┘
(4 rows)

Time: 0.405 ms
┌────────────────────────────────────────────────────────────────────┐
│                             QUERY PLAN                             │
├────────────────────────────────────────────────────────────────────┤
│ Index Scan using idx001 on t  (cost=0.12..24.63 rows=1000 width=4) │
└────────────────────────────────────────────────────────────────────┘
(1 row)

Time: 0.269 ms
postgres[32046][1]=# ANALYZE t;SELECT relpages, reltuples, relallvisible FROM pg_class WHERE relname IN ('t', 'idx003',
'idx002','idx001');EXPLAIN SELECT a FROM t WHERE b < 100;
 
ANALYZE
Time: 124.430 ms
┌──────────┬───────────┬───────────────┐
│ relpages │ reltuples │ relallvisible │
├──────────┼───────────┼───────────────┤
│    44248 │     1e+07 │         44248 │
│        2 │         0 │             0 │
│        2 │         0 │             0 │
│        2 │         0 │             0 │
└──────────┴───────────┴───────────────┘
(4 rows)

Time: 0.272 ms
┌──────────────────────────────────────────────────────────────────────────────┐
│                                  QUERY PLAN                                  │
├──────────────────────────────────────────────────────────────────────────────┤
│ Bitmap Heap Scan on t  (cost=9.01..13.02 rows=1000 width=4)                  │
│   Recheck Cond: ((b < 300) AND (b < 200))                                    │
│   Filter: (b < 100)                                                          │
│   ->  BitmapAnd  (cost=9.01..9.01 rows=1 width=0)                            │
│         ->  Bitmap Index Scan on idx003  (cost=0.00..4.13 rows=1000 width=0) │
│         ->  Bitmap Index Scan on idx002  (cost=0.00..4.13 rows=1000 width=0) │
└──────────────────────────────────────────────────────────────────────────────┘
(6 rows)

Time: 0.275 ms
postgres[32046][1]=# ANALYZE t;SELECT relpages, reltuples, relallvisible FROM pg_class WHERE relname IN ('t', 'idx003',
'idx002','idx001');EXPLAIN SELECT a FROM t WHERE b < 100;
 
ANALYZE
Time: 112.592 ms
┌──────────┬─────────────┬───────────────┐
│ relpages │  reltuples  │ relallvisible │
├──────────┼─────────────┼───────────────┤
│    44248 │ 9.99998e+06 │         44248 │
│        2 │           0 │             0 │
│        2 │         334 │             0 │
│        2 │         334 │             0 │
└──────────┴─────────────┴───────────────┘
(4 rows)

Time: 0.330 ms
┌────────────────────────────────────────────────────────────────────┐
│                             QUERY PLAN                             │
├────────────────────────────────────────────────────────────────────┤
│ Index Scan using idx001 on t  (cost=0.12..24.63 rows=1000 width=4) │
└────────────────────────────────────────────────────────────────────┘
(1 row)

Time: 0.320 ms

So this seems to be stats related.



Re: strange plan with bitmap heap scan and multiple partial indexes

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2015-07-11 14:31:25 +0200, Tomas Vondra wrote:
>> While working on the "IOS with partial indexes" patch, I've noticed a bit
>> strange plan. It's unrelated to that particular patch (reproducible on
>> master), so I'm starting a new thread for it.

> It's indeed interesting. Running
> ANALYZE t;EXPLAIN SELECT a FROM t WHERE b < 100;
> repeatedly switches back and forth between the plans.

The issue basically is that ANALYZE is putting quasi-random numbers into
the reltuples estimates for the indexes.  Things seem to be consistently
sane after a VACUUM:

regression=# vacuum t;
VACUUM
regression=# select relname,relpages,reltuples from pg_class where relname in ( 't', 'idx001', 'idx002',
'idx003');relname| relpages |  reltuples  
 
---------+----------+-------------t       |    44248 | 9.99998e+06idx001  |        2 |          99idx002  |        2 |
      199idx003  |        2 |         299
 
(4 rows)

but not so much after ANALYZE:

regression=# analyze t;
ANALYZE
regression=# select relname,relpages,reltuples from pg_class where relname in ( 't', 'idx001', 'idx002',
'idx003');relname| relpages | reltuples 
 
---------+----------+-----------t       |    44248 |     1e+07idx001  |        2 |         0idx002  |        2 |
0idx003  |        2 |         0
 
(4 rows)

I've also seen states like
relname | relpages |  reltuples  
---------+----------+-------------t       |    44248 | 9.99998e+06idx001  |        2 |           0idx002  |        2 |
      334idx003  |        2 |         334
 
(4 rows)

Presumably, this is happening because the numbers of rows actually
satisfying the index predicates are so small that it's a matter of luck
whether any of them are included in ANALYZE's sample.

Given this bad data for the index sizes, it's not totally surprising that
choose_bitmap_and() does something wacko.  I'm not sure whether we should
try to make it smarter, or write this off as "garbage in, garbage out".

Another idea is to not trust any individual ANALYZE's estimate of the
index rowcount so completely.  (I'd thought that the moving-average logic
would get applied to that, but it doesn't seem to be kicking in for some
reason.)

We could probably make this smarter if we were willing to apply the
predicate-proof machinery in more situations; in this example, once we know
that idx001 is applicable, we really should disregard idx002 and idx003
altogether because their predicates are implied by idx001's.  I've always
been hesitant to do that because the cost of checking seemed likely to
greatly outweigh the benefits.  But since Tomas is nosing around in this
territory already, maybe he'd like to investigate that further.
        regards, tom lane



Re: strange plan with bitmap heap scan and multiple partial indexes

От
Tomas Vondra
Дата:
On 07/11/2015 06:32 PM, Tom Lane wrote:
...
>
> Presumably, this is happening because the numbers of rows actually
> satisfying the index predicates are so small that it's a matter of
> luck whether any of them are included in ANALYZE's sample.
>
> Given this bad data for the index sizes, it's not totally surprising
> that choose_bitmap_and() does something wacko. I'm not sure whether
> we should try to make it smarter, or write this off as "garbage in,
> garbage out".

I think we should make it smarter, if possible - while this example is 
somewhat artificial, partial indexes are often used exactly like this, 
i.e. to index only very small subset of data. A good example may be an 
index on "active invoices", i.e. invoices that were yet sorted out. 
There may be a lot of invoices in the table, but only very small 
fraction of them will be active (and thus in the index).

So I don't think is an artificial problem, and we should not write it 
off as "garbage in".

> Another idea is to not trust any individual ANALYZE's estimate of
> the index rowcount so completely. (I'd thought that the
> moving-average logic would get applied to that, but it doesn't seem
> to be kicking in for some reason.)
>
> We could probably make this smarter if we were willing to apply the
> predicate-proof machinery in more situations; in this example, once
> we know that idx001 is applicable, we really should disregard idx002
> and idx003 altogether because their predicates are implied by
> idx001's. I've always been hesitant to do that because the cost of
> checking seemed likely to greatly outweigh the benefits. But since
> Tomas is nosing around in this territory already, maybe he'd like to
>  investigate that further.

I think there are two possible approaches in general - we may improve 
the statistics somehow, or we may start doing the predicate proofing.

I doubt approaching this at the statistics level alone is sufficient, 
because even with statistics target 10k (i.e. the most detailed one), 
the sample is still fixed-size. So there will always exist a combination 
of a sufficiently large data set and selective partial index, causing 
trouble with the sampling.

Moreover, I can't really think of a way to fix this at the statistics 
level. Maybe there's a clever trick guarding against this particular 
issue, but my personal experience is that whenever I used such a smart 
hack, it eventually caused strange issues elsewhere.

So I think the predicate proofing is a better approach, but of course 
the planning cost may be an issue. But maybe we can make this cheaper by 
some clever tricks? For example, given two predicates A and B, it seems 
that if A => B, then selectivity(A) <= selectivity(B). Could we use this 
to skip some of the expensive stuff? We should have the selectivities 
anyway, no?

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: strange plan with bitmap heap scan and multiple partial indexes

От
Tom Lane
Дата:
Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
> So I think the predicate proofing is a better approach, but of course 
> the planning cost may be an issue. But maybe we can make this cheaper by 
> some clever tricks? For example, given two predicates A and B, it seems 
> that if A => B, then selectivity(A) <= selectivity(B). Could we use this 
> to skip some of the expensive stuff? We should have the selectivities 
> anyway, no?

We do.  The existing logic in choose_bitmap_and essentially uses the
selectivity as a heuristic to indicate which partial indexes might have
predicates that imply another index's predicate.  The expectation is that
the former would have selectivity strictly smaller than the latter,
causing the former to be processed first, and then the existing rules
about what indexes can be "added onto" a potential AND combination will
do the trick.

The reason this fails in your example is that if the two indexes have
exactly identical selectivities (due to identical reltuples values),
there's no certainty what order they get sorted in, and the adding-on
rules don't catch the case where the new index would actually imply
the old one rather than vice versa.

Conceivably, we could fix this at relatively small cost in the normal case
by considering predicate proof rules in the sort comparator, and only if
the estimated selectivities are identical.  Sure seems like a kluge
though, and I remain unconvinced that it's really a case that arises that
much in the real world.  The short description of the set of indexes you
showed is "redundantly silly".  Useful sets of indexes would not likely
all have indistinguishably small selectivities.

Perhaps a less klugy answer is to tweak the adding-on rules some more,
but I haven't thought about exactly how.
        regards, tom lane



Re: strange plan with bitmap heap scan and multiple partial indexes

От
Tomas Vondra
Дата:
Hi,

On 07/11/2015 11:40 PM, Tom Lane wrote:
> Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:
>> So I think the predicate proofing is a better approach, but of
>> course the planning cost may be an issue. But maybe we can make
>> this cheaper by some clever tricks? For example, given two
>> predicates A and B, it seems that if A => B, then selectivity(A) <=
>> selectivity(B). Could we use this to skip some of the expensive
>> stuff? We should have the selectivities anyway, no?
>
> We do. The existing logic in choose_bitmap_and essentially uses the
> selectivity as a heuristic to indicate which partial indexes might
> have predicates that imply another index's predicate. The expectation
> is that the former would have selectivity strictly smaller than the
> latter, causing the former to be processed first, and then the
> existing rules about what indexes can be "added onto" a potential AND
> combination will do the trick.
>
> The reason this fails in your example is that if the two indexes
> have exactly identical selectivities (due to identical reltuples
> values), there's no certainty what order they get sorted in, and the
> adding-on rules don't catch the case where the new index would
> actually imply the old one rather than vice versa.

Ah, OK. Thanks for the explanation.

> Conceivably, we could fix this at relatively small cost in the normal
> case by considering predicate proof rules in the sort comparator, and
> only if the estimated selectivities are identical.

Yep, that's what basically I had in mind.

> Sure seems like a kluge though, and I remain unconvinced that it's
> really a case that arises that much in the real world. The short
> description of the set of indexes you showed is "redundantly silly".
> Useful sets of indexes would not likely all have indistinguishably
> small selectivities.

Fair point - the example really is artificial, and was constructed to 
demonstrate planning costs of the extended predicate-proofing for 
partial indexes. And while in reality small partial indexes are quite 
common, they probably use predicates tailored for individual queries 
(and not the nested predicates as in the example).

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services