Re: BUG #6278: Index scans on '>' condition on field with many NULLS

Поиск
Список
Период
Сортировка
От Robert Haas
Тема Re: BUG #6278: Index scans on '>' condition on field with many NULLS
Дата
Msg-id CA+TgmoaF9uU0BGgLsje8Mz0AdgJ1=1Znyqbq3xewvz-S0dEHYg@mail.gmail.com
обсуждение исходный текст
Ответ на BUG #6278: Index scans on '>' condition on field with many NULLS  ("Maksym Boguk" <maxim.boguk@gmail.com>)
Ответы Re: BUG #6278: Index scans on '>' condition on field with many NULLS  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
On Sun, Oct 30, 2011 at 11:39 PM, Maksym Boguk <maxim.boguk@gmail.com> wrot=
e:
> However very selective index scan on '>' condition can work pretty
> inefficient on column with many nulls.
> (in the same time '<' work well).
>
> Seems index scan on '>' condition going through all nulls in index.
>
> Test case (tested on 8.4 and 9.0 with same effect):
>
> postgres=3D# =A0CREATE table test as select (case when random()>0.1 then =
NULL
> else random() end) as value from generate_series(1,10000000);
> SELECT 10000000
> postgres=3D# CREATE INDEX test_value_key on test(value);
> CREATE INDEX
> postgres=3D# SELECT count(*) from test;
> =A0count
> ----------
> =A010000000
> (1 row)
>
> postgres=3D# VACUUM ANALYZE test;
> VACUUM
>
> postgres=3D# EXPLAIN ANALYZE select * from test where value>0.9999;
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0QUERY PLAN
> -------------------------------------------------------------------------=
---
> -----------------------------------------------
> =A0Index Scan using test_value_key on test =A0(cost=3D0.00..13.78 rows=3D=
105
> width=3D8) (actual time=3D0.010..155.318 rows=3D91 loops=3D1)
> =A0 Index Cond: (value > 0.9999::double precision)
> =A0Total runtime: 155.346 ms
> (3 rows)
>
> Oops... 160ms to return 90 rows from memory.
>
> In the same time 100 rows from other index side:
>
> postgres=3D# EXPLAIN ANALYZE select * from test where value<0.0001;
> =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =
=A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0 =A0QUERY PLAN
> -------------------------------------------------------------------------=
---
> ----------------------------------------------
> =A0Index Scan using test_value_key on test =A0(cost=3D0.00..15.69 rows=3D=
120
> width=3D8) (actual time=3D0.006..0.158 rows=3D103 loops=3D1)
> =A0 Index Cond: (value < 0.0001::double precision)
> =A0Total runtime: 0.175 ms
> (3 rows)
>
> That is good result (1000 faster then other way around).
>
> For sure that can be fixed via create index with NOT NULL predicated. But
> may be that problem worth small investigation.
>
> Seems index scan cannot stop after finding first NULL during scan on '>'
> condition, and doing scan through all 90% nulls in table.

I can reproduce this.  I'm not sure whether it's a bug either, but it
sure seems less than ideal.  I suppose the problem is that we are
generating an index scan that starts at 0.9999 and runs through the
end of the index, rather than stopping when it hits the first NULL.
Not sure how much work it would be to make that happen, but I guess
we'd need a second branch to the index condition to stop the scan,
just as we already do for:

EXPLAIN (ANALYZE, BUFFERS) select * from test where value>0.9993 and
value <0.9999;

--=20
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

В списке pgsql-bugs по дате отправления:

Предыдущее
От: Robert Haas
Дата:
Сообщение: Re: BUG #6274: documentation on pg_attribute.atttypmod
Следующее
От: "Laurian Vostinar"
Дата:
Сообщение: BUG #6279: quoting needed for column name with non ascii chars