Обсуждение: Bug #681: creating indices after data insertion

Поиск
Список
Период
Сортировка

Bug #681: creating indices after data insertion

От
pgsql-bugs@postgresql.org
Дата:
Markus Grill (mg@schwarz.de) reports a bug with a severity of 2
The lower the number the more severe it is.

Short Description
creating indices after data insertion

Long Description
Hello,

in my opinion there is a big bug with indexes.
I used Postgresql 7.1.3 and 7.2.1

First I create two tables t_a and t_b with the fields t_a.id, t_a.bid and t_b.id where t_a.bid is a foreign key for
t_b.id.
Then I insert many data records in the two tables.
After that I create one index i1 on t_a.id, one index i2 on t_a.bid and one index i3 on t_b.id.

When I select over just one of the two tables (EXPLAIN SELECT * FROM t_a WHERE t_a.id = 'XXX';) then the query plan
(andthe query) uses index scans. 

But when I select over the two tables with a join (EXPLAIN SELECT * FROM t_a, t_b WHERE t_aid = 'XXX' AND t_a_bid =
t_b_id;)then the query plan uses slow seq scans (That's not OK)! 

We tried another thing:
First creating the tables,
then creating the indexes
and then inserting the data records:
In this case the query uses index scans (that's OK)!

We tried a third thing:
Creating the tables, creating the indexes, inserting the data records:
---> index scans (OK).
Then dropping the indexes and creating three new indexes:
---> seq scans (not OK).

Please do not say, I should create my indexes first, because this IS a bug.
That's a big problem when inserting many data records, because it's much faster to drop the index first, then inserting
thedata and then creating new indexes. 

Please help! :-)

Sample Code
My real life examle:
The first example is the bug the second is the running one:

testrpr=# EXPLAIN SELECT * from inf_information, auct_sys_infogruppe where inf_alias = 'asdfasdf' and infgrp_id =
auct_sys_infogruppe.inf_id;
NOTICE:  QUERY PLAN:

Hash Join  (cost=5.95..8.86 rows=1 width=1070)
  ->  Seq Scan on auct_sys_infogruppe  (cost=0.00..2.61 rows=61 width=555)
  ->  Hash  (cost=5.94..5.94 rows=1 width=515)
        ->  Index Scan using xak1inf_information on inf_information  (cost=0.00..5.94 rows=1 width=515)

NOTICE:  QUERY PLAN:

Hash Join  (cost=5.95..8.86 rows=1 width=1070)
  ->  Seq Scan on auct_sys_infogruppe  (cost=0.00..2.61 rows=61 width=555)
  ->  Hash  (cost=5.94..5.94 rows=1 width=515)
        ->  Index Scan using xak1inf_information on inf_information  (cost=0.00..5.94 rows=1 width=515)

EXPLAIN

------------------------------------------

testrpr=# EXPLAIN SELECT * from inf_information, auct_sys_infogruppe where inf_alias = 'asdfasdf' and infgrp_id =
auct_sys_infogruppe.inf_id;
NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..9.66 rows=1 width=1070)
  ->  Index Scan using xak1inf_information on inf_information  (cost=0.00..4.82 rows=1 width=515)
  ->  Index Scan using xpkauct_sys_infogruppe on auct_sys_infogruppe  (cost=0.00..4.82 rows=1 width=555)

NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..9.66 rows=1 width=1070)
  ->  Index Scan using xak1inf_information on inf_information  (cost=0.00..4.82 rows=1 width=515)
  ->  Index Scan using xpkauct_sys_infogruppe on auct_sys_infogruppe  (cost=0.00..4.82 rows=1 width=555)

EXPLAIN

No file was uploaded with this report

Re: Bug #681: creating indices after data insertion

От
Dmitry Tkach
Дата:
Did you vacuum it?

I hope, it helps...

Dima
pgsql-bugs@postgresql.org wrote:

>Markus Grill (mg@schwarz.de) reports a bug with a severity of 2
>The lower the number the more severe it is.
>
>Short Description
>creating indices after data insertion
>
>Long Description
>Hello,
>
>in my opinion there is a big bug with indexes.
>I used Postgresql 7.1.3 and 7.2.1
>
>First I create two tables t_a and t_b with the fields t_a.id, t_a.bid and t_b.id where t_a.bid is a foreign key for
t_b.id.
>Then I insert many data records in the two tables.
>After that I create one index i1 on t_a.id, one index i2 on t_a.bid and one index i3 on t_b.id.
>
>When I select over just one of the two tables (EXPLAIN SELECT * FROM t_a WHERE t_a.id = 'XXX';) then the query plan
(andthe query) uses index scans. 
>
>But when I select over the two tables with a join (EXPLAIN SELECT * FROM t_a, t_b WHERE t_aid = 'XXX' AND t_a_bid =
t_b_id;)then the query plan uses slow seq scans (That's not OK)! 
>
>We tried another thing:
>First creating the tables,
>then creating the indexes
>and then inserting the data records:
>In this case the query uses index scans (that's OK)!
>
>We tried a third thing:
>Creating the tables, creating the indexes, inserting the data records:
>---> index scans (OK).
>Then dropping the indexes and creating three new indexes:
>---> seq scans (not OK).
>
>Please do not say, I should create my indexes first, because this IS a bug.
>That's a big problem when inserting many data records, because it's much faster to drop the index first, then
insertingthe data and then creating new indexes. 
>
>Please help! :-)
>
>Sample Code
>My real life examle:
>The first example is the bug the second is the running one:
>
>testrpr=# EXPLAIN SELECT * from inf_information, auct_sys_infogruppe where inf_alias = 'asdfasdf' and infgrp_id =
auct_sys_infogruppe.inf_id;
>NOTICE:  QUERY PLAN:
>
>Hash Join  (cost=5.95..8.86 rows=1 width=1070)
>  ->  Seq Scan on auct_sys_infogruppe  (cost=0.00..2.61 rows=61 width=555)
>  ->  Hash  (cost=5.94..5.94 rows=1 width=515)
>        ->  Index Scan using xak1inf_information on inf_information  (cost=0.00..5.94 rows=1 width=515)
>
>NOTICE:  QUERY PLAN:
>
>Hash Join  (cost=5.95..8.86 rows=1 width=1070)
>  ->  Seq Scan on auct_sys_infogruppe  (cost=0.00..2.61 rows=61 width=555)
>  ->  Hash  (cost=5.94..5.94 rows=1 width=515)
>        ->  Index Scan using xak1inf_information on inf_information  (cost=0.00..5.94 rows=1 width=515)
>
>EXPLAIN
>
>------------------------------------------
>
>testrpr=# EXPLAIN SELECT * from inf_information, auct_sys_infogruppe where inf_alias = 'asdfasdf' and infgrp_id =
auct_sys_infogruppe.inf_id;
>NOTICE:  QUERY PLAN:
>
>Nested Loop  (cost=0.00..9.66 rows=1 width=1070)
>  ->  Index Scan using xak1inf_information on inf_information  (cost=0.00..4.82 rows=1 width=515)
>  ->  Index Scan using xpkauct_sys_infogruppe on auct_sys_infogruppe  (cost=0.00..4.82 rows=1 width=555)
>
>NOTICE:  QUERY PLAN:
>
>Nested Loop  (cost=0.00..9.66 rows=1 width=1070)
>  ->  Index Scan using xak1inf_information on inf_information  (cost=0.00..4.82 rows=1 width=515)
>  ->  Index Scan using xpkauct_sys_infogruppe on auct_sys_infogruppe  (cost=0.00..4.82 rows=1 width=555)
>
>EXPLAIN
>
>No file was uploaded with this report
>
>
>---------------------------(end of broadcast)---------------------------
>TIP 2: you can get off all lists at once with the unregister command
>    (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
>

Re: Bug #681: creating indices after data insertion

От
Tom Lane
Дата:
Dmitry Tkach <dmitry@openratings.com> writes:
> Did you vacuum it?

Even more to the point: vacuum analyze.

The described behavior sounds like the default assumptions in the
absence of any ANALYZE statistics ...

            regards, tom lane