Index question
От | Troy D |
---|---|
Тема | Index question |
Дата | |
Msg-id | 3D386642.9000501@yahoo.ca обсуждение исходный текст |
Список | pgadmin-support |
Hi, I am running pg7.1 on SuSE 7.2. I have a table with three varchar fields making up the primary key, plus four data fields. create table "foo"( "pk1" varchar(7) not null, "pk2" varchar(7) not null, "pk3" varchar(7) not null, "data1" float8 default 0 not null, "data2" float8 default 0 not null, "data3" float8 default 0 not null, "data4" float8 default 0 not null constraint "foo_pkey" primary key ("pk1", "pk2", "pk3") ); This automatically creates an index (so says pgadmin) create unique index foo_pkey on foo using btree ( pk1 varchar_ops, pk2 varchar_ops, pk3 varchar_ops ); On this table, I run an insert/update routine that reads from table bar and inserts or updates table foo. I have used php to read in (about 2000) records from table bar and then insert or update table foo, then print the record number (of result from bar) and the time. I was getting about one transaction (each bar record requires [4 selects, an insert and an update]*64) every two to four seconds, getting obviously progressively slower with each batch. So, I did something stupid(?) - I created an index: create index foo_index2 on foo using btree ( pk1 varchar_ops, pk2 varchar_ops, pk3 varchar_ops ); and now I get better than one transaction per second, with minor degradation from one batch to the next. Why? The second index is hardly different than the first. (This is the opposite of what Josh Berkus recommended doing in 'The Joy of Indexing'. ) Did the index not get created in the first place? Or did the stats get updated when I created the second index? Other info: The empty table foo was vacuum'd before the first batch insert. Not vacuum analyze'd. The index was created halfway through the first batch - the time stamp shows the interval change. I wiped the table (delete from foo), removed the index and did it again, hence, repeatable at least twice. Before the index was created, an 'explain' on the insert statement showed a seq scan on the table... Insert into foo select 'pk1value', 'pk2value', 'pk3value', 0, 0, 0, 0 where (select count(*) from foo where pk1 = 'pk1value' and pk2 = 'pk2value' and pk3 = 'pk3value') = 0 i.e don't insert if it's already there. After the index was created, an 'explain' on the insert statement showed an index scan on the table using foo_index2. Version 0.01 of the query took three and a half days, but applying Bruce Momjian's optimizing techniques (PostgreSQL Hardware Performance Tuning) and other optimizing techniques got it down to three hours. Adding the rogue index dropped it to an hour. Any other suggestions would be greatly appreciated. Thanks in advance, Troy ______________________________________________________________________ Post your ad for free now! http://personals.yahoo.caa
В списке pgadmin-support по дате отправления: