Re: [HACKERS] Costs: Index vs Non-Index
От | Tom Lane |
---|---|
Тема | Re: [HACKERS] Costs: Index vs Non-Index |
Дата | |
Msg-id | 6112.947348116@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Costs: Index vs Non-Index (The Hermit Hacker <scrappy@hub.org>) |
Ответы |
Re: [HACKERS] Costs: Index vs Non-Index
|
Список | pgsql-hackers |
The Hermit Hacker <scrappy@hub.org> writes: > Hrmmm...if I'm reading this right, its more costly to create an index then > to leave it as a sequential scan, but it returns more rows? Yet, it > returns, if I do the query with a count() around the return value, 288 > rows, not 334 or 1154... This doesn't have anything to do with index vs sequential scan, but it does have to do with whether you've done a VACUUM ANALYZE lately. You haven't ;-) > udmsearch=> explain select next_index_time from url where next_index_time <= 947317073; > NOTICE: QUERY PLAN: > Seq Scan on url (cost=43.00 rows=334 width=4) IIRC, rows=334 is the default estimate of result rows you will get for this query in the absence of any information whatever. (Default table size guess is 1000 rows, and default selectivity guess for <= is 1/3, so...) If you have not vacuumed, it's sheer coincidence that this is even within hailing distance of the correct figure of 288. > udmsearch=> create index url_next_index_time on url using btree ( next_index_time); > CREATE > udmsearch=> explain select next_index_time from url where next_index_time <= 947317073; > NOTICE: QUERY PLAN: > Index Scan using url_next_index_time on url (cost=271.68 rows=1154 width=4) I believe that a side-effect of CREATE INDEX is to update the number-of-pages-and-rows statistics in pg_class for the target table. So after you do that, the optimizer has a correct idea of the table's size, but still no more info about the selectivity of the WHERE clause. (I infer that your table has size 1154*3 rows.) If you now drop the index and repeat EXPLAIN, it'll go back to a seq scan, but it will now say 1154 rows --- and the cost estimate will be higher, too. If you do VACUUM ANALYZE, then the optimizer will also know the min and max values of next_index_time, and will have some shot at making a correct estimate of the output row count. I'd be interested to know what it predicts then... regards, tom lane
В списке pgsql-hackers по дате отправления: