Postgres optimizer choosing wrong index
От | Jack Orenstein |
---|---|
Тема | Postgres optimizer choosing wrong index |
Дата | |
Msg-id | 49009AD6.9020902@hds.com обсуждение исходный текст |
Ответы |
Re: Postgres optimizer choosing wrong index
|
Список | pgsql-general |
I'm using postgres 7.4 and having a problem with the query optimizer. Our table, T, looks like this: dh int fh int nm int ... -- other columns A typical row is 400-500 bytes. T has two indexes, idx_df on (dh, fh) and idx_dn on (dh, nm). My query is select * from T where dh = ? and fh = ? If I run EXPLAIN on this query, (plugging in values 1 and 2 for the variables), before VACUUM ANALYZE, I get the desired execution plan: Index Scan using idx_df on T (cost=0.00..4.83 rows=1 width=454) Index Cond: ((dh = 1) AND (fh = 2)) But after VACUUM ANALYZE: Index Scan using idx_dn on T (cost=0.00..5.27 rows=1 width=561) Index Cond: (dh = 1) Filter: (fh = 2) Notice that postgres is now using the other index. This behavior is somewhat dependent on the values plugged in. I ran a query to count dh values: select dir_hash, count(*) from external_file group by dir_hash; dh | count ------------+-------- 916645488 | 20000 1057692240 | 200000 And if I use 1057692240 in the EXPLAIN, I get the desired plan: Index Scan using idx_df on external_file (cost=0.00..5.27 rows=1 width=561) Index Cond: ((dir_hash = 1057692240) AND (fn_hash = 2)) I've tried playing with various cost settings (e.g. random_page_cost), but have been unable to influence the optimizer's behavior. Rewriting the query as ... where (dh, fh) = (?, ?) doesn't help. So a few questions: - Why would the optimizer ever choose idx_dn over idx_df given that idx_df has to be more selective? - Is there any way to force the use of idx_df? Jack Orenstein P.S. Yes, I know, 7.4. We're upgrading to 8.3 but we have this problem right now.
В списке pgsql-general по дате отправления: