Re: Planner estimates cost of 'like' a lot lower than '='??
| From | Tom Lane |
|---|---|
| Subject | Re: Planner estimates cost of 'like' a lot lower than '='?? |
| Date | |
| Msg-id | 8764.995840089@sss.pgh.pa.us |
| In reply to | Planner estimates cost of 'like' a lot lower than '='?? (Mats Lofkvist <mal@algonet.se>) |
| Responses | Re: Planner estimates cost of 'like' a lot lower than '='?? |
| List | pgsql-general |
Mats Lofkvist <mal@algonet.se> writes:
> EXPLAIN
> testdb=> select count(*) from data where value >= 'test' and value < 'tesu';
>  count
> -------
>  10000
> (1 row)
>
> testdb=> explain select count(*) from data where value >= 'test' and value < 'tesu';
> NOTICE:  QUERY PLAN:
> Aggregate  (cost=4.46..4.46 rows=1 width=0)
>   ->  Index Scan using datavalueindex on data  (cost=0.00..4.45 rows=1 width=0)

Sure enough, this is the source of the speed problem: when using the index on "value", each iteration of the find1 scan will be indexscanning 10000 tuples to find the single one that passes the LIKE and other qualifications. But the planner mistakenly thinks that the indexscan will find only one tuple, and so it has no reason to prefer the other index over this one.

(Unfortunately, the planner is too stupid to realize that the other index *guarantees* to return no more than one tuple for this query, and hence should be preferred over a mere statistical estimate of one selected tuple. Not sure how we could incorporate such a consideration into what's fundamentally a cost-estimate-driven process.)

I think that current sources will probably do a lot better on the range estimation problem. I'll be interested to see what you get from these same tests when you have the data loaded into current...

			regards, tom lane
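For readers reproducing the estimate-vs-reality mismatch described above, a minimal sketch follows. The table, column, and index names come from the quoted examples; VACUUM ANALYZE is added here only as an assumed way to refresh the statistics the row estimate is based on, not something prescribed in the thread.

    -- Refresh the statistics the planner's row estimate draws on
    VACUUM ANALYZE data;

    -- Actual number of rows in the range (10000 in the quoted run)
    SELECT count(*) FROM data WHERE value >= 'test' AND value < 'tesu';

    -- Planner's estimate for the same range; the "rows=1" in the quoted
    -- plan is the misestimate that lets datavalueindex win on cost
    EXPLAIN SELECT count(*) FROM data WHERE value >= 'test' AND value < 'tesu';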