Re: Bad query plan inside EXISTS clause
От | Benoit Delbosc |
---|---|
Тема | Re: Bad query plan inside EXISTS clause |
Дата | |
Msg-id | 4B97CFFA.6030000@nuxeo.com обсуждение исходный текст |
Ответ на | Re: Bad query plan inside EXISTS clause (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-performance |
Tom Lane a écrit : > Benoit Delbosc <bdelbosc@nuxeo.com> writes: >> I am trying to understand why inside an EXISTS clause the query planner >> does not use the index: > > I'm not sure this plan is as bad as all that. The key point is that the > planner is expecting 52517 rows that match that users_md5 value (and the > true number is evidently 51446, so that estimate isn't far off). That's > about 1/48th of the table. It knows that the EXISTS case can stop as > soon as it finds one match, so it's betting that a plain seqscan will > hit a match faster than an index lookup would be able to, ie, > seqscanning about 48 tuples is faster than one index lookup. This might > be a bad bet if the users_md5 values are correlated with physical order, > ie the matches are not randomly scattered but are all towards the end of > the table. exact, the data is not randomly scattered but ordered this explains why in my case seq scan is a bad bet Barring that, though, it could be a good bet if the table > isn't swapped in. Which is what the default cost parameters are set > up to assume. there are lots of shared buffers and effective memory on this instance, the query is executed many times I can assume that the table isn't swapped in right ? > I suspect your real complaint is that you expect the table to be swapped > in, in which case what you ought to be doing is adjusting the planner's > cost parameters. Some playing around here with a similar case suggests > that even a small reduction in random_page_cost would make it prefer an > indexscan for this type of situation. excellent ! Changing the random_page_cost from 4 to 2 do the trick SET random_page_cost = 2; EXPLAIN ANALYZE SELECT 1 WHERE EXISTS (SELECT 1 FROM read_acls_cache WHERE users_md5 = '9bc9012eb29c0bb2ae3cc7b5e78c2acf'); QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------- Result (cost=1.06..1.07 rows=1 width=0) (actual time=0.048..0.048 rows=1 loops=1) One-Time Filter: $0 InitPlan 1 (returns $0) -> Index Scan using read_acls_cache_users_md5_idx on read_acls_cache (cost=0.00..55664.21 rows=52517 width=0) (actual time=0.045..0.045 rows=1 loops=1) Index Cond: ((users_md5)::text = '9bc9012eb29c0bb2ae3cc7b5e78c2acf'::text) Total runtime: 0.087 ms (6 rows) 466/0.087 = 5360 thanks ! kind regards ben
В списке pgsql-performance по дате отправления: