Re: BUG #13824: EXISTS sometimes uses seq scan instead of index
От | Kevin Grittner |
---|---|
Тема | Re: BUG #13824: EXISTS sometimes uses seq scan instead of index |
Дата | |
Msg-id | CACjxUsNp4uEx3xsunw4wVpBDVomas7o6hnv_49bSbaz-HAVdyA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: BUG #13824: EXISTS sometimes uses seq scan instead of index (Grzegorz Garlewicz <grzegorz@thulium.pl>) |
Список | pgsql-bugs |
On Mon, Dec 21, 2015 at 9:06 AM, Grzegorz Garlewicz <grzegorz@thulium.pl> wrote: > Could you please take a look at this one once again? > > On Fri, Dec 18, 2015 at 10:23 AM, Grzegorz Garlewicz <grzegorz@thulium.pl> wrote: >> >> I did just what you said - reduced random_page cost from 4 to 2 >> then 1 and then 0.5. It never really makes sense to set random_page_cost less than seq_page_cost; although setting them to equal values can make sense if your data is heavily cached. >> If I'm not mistaken, the issue seems to originate from the >> planner's thinking it needs to look up all the rows for EXISTS >> clause, not just a single one, so it thinks the cost would be >> much bigger. No, it does not think that. It knows that an EXISTS test can stop after finding a single matching row. What it does think is that values are fairly evenly distributed (i.e., if a value is 1% of a table you will only need to read about 100 rows before seeing one, rather than finding them all clumped at the end of the table) and that there is no correlation between values (i.e., any given id_status value is not more common for one id_outbound value than another). There is work in progress to try to allow for correlated values, and ideas on how to deal with uneven distribution. One thing that might help, in addition to reducing random_page_cost to be equal to or just above seq_page_cost, is to increase cpu_tuple_cost to something like 0.03 to 0.05. Benchmarking with real-world applications I was running showed better plans chosen with numbers in that range than with lower numbers. Of course, if the statistics are not up-to-date it doesn't have much chance of using accurate numbers and is likely to choose a bad plan. Make sure that autovacuum is tuned to be aggressive enough, and if there are big changes to a table you may want to use an explicit ANALYZE (or VACUUM ANALYZE) before running queries which reference data in the table. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
В списке pgsql-bugs по дате отправления: