Re: BUG #13824: EXISTS sometimes uses seq scan instead of index

Поиск

Список

Период

Сортировка

От	Kevin Grittner
Тема	Re: BUG #13824: EXISTS sometimes uses seq scan instead of index
Дата	22 декабря 2015 г. 15:49:10
Msg-id	CACjxUsNp4uEx3xsunw4wVpBDVomas7o6hnv_49bSbaz-HAVdyA@mail.gmail.com обсуждение исходный текст
Ответ на	Re: BUG #13824: EXISTS sometimes uses seq scan instead of index (Grzegorz Garlewicz <grzegorz@thulium.pl>)
Список	pgsql-bugs

Дерево обсуждения

On Mon, Dec 21, 2015 at 9:06 AM, Grzegorz Garlewicz <grzegorz@thulium.pl> wrote:
> Could you please take a look at this one once again?
>
> On Fri, Dec 18, 2015 at 10:23 AM, Grzegorz Garlewicz <grzegorz@thulium.pl> wrote:
>>
>> I did just what you said - reduced random_page cost from 4 to 2
>> then 1 and then 0.5.

It never really makes sense to set random_page_cost less than
seq_page_cost; although setting them to equal values can make sense
if your data is heavily cached.

>> If I'm not mistaken, the issue seems to originate from the
>> planner's thinking it needs to look up all the rows for EXISTS
>> clause, not just a single one, so it thinks the cost would be
>> much bigger.

No, it does not think that.  It knows that an EXISTS test can stop
after finding a single matching row.

What it does think is that values are fairly evenly distributed
(i.e., if a value is 1% of a table you will only need to read about
100 rows before seeing one, rather than finding them all clumped at
the end of the table) and that there is no correlation between
values (i.e., any given id_status value is not more common for one
id_outbound value than another).  There is work in progress to try
to allow for correlated values, and ideas on how to deal with
uneven distribution.

One thing that might help, in addition to reducing random_page_cost
to be equal to or just above seq_page_cost, is to increase
cpu_tuple_cost to something like 0.03 to 0.05.  Benchmarking with
real-world applications I was running showed better plans chosen
with numbers in that range than with lower numbers.

Of course, if the statistics are not up-to-date it doesn't have
much chance of using accurate numbers and is likely to choose a bad
plan.  Make sure that autovacuum is tuned to be aggressive enough,
and if there are big changes to a table you may want to use an
explicit ANALYZE (or VACUUM ANALYZE) before running queries which
reference data in the table.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

В списке pgsql-bugs по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: BUG #13824: EXISTS sometimes uses seq scan instead of index