Re: Query Optimization

Поиск
Список
Период
Сортировка
От Tom Lane
Тема Re: Query Optimization
Дата
Msg-id 11053.1243438443@sss.pgh.pa.us
обсуждение исходный текст
Ответ на Re: Query Optimization  (Sean Davis <sdavis2@mail.nih.gov>)
Ответы Re: Query Optimization  (Zach Calvert <zachcalvert@hemerasoftware.com>)
Список pgsql-novice
Sean Davis <sdavis2@mail.nih.gov> writes:
>> zachcalvert@hemerasoftware.com> wrote:
>>> I have a query and I have run
>>> explain analyze
>>> select count(*)
>>> from score
>>> where leaderboardid=35 and score <= 6841 and active
>>>
>>> The result is
>>> "Aggregate  (cost=2491.06..2491.07 rows=1 width=0) (actual
>>> time=38.878..38.878 rows=1 loops=1)"
>>> "  ->  Seq Scan on score  (cost=0.00..2391.17 rows=39954 width=0)
>>> (actual time=0.012..30.760 rows=38571 loops=1)"
>>> "        Filter: (active AND (score <= 6841) AND (leaderboardid = 35))"
>>> "Total runtime: 38.937 ms"
>>>
>>> I have an index on score, I have an index on score, leaderboardid, and
>>> active and still it does a sequential scan.

> Postgresql is aware of the "cost" associated with each query.  In the case
> of a small table with an index that is not very discriminative, it may
> choose a sequential scan.  However, as you add more rows, the index scan may
> become more effective and may be used instead.  One thing to keep in mind is
> that an index scan is NOT always faster than a sequential scan.

A crude rule of thumb is that you need the query to fetch less than
ten percent of the rows before a bitmap scan is going to be a win,
and less than one percent before a plain indexscan is going to be a win.
(If your database is entirely cached in memory then the crossover
percentages are higher, and you need to adjust the planner's cost
parameters so that it gets this right.)  It's not clear exactly how
big this table is, but I'm betting the query is fetching more than
ten percent of it.

One point worth making is that if this is the typical set of conditions
in your queries, then the best index would be one on (leaderboardid,
score) not (score, leaderboardid, active).  (I'm betting that the
condition active = true is so nonselective it's not worth keeping it in
the index at all.)  You want equality conditions on the leading
column(s) and inequalities on the trailing columns.  To see why this is,
think about the index sort ordering and the portion of the index that
the query will have to scan.  In the latter case the set of index
entries matching this query is a contiguous group; in the former, not.

Our fine manual has a reasonable amount of detail about proper index
design:
http://www.postgresql.org/docs/8.3/static/indexes.html

            regards, tom lane

В списке pgsql-novice по дате отправления:

Предыдущее
От: Joshua Tolley
Дата:
Сообщение: Re: [GENERAL] Postgres registry access using java
Следующее
От: Thomas Kellerer
Дата:
Сообщение: Re: Tool for modeling