Re: ORDER BY, LIMIT and indexes

Поиск
Список
Период
Сортировка
От Claudio Freire
Тема Re: ORDER BY, LIMIT and indexes
Дата
Msg-id CAGTBQpYnm9WYQ=3F2VhL1ZsTXeQQg2ujQbxeMnSgeLhkpV67SQ@mail.gmail.com
обсуждение исходный текст
Ответ на Re: ORDER BY, LIMIT and indexes  (Mark Kirkwood <mark.kirkwood@catalyst.net.nz>)
Ответы Re: ORDER BY, LIMIT and indexes  (Claudio Freire <klaussfreire@gmail.com>)
Список pgsql-performance
On Tue, Aug 6, 2013 at 7:56 PM, Mark Kirkwood
<mark.kirkwood@catalyst.net.nz> wrote:
> Hmm - I wonder if the lack or ORDER BY is part of the problem here. Consider
> a similar query on pgbench_accounts:
>
> bench=# explain analyze select aid from pgbench_accounts where aid > 100000
> limit 20;
> QUERY PLAN
>
-----------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.00..0.91 rows=20 width=4) (actual time=0.005..0.464 rows=20
> loops=1)
>    ->  Seq Scan on pgbench_accounts (cost=0.00..499187.31 rows=10994846
> width=4) (actual time=0.005..0.463 rows=20 loops=1)
>          Filter: (aid > 100000)
>  Total runtime: 0.474 ms
> (4 rows)
>
> bench=# explain analyze select aid from pgbench_accounts where aid >
> 10000000 limit 20;
> QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.00..2.25 rows=20 width=4) (actual time=0.014..0.018 rows=20
> loops=1)
>    ->  Index Scan using pgbench_accounts_pkey on pgbench_accounts
> (cost=0.00..207204.06 rows=1844004 width=4) (actual time=0.014..0.017
> rows=20 loops=1)
>          Index Cond: (aid > 10000000)
>  Total runtime: 0.030 ms
> (4 rows)
>
>
> So at some point you get index scans. Now add an ORDER BY:
>
> bench=# explain analyze select aid from pgbench_accounts where aid > 100000
> order by aid limit 20;
> QUERY PLAN
>
>
----------------------------------------------------------------------------------------------------------------------------------------------------------
> --
>  Limit  (cost=0.00..2.25 rows=20 width=4) (actual time=0.008..0.012 rows=20
> loops=1)
>    ->  Index Scan using pgbench_accounts_pkey on pgbench_accounts
> (cost=0.00..1235355.34 rows=10994846 width=4) (actual time=0.008..0.011
> rows=20 loops=1
> )
>          Index Cond: (aid > 100000)
>  Total runtime: 0.023 ms
> (4 rows)
>
> bench=# explain analyze select aid from pgbench_accounts where aid >
> 10000000 order by aid limit 20;
> QUERY PLAN
>
----------------------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.00..2.25 rows=20 width=4) (actual time=0.014..0.018 rows=20
> loops=1)
>    ->  Index Scan using pgbench_accounts_pkey on pgbench_accounts
> (cost=0.00..207204.06 rows=1844004 width=4) (actual time=0.014..0.016
> rows=20 loops=1)
>          Index Cond: (aid > 10000000)
>  Total runtime: 0.029 ms
> (4 rows)
>
>
> ...and we have index scans for both cases.
>
> Cheers
>
> Mark

Yes, but those index scans decisions are driven by the wrong factor.
In the last two cases, the need for rows to be ordered. In the second
case, the estimated number of tuples in the scan.

In both cases, that's not the driving factor for the right decision.
The driving factor *should* be startup cost, which is nonzero because
there is a filter being applied to that sequential scan that filters
many of the initial tuples. With a nonzero startup cost, the cost of
the limited plan would be "startup cost + scan cost * scanned
fraction". When scanned fraction is low enough, startup cost dominates
the equation.

With a min/max index, a cheap query to that index could estimate at
least a lower bound to that initial zero-rows-output cost. With b-tree
indexes, not so much (the b-tree would have to be traversed until the
first filter-passing tuple, which could be a while).


В списке pgsql-performance по дате отправления:

Предыдущее
От: Mark Kirkwood
Дата:
Сообщение: Re: ORDER BY, LIMIT and indexes
Следующее
От: Claudio Freire
Дата:
Сообщение: Re: ORDER BY, LIMIT and indexes