Slight change in query leads to unexpected change in query plan
От | Jack Orenstein |
---|---|
Тема | Slight change in query leads to unexpected change in query plan |
Дата | |
Msg-id | 4A3FFDD0.8020809@hds.com обсуждение исходный текст |
Ответы |
Re: Slight change in query leads to unexpected change in query plan
|
Список | pgsql-general |
I have a table that looks like this: create table T(pk int not null, value bytea, ..., primary key(pk)) I want to scan the table in batches of 100. I'll do this by issuing a sequence of queries like this: select * from T where pk > ? and value = ? order by pk limit 100 After each query, I'll record the last value seen and use that to drive the next query. The obvious (to me) execution plan is to use the index, do an index scan, and then filter using the restriction on value. I have some realistic data (loaded into postgres 8.3.7) and I've run analyze. I'm not getting a very good execution plan: ris=# explain ris-# select * ris-# from T ris-# where pk > 1000000000 ris-# and value = 'asdf'::bytea ris-# order by pk ris-# limit 100; QUERY PLAN --------------------------------------------------------------------------------------------- Limit (cost=78352.20..78352.24 rows=16 width=451) -> Sort (cost=78352.20..78352.24 rows=16 width=451) Sort Key: pk -> Bitmap Heap Scan on t (cost=2091.60..78351.88 rows=16 width=451) Recheck Cond: (pk > 1000000000) Filter: (value = 'asdf'::bytea) -> Bitmap Index Scan on t_pkey (cost=0.00..2091.60 rows=91088 width=0) Index Cond: (pk > 1000000000) But if I remove the value restriction, I get the plan I was hoping for: ris=# explain ris-# select * ris-# from T ris-# where pk > 1000000000 ris-# order by pk ris-# limit 100; QUERY PLAN -------------------------------------------------------------------------------------------------- Limit (cost=0.00..324.99 rows=100 width=451) -> Index Scan using t_pkey on t (cost=0.00..296027.98 rows=91088 width=451) Index Cond: (pk > 1000000000) (3 rows) Why is this? This is an obvious rewrite, e.g. select * from (select * from T where pk > ? order by pk limit 100) x where value = ? and this produces a good query plan. But this means that fewer than 100 rows are returned. For reasons too boring to go into, that would be very inconvenient for my application. Why does adding the value restriction so radically change the execution plan? Jack Orenstein
В списке pgsql-general по дате отправления: