Обсуждение: query planning and partitioned tables
Hi: I have a 440 million row table that I'm attempting to partition. The table is named 'pressure' and holds pressure data from an undersea sensor measuring wave height. The sensor reports every half second. The partitions are named 'p0', 'p1' ... 'pN' and are divvied according to measurement date. See the following: Table "public.pressure" Column | Type | Modifiers ------------------+-----------------------------+--------------------------------------------------------------- pressure_id | integer | not null default nextval(('pressure_id_seq'::text)::regclass) row_entry_date | timestamp with time zone | sensor_id | integer | not null measurement_date | timestamp without time zone | pressure | double precision | Indexes: "pressure_pkey" PRIMARY KEY, btree (pressure_id) "pressure_measurement_date_index" btree (measurement_date) Foreign-key constraints: "sensor_id_exists" FOREIGN KEY (sensor_id) REFERENCES sensor(sensor_id) ON DELETE RESTRICT Tablespace: "diskvol2" Table "public.p0" Column | Type | Modifiers ------------------+-----------------------------+--------------------------------------------------------------- pressure_id | integer | not null default nextval(('pressure_id_seq'::text)::regclass) row_entry_date | timestamp with time zone | sensor_id | integer | not null measurement_date | timestamp without time zone | pressure | double precision | Indexes: "p0_pressure_id_index" UNIQUE, btree (pressure_id), tablespace "diskvol2" "p0_measurement_date_index" btree (measurement_date), tablespace "diskvol2" Check constraints: "p0_measurement_date_check" CHECK (measurement_date >= '1999-08-02 00:00:00'::timestamp without time zone AND measurement_date < '2000-01-06 00:00:00'::timestamp without time zone) Foreign-key constraints: "sensor_id_exists" FOREIGN KEY (sensor_id) REFERENCES sensor(sensor_id) ON DELETE RESTRICT Inherits: pressure Tablespace: "diskvol2" My question involves how the database performs the following queries: explain select max(measurement_date) from pressure; QUERY PLAN ----------------------------------------------------------------------------------- Aggregate (cost=480311.50..480311.51 rows=1 width=8) -> Append (cost=0.00..425345.20 rows=21986520 width=8) -> Seq Scan on pressure (cost=0.00..22.30 rows=1230 width=8) -> Seq Scan on p0 pressure (cost=0.00..425322.90 rows=21985290 width=8) (4 rows) explain select max(measurement_date) from p0; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------- Result (cost=0.02..0.03 rows=1 width=0) InitPlan -> Limit (cost=0.00..0.02 rows=1 width=8) -> Index Scan Backward using p0_measurement_date_index on p0 (cost=0.00..531221.19 rows=21985290 width=8) Filter: (measurement_date IS NOT NULL) (5 rows) The optimizer doesn't do as well with the former as the latter. Is that the expected behavior or have I screwed things up? Also, consider the following: explain select * from pressure where measurement_date < '2000-01-01'; QUERY PLAN -------------------------------------------------------------------------------------------------------------- Result (cost=0.00..480293.25 rows=21092480 width=32) -> Append (cost=0.00..480293.25 rows=21092480 width=32) -> Index Scan using pressure_measurement_date_index on pressure (cost=0.00..7.13 rows=69 width=32) Index Cond: (measurement_date < '2000-01-01 00:00:00'::timestamp without time zone) -> Seq Scan on p0 pressure (cost=0.00..480286.12 rows=21092411 width=32) Filter: (measurement_date < '2000-01-01 00:00:00'::timestamp without time zone) (6 rows) explain select * from p0 where measurement_date < '2000-01-01'; QUERY PLAN ----------------------------------------------------------------------------------- Seq Scan on p0 (cost=0.00..480286.12 rows=21092411 width=32) Filter: (measurement_date < '2000-01-01 00:00:00'::timestamp without time zone) (2 rows) The query planner seems to be ignoring p0's index on measurement_date. Contrast that to the plan made for a similar query made on a different table (unpartitioned) in the same database: explain select * from wind where measurement_date < '2000-01-01'; QUERY PLAN ------------------------------------------------------------------------------------------------ Bitmap Heap Scan on wind (cost=474.17..28855.07 rows=75192 width=116) Recheck Cond: (measurement_date < '2000-01-01 00:00:00'::timestamp without time zone) -> Bitmap Index Scan on wind_measurement_date_index (cost=0.00..474.17 rows=75192 width=0) Index Cond: (measurement_date < '2000-01-01 00:00:00'::timestamp without time zone) (4 rows) Now, that looks like a nice plan (I guess). My question: Are my partitions constructed in such a way that prevents the planner from picking better plans? Or is that the way things stand right now? As always, thanks for your help.
On Thu, 2005-11-24 at 11:12 -0500, Colton Smith wrote: > My question involves how the database performs the following queries: > > explain select max(measurement_date) from pressure; > QUERY > PLAN > ----------------------------------------------------------------------------------- > Aggregate (cost=480311.50..480311.51 rows=1 width=8) > -> Append (cost=0.00..425345.20 rows=21986520 width=8) > -> Seq Scan on pressure (cost=0.00..22.30 rows=1230 width=8) > -> Seq Scan on p0 pressure (cost=0.00..425322.90 > rows=21985290 width=8) > (4 rows) > > explain select max(measurement_date) from p0; > QUERY > PLAN > ----------------------------------------------------------------------------------------------------------------------- > Result (cost=0.02..0.03 rows=1 width=0) > InitPlan > -> Limit (cost=0.00..0.02 rows=1 width=8) > -> Index Scan Backward using p0_measurement_date_index on > p0 (cost=0.00..531221.19 rows=21985290 width=8) > Filter: (measurement_date IS NOT NULL) > (5 rows) > > The optimizer doesn't do as well with the former as the latter. Is that > the expected behavior or have I screwed things up? The optimization for the latter query was new in 8.1, as was the optimization for constraint_exclusion. We didn't manage to meet in the middle and make both optimizations work simultaneously, yet. > Also, consider the following: > > explain select * from pressure where measurement_date < '2000-01-01'; > QUERY > PLAN > -------------------------------------------------------------------------------------------------------------- > Result (cost=0.00..480293.25 rows=21092480 width=32) > -> Append (cost=0.00..480293.25 rows=21092480 width=32) > -> Index Scan using pressure_measurement_date_index on > pressure (cost=0.00..7.13 rows=69 width=32) > Index Cond: (measurement_date < '2000-01-01 > 00:00:00'::timestamp without time zone) > -> Seq Scan on p0 pressure (cost=0.00..480286.12 > rows=21092411 width=32) > Filter: (measurement_date < '2000-01-01 > 00:00:00'::timestamp without time zone) > (6 rows) > > explain select * from p0 where measurement_date < '2000-01-01'; > QUERY > PLAN > ----------------------------------------------------------------------------------- > Seq Scan on p0 (cost=0.00..480286.12 rows=21092411 width=32) > Filter: (measurement_date < '2000-01-01 00:00:00'::timestamp without > time zone) > (2 rows) > > The query planner seems to be ignoring p0's index on measurement_date. No. It's ignoring pressure's index, which it should cos you didn't access the pressure table in the second query. > Contrast that to the plan made for a similar query made on a different > table (unpartitioned) in the same database: > > explain select * from wind where measurement_date < '2000-01-01'; > QUERY > PLAN > ------------------------------------------------------------------------------------------------ > Bitmap Heap Scan on wind (cost=474.17..28855.07 rows=75192 width=116) > Recheck Cond: (measurement_date < '2000-01-01 00:00:00'::timestamp > without time zone) > -> Bitmap Index Scan on wind_measurement_date_index > (cost=0.00..474.17 rows=75192 width=0) > Index Cond: (measurement_date < '2000-01-01 > 00:00:00'::timestamp without time zone) > (4 rows) > > Now, that looks like a nice plan (I guess). > > My question: Are my partitions constructed in such a way that prevents > the planner from picking better plans? Or is that > the way things stand right now? No and No. Partitions can use indexes, but the same rules apply as normal - an index will only be used if appropriate. Just cos it works on Wind, doesn't mean it'll work on Pressure. Hope that helps. Good luck with your experiments. These questions belong on the PERFORM list... You should also be looking at EXPLAIN ANALYZE output, which would show up some of the non-differences in the plan fairly well. Best Regards, Simon Riggs