GROUP BY with reasonable timings in PLAN but unreasonable execution time
От | Clem Dickey |
---|---|
Тема | GROUP BY with reasonable timings in PLAN but unreasonable execution time |
Дата | |
Msg-id | iv0h49$1var$1@news.hub.org обсуждение исходный текст |
Ответы |
Re: GROUP BY with reasonable timings in PLAN but unreasonable execution
time
|
Список | pgsql-performance |
I have a query which seems to be taking an extraordinarily long time (many minutes, at least) when seemingly equivalent queries have different plans and execute in seconds. naturally, I'd like to know why. Version is Postgresql 8.4.8. The table, "t", is Column | Type | Modifiers --------+---------+----------- y | integer | not null x | integer | not null k | integer | not null j | integer | not null z | integer | not null Indexes: "t_pkey" PRIMARY KEY, btree (j, k, x, y, z) The table population, in pseudocode, is this: for x in 0..9 for y in 0..9999 for z in 0..29 INSERT INTO t VALUES(y,x,0,0,z) So the table has 300000 entries, with j and k always 0. The query is: SELECT * FROM ( SELECT * FROM t GROUP BY j,k,x,z,y ) AS f NATURAL JOIN t; The plan: Merge Join (cost=44508.90..66677.96 rows=1 width=20) Merge Cond: ((public.t.j = public.t.j) AND (public.t.k = public.t.k) AND (public.t.x = public.t.x)) Join Filter: ((public.t.y = public.t.y) AND (public.t.z = public.t.z)) -> Group (cost=44508.90..49008.90 rows=30000 width=20) -> Sort (cost=44508.90..45258.90 rows=300000 width=20) Sort Key: public.t.j, public.t.k, public.t.x, public.t.z, public.t.y -> Seq Scan on t (cost=0.00..4911.00 rows=300000 width=20) -> Index Scan using t_pkey on t (cost=0.00..14877.18 rows=300000 width=20) This query runs at least 20 minutes, with postmaster CPU utilization at 99%, without completing. System is a 3.2GHz Zeon, 3GB memory, and not much else running. By contrast, placing an intermediate result in a table "u" provides a result in about 3 seconds: CREATE TEMPORARY TABLE u AS SELECT * FROM t GROUP BY j,k,x,z,y; SELECT * FROM u NATURAL JOIN t; Changing the order of the GROUP BY clause varies the plan, sometimes yielding shorter execution times. For example, this ordering executes in about 1.5 seconds: SELECT * FROM ( SELECT * FROM t GROUP BY j,k,x,y,z ) AS f NATURAL JOIN t; With 120 permutations, I didn't try them all. I should note that the plans tend to have similar costs, so the query planner presumably does not know that some permutations have significantly greater execution times. Clem Dickey
В списке pgsql-performance по дате отправления: