Обсуждение: Planner picking topsey turvey plan?
Hi people, Does anyone know how I can change what I'm doing to get pgsql to pick a better plan? I'll explain what I've done below but please forgive me if I interpret the plans wrong as I try to describe, I've split itinto 4 points to try and ease the mess of pasting in the plans.. 1) I've created a view "orders" that joins two tables "credit" and "mult_ord" together as below: CREATE VIEW orders AS SELECT b.mult_ref, a.show, MIN(a.transno) AS "lead_transno", COUNT(a.transno) AS "parts", SUM(a.tickets) AS "items", SUM(a.value)AS "value" FROM (credit a LEFT OUTER JOIN mult_ord b ON a.transno = b.transno) GROUP BY b.mult_ref, a.show; 2) And an explain on that view comes out as below, it's using the correct index for the field show on "credit" which doesn'tlook too bad to me: DB=# explain select a.artist, a.date, b.mult_ref, b.items, b.parts from (show a inner join orders b on a.code = b.show) where b.show = 357600; QUERY PLAN -------------------------------------------------------------------------------------------------------- Nested Loop (cost=15050.79..15099.68 rows=1013 width=70) -> Index Scan using show_index01 on show a (cost=0.00..8.37 rows=1 width=26) Index Cond: (code = 357600::numeric) -> HashAggregate (cost=15050.79..15071.05 rows=1013 width=39) -> Nested Loop Left Join (cost=0.00..15035.60 rows=1013 width=39) -> Index Scan using credit_index04 on credit a (cost=0.00..4027.30 rows=1013 width=31) Index Cond: (show = 357600::numeric) -> Index Scan using mult_ord_index02 on mult_ord b (cost=0.00..10.85 rows=1 width=17) Index Cond: (a.transno = b.transno) (9 rows) 3) Then I have a table called "show" that is indexed on the artist field, and a plan for listing the shows for an artistis as below, again this doesn't look too bad to me, as it's using the index on artist. DB=# explain select * from show where artist = 'ALKALINE TRIO'; QUERY PLAN ----------------------------------------------------------------------------- Bitmap Heap Scan on show (cost=9.59..582.41 rows=153 width=348) Recheck Cond: ((artist)::text = 'ALKALINE TRIO'::text) -> Bitmap Index Scan on show_index07 (cost=0.00..9.56 rows=153 width=0) Index Cond: ((artist)::text = 'ALKALINE TRIO'::text) (4 rows) 4) So.. I guess I can join "show" -> "orders", expecting an index scan on "show" for the artist, then an index scan on "orders"for each show. However it seems the planner has other ideas, it just looks backwards to me: DB=# explain select a.artist, a.date, b.mult_ref, b.items, b.parts from (show a inner join orders b on a.code = b.show) where artist = 'ALKALINE TRIO'; QUERY PLAN ---------------------------------------------------------------------------------------------------- Hash Join (cost=1576872.96..1786175.37 rows=1689 width=70) Hash Cond: (a.show = a.code) -> GroupAggregate (cost=1576288.64..1729424.39 rows=4083620 width=39) -> Sort (cost=1576288.64..1586497.69 rows=4083620 width=39) Sort Key: b.mult_ref, a.show -> Hash Left Join (cost=321406.05..792886.22 rows=4083620 width=39) Hash Cond: (a.transno = b.transno) -> Seq Scan on credit a (cost=0.00..267337.20 rows=4083620 width=31) -> Hash (cost=160588.80..160588.80 rows=8759380 width=17) -> Seq Scan on mult_ord b (cost=0.00..160588.80 rows=8759380 width=17) -> Hash (cost=582.41..582.41 rows=153 width=26) -> Bitmap Heap Scan on show a (cost=9.59..582.41 rows=153 width=26) Recheck Cond: ((artist)::text = 'ALKALINE TRIO'::text) -> Bitmap Index Scan on show_index07 (cost=0.00..9.56 rows=153 width=0) Index Cond: ((artist)::text = 'ALKALINE TRIO'::text) (15 rows) Any idea if I can get around this?
Anyone? --- On Fri, 5/12/08, Glyn Astill <glynastill@yahoo.co.uk> wrote: > From: Glyn Astill <glynastill@yahoo.co.uk> > Subject: [GENERAL] Planner picking topsey turvey plan? > To: pgsql-general@postgresql.org > Date: Friday, 5 December, 2008, 2:23 PM > Hi people, > > Does anyone know how I can change what I'm doing to get > pgsql to pick a better plan? > > I'll explain what I've done below but please > forgive me if I interpret the plans wrong as I try to > describe, I've split it into 4 points to try and ease > the mess of pasting in the plans.. > > > 1) I've created a view "orders" that joins > two tables "credit" and "mult_ord" > together as below: > > CREATE VIEW orders AS > SELECT b.mult_ref, a.show, MIN(a.transno) AS > "lead_transno", COUNT(a.transno) AS > "parts", SUM(a.tickets) AS "items", > SUM(a.value) AS "value" > FROM (credit a LEFT OUTER JOIN mult_ord b ON a.transno = > b.transno) > GROUP BY b.mult_ref, a.show; > > > > 2) And an explain on that view comes out as below, it's > using the correct index for the field show on > "credit" which doesn't look too bad to me: > > DB=# explain select a.artist, a.date, b.mult_ref, b.items, > b.parts from (show a inner join orders b on a.code = b.show) > where b.show = 357600; > QUERY PLAN > -------------------------------------------------------------------------------------------------------- > Nested Loop (cost=15050.79..15099.68 rows=1013 width=70) > -> Index Scan using show_index01 on show a > (cost=0.00..8.37 rows=1 width=26) > Index Cond: (code = 357600::numeric) > -> HashAggregate (cost=15050.79..15071.05 rows=1013 > width=39) > -> Nested Loop Left Join (cost=0.00..15035.60 > rows=1013 width=39) > -> Index Scan using credit_index04 on > credit a (cost=0.00..4027.30 rows=1013 width=31) > Index Cond: (show = 357600::numeric) > -> Index Scan using mult_ord_index02 on > mult_ord b (cost=0.00..10.85 rows=1 width=17) > Index Cond: (a.transno = b.transno) > (9 rows) > > > > 3) Then I have a table called "show" that is > indexed on the artist field, and a plan for listing the > shows for an artist is as below, again this doesn't look > too bad to me, as it's using the index on artist. > > DB=# explain select * from show where artist = > 'ALKALINE TRIO'; > QUERY PLAN > ----------------------------------------------------------------------------- > Bitmap Heap Scan on show (cost=9.59..582.41 rows=153 > width=348) > Recheck Cond: ((artist)::text = 'ALKALINE > TRIO'::text) > -> Bitmap Index Scan on show_index07 > (cost=0.00..9.56 rows=153 width=0) > Index Cond: ((artist)::text = 'ALKALINE > TRIO'::text) > (4 rows) > > > > 4) So.. I guess I can join "show" -> > "orders", expecting an index scan on > "show" for the artist, then an index scan on > "orders" for each show. > > However it seems the planner has other ideas, it just looks > backwards to me: > > DB=# explain select a.artist, a.date, b.mult_ref, b.items, > b.parts from (show a inner join orders b on a.code = b.show) > where artist = 'ALKALINE TRIO'; > QUERY PLAN > ---------------------------------------------------------------------------------------------------- > Hash Join (cost=1576872.96..1786175.37 rows=1689 > width=70) > Hash Cond: (a.show = a.code) > -> GroupAggregate (cost=1576288.64..1729424.39 > rows=4083620 width=39) > -> Sort (cost=1576288.64..1586497.69 > rows=4083620 width=39) > Sort Key: b.mult_ref, a.show > -> Hash Left Join > (cost=321406.05..792886.22 rows=4083620 width=39) > Hash Cond: (a.transno = b.transno) > -> Seq Scan on credit a > (cost=0.00..267337.20 rows=4083620 width=31) > -> Hash > (cost=160588.80..160588.80 rows=8759380 width=17) > -> Seq Scan on mult_ord b > (cost=0.00..160588.80 rows=8759380 width=17) > -> Hash (cost=582.41..582.41 rows=153 width=26) > -> Bitmap Heap Scan on show a > (cost=9.59..582.41 rows=153 width=26) > Recheck Cond: ((artist)::text = > 'ALKALINE TRIO'::text) > -> Bitmap Index Scan on show_index07 > (cost=0.00..9.56 rows=153 width=0) > Index Cond: ((artist)::text = > 'ALKALINE TRIO'::text) > (15 rows) > > Any idea if I can get around this? > > > > > > > -- > Sent via pgsql-general mailing list > (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
what does explain analyze yourqueryhere say? On Sat, Dec 6, 2008 at 1:33 PM, Glyn Astill <glynastill@yahoo.co.uk> wrote: > Anyone? > > > --- On Fri, 5/12/08, Glyn Astill <glynastill@yahoo.co.uk> wrote: > >> From: Glyn Astill <glynastill@yahoo.co.uk> >> Subject: [GENERAL] Planner picking topsey turvey plan? >> To: pgsql-general@postgresql.org >> Date: Friday, 5 December, 2008, 2:23 PM >> Hi people, >> >> Does anyone know how I can change what I'm doing to get >> pgsql to pick a better plan? >> >> I'll explain what I've done below but please >> forgive me if I interpret the plans wrong as I try to >> describe, I've split it into 4 points to try and ease >> the mess of pasting in the plans.. >> >> >> 1) I've created a view "orders" that joins >> two tables "credit" and "mult_ord" >> together as below: >> >> CREATE VIEW orders AS >> SELECT b.mult_ref, a.show, MIN(a.transno) AS >> "lead_transno", COUNT(a.transno) AS >> "parts", SUM(a.tickets) AS "items", >> SUM(a.value) AS "value" >> FROM (credit a LEFT OUTER JOIN mult_ord b ON a.transno = >> b.transno) >> GROUP BY b.mult_ref, a.show; >> >> >> >> 2) And an explain on that view comes out as below, it's >> using the correct index for the field show on >> "credit" which doesn't look too bad to me: >> >> DB=# explain select a.artist, a.date, b.mult_ref, b.items, >> b.parts from (show a inner join orders b on a.code = b.show) >> where b.show = 357600; >> QUERY PLAN >> -------------------------------------------------------------------------------------------------------- >> Nested Loop (cost=15050.79..15099.68 rows=1013 width=70) >> -> Index Scan using show_index01 on show a >> (cost=0.00..8.37 rows=1 width=26) >> Index Cond: (code = 357600::numeric) >> -> HashAggregate (cost=15050.79..15071.05 rows=1013 >> width=39) >> -> Nested Loop Left Join (cost=0.00..15035.60 >> rows=1013 width=39) >> -> Index Scan using credit_index04 on >> credit a (cost=0.00..4027.30 rows=1013 width=31) >> Index Cond: (show = 357600::numeric) >> -> Index Scan using mult_ord_index02 on >> mult_ord b (cost=0.00..10.85 rows=1 width=17) >> Index Cond: (a.transno = b.transno) >> (9 rows) >> >> >> >> 3) Then I have a table called "show" that is >> indexed on the artist field, and a plan for listing the >> shows for an artist is as below, again this doesn't look >> too bad to me, as it's using the index on artist. >> >> DB=# explain select * from show where artist = >> 'ALKALINE TRIO'; >> QUERY PLAN >> ----------------------------------------------------------------------------- >> Bitmap Heap Scan on show (cost=9.59..582.41 rows=153 >> width=348) >> Recheck Cond: ((artist)::text = 'ALKALINE >> TRIO'::text) >> -> Bitmap Index Scan on show_index07 >> (cost=0.00..9.56 rows=153 width=0) >> Index Cond: ((artist)::text = 'ALKALINE >> TRIO'::text) >> (4 rows) >> >> >> >> 4) So.. I guess I can join "show" -> >> "orders", expecting an index scan on >> "show" for the artist, then an index scan on >> "orders" for each show. >> >> However it seems the planner has other ideas, it just looks >> backwards to me: >> >> DB=# explain select a.artist, a.date, b.mult_ref, b.items, >> b.parts from (show a inner join orders b on a.code = b.show) >> where artist = 'ALKALINE TRIO'; >> QUERY PLAN >> ---------------------------------------------------------------------------------------------------- >> Hash Join (cost=1576872.96..1786175.37 rows=1689 >> width=70) >> Hash Cond: (a.show = a.code) >> -> GroupAggregate (cost=1576288.64..1729424.39 >> rows=4083620 width=39) >> -> Sort (cost=1576288.64..1586497.69 >> rows=4083620 width=39) >> Sort Key: b.mult_ref, a.show >> -> Hash Left Join >> (cost=321406.05..792886.22 rows=4083620 width=39) >> Hash Cond: (a.transno = b.transno) >> -> Seq Scan on credit a >> (cost=0.00..267337.20 rows=4083620 width=31) >> -> Hash >> (cost=160588.80..160588.80 rows=8759380 width=17) >> -> Seq Scan on mult_ord b >> (cost=0.00..160588.80 rows=8759380 width=17) >> -> Hash (cost=582.41..582.41 rows=153 width=26) >> -> Bitmap Heap Scan on show a >> (cost=9.59..582.41 rows=153 width=26) >> Recheck Cond: ((artist)::text = >> 'ALKALINE TRIO'::text) >> -> Bitmap Index Scan on show_index07 >> (cost=0.00..9.56 rows=153 width=0) >> Index Cond: ((artist)::text = >> 'ALKALINE TRIO'::text) >> (15 rows) >> >> Any idea if I can get around this? >> >> >> >> >> >> >> -- >> Sent via pgsql-general mailing list >> (pgsql-general@postgresql.org) >> To make changes to your subscription: >> http://www.postgresql.org/mailpref/pgsql-general > > > > > -- > Sent via pgsql-admin mailing list (pgsql-admin@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-admin > -- When fascism comes to America, it will be draped in a flag and carrying a cross - Sinclair Lewis
Explain analyze below, DB=# explain analyze select a.artist, a.date, b.mult_ref, b.items, b.parts from (show a inner join orders b on a.code = b.show)where artist = 'ALKALINE TRIO'; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------- Hash Join (cost=1583955.94..1794350.36 rows=1552 width=70) (actual time=231496.678..243243.711 rows=892 loops=1) Hash Cond: (a.show = a.code) -> GroupAggregate (cost=1583418.91..1737354.68 rows=4104954 width=40) (actual time=223204.620..241813.067 rows=2856379loops=1) -> Sort (cost=1583418.91..1593681.29 rows=4104954 width=40) (actual time=223204.567..231296.896 rows=4104964 loops=1) Sort Key: b.mult_ref, a.show Sort Method: external merge Disk: 224328kB -> Hash Left Join (cost=321999.38..795776.58 rows=4104954 width=40) (actual time=14850.320..165804.778 rows=4104964loops=1) Hash Cond: (a.transno = b.transno) -> Seq Scan on credit a (cost=0.00..268740.54 rows=4104954 width=31) (actual time=7.563..76901.901rows=4104954 loops=1) -> Hash (cost=160885.28..160885.28 rows=8775528 width=18) (actual time=14831.810..14831.810 rows=8775528loops=1) -> Seq Scan on mult_ord b (cost=0.00..160885.28 rows=8775528 width=18) (actual time=4.716..4952.254rows=8775528 loops=1) -> Hash (cost=535.28..535.28 rows=140 width=26) (actual time=228.599..228.599 rows=54 loops=1) -> Bitmap Heap Scan on show a (cost=9.49..535.28 rows=140 width=26) (actual time=77.723..228.488 rows=54 loops=1) Recheck Cond: ((artist)::text = 'ALKALINE TRIO'::text) -> Bitmap Index Scan on show_index07 (cost=0.00..9.46 rows=140 width=0) (actual time=62.228..62.228 rows=54loops=1) Index Cond: ((artist)::text = 'ALKALINE TRIO'::text) Total runtime: 243367.640 ms --- On Sat, 6/12/08, Scott Marlowe <scott.marlowe@gmail.com> wrote: > From: Scott Marlowe <scott.marlowe@gmail.com> > Subject: Re: [ADMIN] [GENERAL] Planner picking topsey turvey plan? > To: glynastill@yahoo.co.uk > Cc: pgsql-general@postgresql.org, pgsql-admin@postgresql.org > Date: Saturday, 6 December, 2008, 8:35 PM > what does explain analyze yourqueryhere say? > > On Sat, Dec 6, 2008 at 1:33 PM, Glyn Astill > <glynastill@yahoo.co.uk> wrote: > > Anyone? > > > > > > --- On Fri, 5/12/08, Glyn Astill > <glynastill@yahoo.co.uk> wrote: > > > >> From: Glyn Astill <glynastill@yahoo.co.uk> > >> Subject: [GENERAL] Planner picking topsey turvey > plan? > >> To: pgsql-general@postgresql.org > >> Date: Friday, 5 December, 2008, 2:23 PM > >> Hi people, > >> > >> Does anyone know how I can change what I'm > doing to get > >> pgsql to pick a better plan? > >> > >> I'll explain what I've done below but > please > >> forgive me if I interpret the plans wrong as I try > to > >> describe, I've split it into 4 points to try > and ease > >> the mess of pasting in the plans.. > >> > >> > >> 1) I've created a view "orders" that > joins > >> two tables "credit" and > "mult_ord" > >> together as below: > >> > >> CREATE VIEW orders AS > >> SELECT b.mult_ref, a.show, MIN(a.transno) AS > >> "lead_transno", COUNT(a.transno) AS > >> "parts", SUM(a.tickets) AS > "items", > >> SUM(a.value) AS "value" > >> FROM (credit a LEFT OUTER JOIN mult_ord b ON > a.transno = > >> b.transno) > >> GROUP BY b.mult_ref, a.show; > >> > >> > >> > >> 2) And an explain on that view comes out as below, > it's > >> using the correct index for the field show on > >> "credit" which doesn't look too bad > to me: > >> > >> DB=# explain select a.artist, a.date, b.mult_ref, > b.items, > >> b.parts from (show a inner join orders b on a.code > = b.show) > >> where b.show = 357600; > >> > QUERY PLAN > >> > -------------------------------------------------------------------------------------------------------- > >> Nested Loop (cost=15050.79..15099.68 rows=1013 > width=70) > >> -> Index Scan using show_index01 on show a > >> (cost=0.00..8.37 rows=1 width=26) > >> Index Cond: (code = 357600::numeric) > >> -> HashAggregate (cost=15050.79..15071.05 > rows=1013 > >> width=39) > >> -> Nested Loop Left Join > (cost=0.00..15035.60 > >> rows=1013 width=39) > >> -> Index Scan using > credit_index04 on > >> credit a (cost=0.00..4027.30 rows=1013 width=31) > >> Index Cond: (show = > 357600::numeric) > >> -> Index Scan using > mult_ord_index02 on > >> mult_ord b (cost=0.00..10.85 rows=1 width=17) > >> Index Cond: (a.transno = > b.transno) > >> (9 rows) > >> > >> > >> > >> 3) Then I have a table called "show" > that is > >> indexed on the artist field, and a plan for > listing the > >> shows for an artist is as below, again this > doesn't look > >> too bad to me, as it's using the index on > artist. > >> > >> DB=# explain select * from show where artist = > >> 'ALKALINE TRIO'; > >> QUERY PLAN > >> > ----------------------------------------------------------------------------- > >> Bitmap Heap Scan on show (cost=9.59..582.41 > rows=153 > >> width=348) > >> Recheck Cond: ((artist)::text = 'ALKALINE > >> TRIO'::text) > >> -> Bitmap Index Scan on show_index07 > >> (cost=0.00..9.56 rows=153 width=0) > >> Index Cond: ((artist)::text = > 'ALKALINE > >> TRIO'::text) > >> (4 rows) > >> > >> > >> > >> 4) So.. I guess I can join "show" -> > >> "orders", expecting an index scan on > >> "show" for the artist, then an index > scan on > >> "orders" for each show. > >> > >> However it seems the planner has other ideas, it > just looks > >> backwards to me: > >> > >> DB=# explain select a.artist, a.date, b.mult_ref, > b.items, > >> b.parts from (show a inner join orders b on a.code > = b.show) > >> where artist = 'ALKALINE TRIO'; > >> QUERY > PLAN > >> > ---------------------------------------------------------------------------------------------------- > >> Hash Join (cost=1576872.96..1786175.37 rows=1689 > >> width=70) > >> Hash Cond: (a.show = a.code) > >> -> GroupAggregate > (cost=1576288.64..1729424.39 > >> rows=4083620 width=39) > >> -> Sort (cost=1576288.64..1586497.69 > >> rows=4083620 width=39) > >> Sort Key: b.mult_ref, a.show > >> -> Hash Left Join > >> (cost=321406.05..792886.22 rows=4083620 width=39) > >> Hash Cond: (a.transno = > b.transno) > >> -> Seq Scan on credit a > >> (cost=0.00..267337.20 rows=4083620 width=31) > >> -> Hash > >> (cost=160588.80..160588.80 rows=8759380 width=17) > >> -> Seq Scan on > mult_ord b > >> (cost=0.00..160588.80 rows=8759380 width=17) > >> -> Hash (cost=582.41..582.41 rows=153 > width=26) > >> -> Bitmap Heap Scan on show a > >> (cost=9.59..582.41 rows=153 width=26) > >> Recheck Cond: ((artist)::text = > >> 'ALKALINE TRIO'::text) > >> -> Bitmap Index Scan on > show_index07 > >> (cost=0.00..9.56 rows=153 width=0) > >> Index Cond: ((artist)::text = > >> 'ALKALINE TRIO'::text) > >> (15 rows) > >> > >> Any idea if I can get around this? > >> > >> > >> > >> > >> > >> > >> -- > >> Sent via pgsql-general mailing list > >> (pgsql-general@postgresql.org) > >> To make changes to your subscription: > >> http://www.postgresql.org/mailpref/pgsql-general > > > > > > > > > > -- > > Sent via pgsql-admin mailing list > (pgsql-admin@postgresql.org) > > To make changes to your subscription: > > http://www.postgresql.org/mailpref/pgsql-admin > > > > > > -- > When fascism comes to America, it will be draped in a flag > and > carrying a cross - Sinclair Lewis > > -- > Sent via pgsql-general mailing list > (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general
Glyn Astill <glynastill@yahoo.co.uk> writes: > Does anyone know how I can change what I'm doing to get pgsql to pick a better plan? You've provided no evidence that this is a bad plan. In particular, the plan you seem to think would be better would involve an estimated 153 iterations of the cost-15071 hash aggregation, which simple arithmetic shows is more expensive than the plan it did choose. regards, tom lane
> From: Tom Lane <tgl@sss.pgh.pa.us> > > You've provided no evidence that this is a bad plan. > Looks like I didn't take the time to understand properly what the explains were showing. > In particular, the plan you seem to think would be better > would involve > an estimated 153 iterations of the cost-15071 hash > aggregation, which > simple arithmetic shows is more expensive than the plan it > did choose. > I'd totally missed that all the cost was in the view that I'd created. Thanks tom