Обсуждение: Optimize SQL
Hi, Is there anyway we can optimize this sql ? it is doing full table scan on listing and address table . Postgres version 8.0.2 Thanks! Pallav. explain analyze select listing0_.listingid as col_0_0_, getmaxdate(listing0_.lastupdate, max(addressval2_.createdate)) as col_1_0_ from listing.listing listing0_ left outer join listing.address listingadd1_ on listing0_.fkbestaddressid=listingadd1_.addressid left outer join listing.addressvaluation addressval2_ on listingadd1_.addressid=addressval2_.fkaddressid where listing0_.lastupdate>'2006-09-15 08:31:26.927' and listing0_.lastupdate<=current_timestamp or addressval2_.createdate>'2006-09-15 08:31:26.927' and addressval2_.createdate<=current_timestamp group by listing0_.listingid , listing0_.lastupdate order by getmaxdate(listing0_.lastupdate, max(addressval2_.createdate)) asc limit 10; Limit (cost=2399501.49..2399501.51 rows=10 width=20) (actual time=414298.076..414298.174 rows=10 loops=1) -> Sort (cost=2399501.49..2410707.32 rows=4482333 width=20) (actual time=414298.068..414298.098 rows=10 loops=1) Sort Key: getmaxdate(listing0_.lastupdate, max(addressval2_.createdate)) -> GroupAggregate (cost=1784490.47..1851725.47 rows=4482333 width=20) (actual time=414212.926..414284.927 rows=2559loops=1) -> Sort (cost=1784490.47..1795696.31 rows=4482333 width=20) (actual time=414174.678..414183.536 rows=2563loops=1) Sort Key: listing0_.listingid, listing0_.lastupdate -> Merge Right Join (cost=1113947.32..1236714.45 rows=4482333 width=20) (actual time=273257.256..414163.920rows=2563 loops=1) Merge Cond: ("outer".fkaddressid = "inner".addressid) Filter: ((("inner".lastupdate > '2006-09-15 08:31:26.927'::timestamp without time zone) AND ("inner".lastupdate<= ('now'::text)::timestamp(6) with time zone)) OR (("outer".createdate > '2006-09-15 08:31:26.927'::timestampwithout time zone) AND ("outer".createdate <= ('now'::text)::timestamp(6) with time zone))) -> Index Scan using idx_addressvaluation_fkaddressid on addressvaluation addressval2_ (cost=0.00..79769.55rows=947056 width=12) (actual time=0.120..108240.633 rows=960834 loops=1) -> Sort (cost=1113947.32..1125153.15 rows=4482333 width=16) (actual time=256884.646..275823.217rows=5669719 loops=1) Sort Key: listingadd1_.addressid -> Hash Left Join (cost=228115.38..570557.39 rows=4482333 width=16) (actual time=93874.356..205054.946rows=4490963 loops=1) Hash Cond: ("outer".fkbestaddressid = "inner".addressid) -> Seq Scan on listing listing0_ (cost=0.00..112111.33 rows=4482333 width=16) (actualtime=0.026..25398.685 rows=4490963 loops=1) -> Hash (cost=183333.70..183333.70 rows=6990270 width=4) (actual time=93873.659..93873.659rows=0 loops=1) -> Seq Scan on address listingadd1_ (cost=0.00..183333.70 rows=6990270 width=4)(actual time=13.256..69441.056 rows=6990606 loops=1)
Pallav Kalva <pkalva@livedatagroup.com> writes: > select listing0_.listingid as col_0_0_, > getmaxdate(listing0_.lastupdate, max(addressval2_.createdate)) as col_1_0_ > from listing.listing listing0_ > left outer join listing.address listingadd1_ > on listing0_.fkbestaddressid=listingadd1_.addressid > left outer join listing.addressvaluation addressval2_ > on listingadd1_.addressid=addressval2_.fkaddressid > where listing0_.lastupdate>'2006-09-15 08:31:26.927' > and listing0_.lastupdate<=current_timestamp > or addressval2_.createdate>'2006-09-15 08:31:26.927' and > addressval2_.createdate<=current_timestamp > group by listing0_.listingid , listing0_.lastupdate > order by getmaxdate(listing0_.lastupdate, max(addressval2_.createdate)) > asc limit 10; If that WHERE logic is actually what you need, then getting this query to run quickly seems pretty hopeless. The database must form the full outer join result: it cannot discard any listing0_ rows, even if they have lastupdate outside the given range, because they might join to addressval2_ rows within the given createdate range. And conversely it can't discard any addressval2_ rows early. Is there any chance that you wanted AND not OR there? One thing that might help a bit is to change the join order: from listing.listing listing0_ left outer join listing.addressvaluation addressval2_ on listing0_.fkbestaddressid=addressval2_.fkaddressid left outer join listing.address listingadd1_ on listing0_.fkbestaddressid=listingadd1_.addressid so that at least the WHERE clause can be applied before having joined to listingadd1_. The semantics of your ON clauses are probably wrong anyway --- did you think twice about what happens if there's no matching listingadd1_ entry? regards, tom lane
On 15-9-2006 17:53 Tom Lane wrote: > If that WHERE logic is actually what you need, then getting this query > to run quickly seems pretty hopeless. The database must form the full > outer join result: it cannot discard any listing0_ rows, even if they > have lastupdate outside the given range, because they might join to > addressval2_ rows within the given createdate range. And conversely > it can't discard any addressval2_ rows early. Is there any chance > that you wanted AND not OR there? Couldn't it also help to do something like this? SELECT ..., (SELECT MAX(createdate) FROM addressval ...) FROM listing l LEFT JOIN address ... WHERE l.id IN (SELECT id FROM listing WHERE lastupdate ... UNION SELECT id FROM listing JOIN addressval a ON ... WHERE a.createdate ...) Its not pretty, but looking at the explain only a small amount of records match both clauses. So this should allow the use of indexes for both the createdate-clause and the lastupdate-clause. Best regards, Arjen
That query is generated by hibernate, right? -----Original Message----- From: pgsql-performance-owner@postgresql.org [mailto:pgsql-performance-owner@postgresql.org] On Behalf Of Pallav Kalva Sent: den 15 september 2006 17:10 To: pgsql-performance@postgresql.org Subject: [PERFORM] Optimize SQL Hi, Is there anyway we can optimize this sql ? it is doing full table scan on listing and address table . Postgres version 8.0.2 Thanks! Pallav. explain analyze select listing0_.listingid as col_0_0_, getmaxdate(listing0_.lastupdate, max(addressval2_.createdate)) as col_1_0_ from listing.listing listing0_ left outer join listing.address listingadd1_ on listing0_.fkbestaddressid=listingadd1_.addressid left outer join listing.addressvaluation addressval2_ on listingadd1_.addressid=addressval2_.fkaddressid where listing0_.lastupdate>'2006-09-15 08:31:26.927' and listing0_.lastupdate<=current_timestamp or addressval2_.createdate>'2006-09-15 08:31:26.927' and addressval2_.createdate<=current_timestamp group by listing0_.listingid , listing0_.lastupdate order by getmaxdate(listing0_.lastupdate, max(addressval2_.createdate)) asc limit 10; Limit (cost=2399501.49..2399501.51 rows=10 width=20) (actual time=414298.076..414298.174 rows=10 loops=1) -> Sort (cost=2399501.49..2410707.32 rows=4482333 width=20) (actual time=414298.068..414298.098 rows=10 loops=1) Sort Key: getmaxdate(listing0_.lastupdate, max(addressval2_.createdate)) -> GroupAggregate (cost=1784490.47..1851725.47 rows=4482333 width=20) (actual time=414212.926..414284.927 rows=2559 loops=1) -> Sort (cost=1784490.47..1795696.31 rows=4482333 width=20) (actual time=414174.678..414183.536 rows=2563 loops=1) Sort Key: listing0_.listingid, listing0_.lastupdate -> Merge Right Join (cost=1113947.32..1236714.45 rows=4482333 width=20) (actual time=273257.256..414163.920 rows=2563 loops=1) Merge Cond: ("outer".fkaddressid = "inner".addressid) Filter: ((("inner".lastupdate > '2006-09-15 08:31:26.927'::timestamp without time zone) AND ("inner".lastupdate <= ('now'::text)::timestamp(6) with time zone)) OR (("outer".createdate > '2006-09-15 08:31:26.927'::timestamp without time zone) AND ("outer".createdate <= ('now'::text)::timestamp(6) with time zone))) -> Index Scan using idx_addressvaluation_fkaddressid on addressvaluation addressval2_ (cost=0.00..79769.55 rows=947056 width=12) (actual time=0.120..108240.633 rows=960834 loops=1) -> Sort (cost=1113947.32..1125153.15 rows=4482333 width=16) (actual time=256884.646..275823.217 rows=5669719 loops=1) Sort Key: listingadd1_.addressid -> Hash Left Join (cost=228115.38..570557.39 rows=4482333 width=16) (actual time=93874.356..205054.946 rows=4490963 loops=1) Hash Cond: ("outer".fkbestaddressid = "inner".addressid) -> Seq Scan on listing listing0_ (cost=0.00..112111.33 rows=4482333 width=16) (actual time=0.026..25398.685 rows=4490963 loops=1) -> Hash (cost=183333.70..183333.70 rows=6990270 width=4) (actual time=93873.659..93873.659 rows=0 loops=1) -> Seq Scan on address listingadd1_ (cost=0.00..183333.70 rows=6990270 width=4) (actual time=13.256..69441.056 rows=6990606 loops=1) ---------------------------(end of broadcast)--------------------------- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to majordomo@postgresql.org so that your message can get through to the mailing list cleanly