Обсуждение: extend statistics help reduce index scan a lot of shared buffer hits.
Postgresql 14.8, Redhat8. looks like have to create extend statistics on indexed and joined columns to make join filters pushed down to secondary index scan in nestloop, and the shared buffer hits show big difference.
is it expected ?
SELECT ....
FROM
mtgxxxxxxxx a LEFT OUTER JOIN mtgxxxxxxxext b ON a.sssid = b.sssid and a.MMMUUID = b.MMMUUID and a.uuid = b.uuid
WHERE
a.SSSID=$1
AND a.MMMUID=$2
ORDER BY a.XXXX asc
offset 300 rows
FETCH FIRST 51 ROWS ONLY
explain (analyze,buffers) slowsql1(...)
1. with default, join filters just after nestloop,
Limit (cost=5.61..5.62 rows=1 width=1169) (actual time=2249.443..2249.454 rows=51 loops=1)
Buffers: shared hit=3864917
-> Sort (cost=5.61..5.61 rows=1 width=1169) (actual time=2249.404..2249.438 rows=351 loops=1)
Sort Key: a.email
Sort Method: top-N heapsort Memory: 174kB
Buffers: shared hit=3864917
-> Nested Loop Left Join (cost=1.12..5.60 rows=1 width=1169) (actual time=1.335..2246.971 rows=2142 loops=1)
Join Filter: ((a.siteid = b.siteid) AND ((a.mtguuid)::text = (b.mtguuid)::text) AND (a.uuid = b.uuid))
Rows Removed by Join Filter: 4586022
Buffers: shared hit=3864917
-> Index Scan using idx_mmmcfattlist_mmmuuid_f on mtgxxxxxxxx a (cost=0.56..2.79 rows=1 width=1093) (actual time=0.026..5.318 rows
=2142 loops=1)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Filter: ((joinstatus = ANY (ARRAY[$3, $4])) OR ((usertype)::text = 'Testlist'::text))
Buffers: shared hit=2891
-> Index Scan using idx_mtgattndlstext_mmmuuid_uid on mtgxxxxxxxext b (cost=0.56..2.78 rows=1 width=133) (actual time=0.016..0.698
rows=2142 loops=2142)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Buffers: shared hit=3862026 <<< here huge shared hits.
Planning Time: 0.033 ms
Execution Time: 2249.527 ms
create statistics mtgxxxxxxext_sssid_mmmuuid(dependencies,ndistinct) on sssid, mmmuuid from mtgxxxxxxxext.
analyze mtgxxxxxxxext.
2. join filters pushed down to secondary index scan, and reduce a lot of shared blks access.
-------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------
Limit (cost=5.61..5.62 rows=1 width=1245) (actual time=12.371..12.380 rows=51 loops=1)
Buffers: shared hit=12865
-> Sort (cost=5.61..5.61 rows=1 width=1245) (actual time=12.333..12.364 rows=351 loops=1)
Sort Key: a.email
Sort Method: top-N heapsort Memory: 174kB
Buffers: shared hit=12865
-> Nested Loop Left Join (cost=1.12..5.60 rows=1 width=1245) (actual time=0.042..10.819 rows=2142 loops=1)
Buffers: shared hit=12865
-> Index Scan using idx_mmmcfattlist_mmmuuid_f on mtgxxxxxxxx a (cost=0.56..2.79 rows=1 width=1169) (actual time=0.025..2.492
rows=2142 loops=1)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Filter: ((joinstatus = ANY (ARRAY[$3, $4])) OR ((usertype)::text = 'Testlist'::text))
Buffers: shared hit=2891
-> Index Scan using idx_mtgattndlstext_mmmuuid_uid on mtgxxxxxxxext b (cost=0.56..2.79 rows=1 width=133) (actual time=0.003..0
.003 rows=1 loops=2142)
Index Cond: ((sssid = a.sssid) AND (sssid = $1) AND ((mmmuuid)::text = (a.mmmuuid)::text) AND ((mmmuuid)::text = ($2)::text) AND (uui
d = a.uuid))
Buffers: shared hit=10710 <<< here much less shared hits , and Index Cond automatically added sssid = a.sssid ,mmuuid=a.mmmuuid.
Planning Time: 0.021 ms
Execution Time: 12.451 ms
(17 rows)
FROM
mtgxxxxxxxx a LEFT OUTER JOIN mtgxxxxxxxext b ON a.sssid = b.sssid and a.MMMUUID = b.MMMUUID and a.uuid = b.uuid
WHERE
a.SSSID=$1
AND a.MMMUID=$2
ORDER BY a.XXXX asc
offset 300 rows
FETCH FIRST 51 ROWS ONLY
explain (analyze,buffers) slowsql1(...)
1. with default, join filters just after nestloop,
Limit (cost=5.61..5.62 rows=1 width=1169) (actual time=2249.443..2249.454 rows=51 loops=1)
Buffers: shared hit=3864917
-> Sort (cost=5.61..5.61 rows=1 width=1169) (actual time=2249.404..2249.438 rows=351 loops=1)
Sort Key: a.email
Sort Method: top-N heapsort Memory: 174kB
Buffers: shared hit=3864917
-> Nested Loop Left Join (cost=1.12..5.60 rows=1 width=1169) (actual time=1.335..2246.971 rows=2142 loops=1)
Join Filter: ((a.siteid = b.siteid) AND ((a.mtguuid)::text = (b.mtguuid)::text) AND (a.uuid = b.uuid))
Rows Removed by Join Filter: 4586022
Buffers: shared hit=3864917
-> Index Scan using idx_mmmcfattlist_mmmuuid_f on mtgxxxxxxxx a (cost=0.56..2.79 rows=1 width=1093) (actual time=0.026..5.318 rows
=2142 loops=1)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Filter: ((joinstatus = ANY (ARRAY[$3, $4])) OR ((usertype)::text = 'Testlist'::text))
Buffers: shared hit=2891
-> Index Scan using idx_mtgattndlstext_mmmuuid_uid on mtgxxxxxxxext b (cost=0.56..2.78 rows=1 width=133) (actual time=0.016..0.698
rows=2142 loops=2142)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Buffers: shared hit=3862026 <<< here huge shared hits.
Planning Time: 0.033 ms
Execution Time: 2249.527 ms
create statistics mtgxxxxxxext_sssid_mmmuuid(dependencies,ndistinct) on sssid, mmmuuid from mtgxxxxxxxext.
analyze mtgxxxxxxxext.
2. join filters pushed down to secondary index scan, and reduce a lot of shared blks access.
-------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------
Limit (cost=5.61..5.62 rows=1 width=1245) (actual time=12.371..12.380 rows=51 loops=1)
Buffers: shared hit=12865
-> Sort (cost=5.61..5.61 rows=1 width=1245) (actual time=12.333..12.364 rows=351 loops=1)
Sort Key: a.email
Sort Method: top-N heapsort Memory: 174kB
Buffers: shared hit=12865
-> Nested Loop Left Join (cost=1.12..5.60 rows=1 width=1245) (actual time=0.042..10.819 rows=2142 loops=1)
Buffers: shared hit=12865
-> Index Scan using idx_mmmcfattlist_mmmuuid_f on mtgxxxxxxxx a (cost=0.56..2.79 rows=1 width=1169) (actual time=0.025..2.492
rows=2142 loops=1)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Filter: ((joinstatus = ANY (ARRAY[$3, $4])) OR ((usertype)::text = 'Testlist'::text))
Buffers: shared hit=2891
-> Index Scan using idx_mtgattndlstext_mmmuuid_uid on mtgxxxxxxxext b (cost=0.56..2.79 rows=1 width=133) (actual time=0.003..0
.003 rows=1 loops=2142)
Index Cond: ((sssid = a.sssid) AND (sssid = $1) AND ((mmmuuid)::text = (a.mmmuuid)::text) AND ((mmmuuid)::text = ($2)::text) AND (uui
d = a.uuid))
Buffers: shared hit=10710 <<< here much less shared hits , and Index Cond automatically added sssid = a.sssid ,mmuuid=a.mmmuuid.
Planning Time: 0.021 ms
Execution Time: 12.451 ms
(17 rows)
Thanks,
James
Postgresql 14.8, Redhat8. looks like have to create extend statistics on indexed and joined columns to make join filters pushed down to secondary index scan in nestloop, and the shared buffer hits show big difference.
is it expected ?
SELECT ....
FROM
mtgxxxxxxxx a LEFT OUTER JOIN mtgxxxxxxxext b ON a.sssid = b.sssid and a.MMMUUID = b.MMMUUID and a.uuid = b.uuid
WHERE
a.SSSID=$1
AND a.MMMUID=$2
ORDER BY a.XXXX asc
offset 300 rows
FETCH FIRST 51 ROWS ONLY
explain (analyze,buffers) slowsql1(...)
1. with default, join filters just after nestloop,
Limit (cost=5.61..5.62 rows=1 width=1169) (actual time=2249.443..2249.454 rows=51 loops=1)
Buffers: shared hit=3864917
-> Sort (cost=5.61..5.61 rows=1 width=1169) (actual time=2249.404..2249.438 rows=351 loops=1)
Sort Key: a.email
Sort Method: top-N heapsort Memory: 174kB
Buffers: shared hit=3864917
-> Nested Loop Left Join (cost=1.12..5.60 rows=1 width=1169) (actual time=1.335..2246.971 rows=2142 loops=1)
Join Filter: ((a.sssid = b.sssid) AND ((a.mmmuuid)::text = (b.mmmuuid)::text) AND (a.uuid = b.uuid))
Rows Removed by Join Filter: 4586022
Buffers: shared hit=3864917
-> Index Scan using idx_mmmcfattlist_mmmuuid_f on mtgxxxxxxxx a (cost=0.56..2.79 rows=1 width=1093) (actual time=0.026..5.318 rows
=2142 loops=1)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Filter: ((joinstatus = ANY (ARRAY[$3, $4])) OR ((usertype)::text = 'Testlist'::text))
Buffers: shared hit=2891
-> Index Scan using idx_mtgattndlstext_mmmuuid_uid on mtgxxxxxxxext b (cost=0.56..2.78 rows=1 width=133) (actual time=0.016..0.698
rows=2142 loops=2142)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Buffers: shared hit=3862026 <<< here huge shared hits.
Planning Time: 0.033 ms
Execution Time: 2249.527 ms
create statistics mtgxxxxxxext_sssid_mmmuuid(dependencies,ndistinct) on sssid, mmmuuid from mtgxxxxxxxext.
analyze mtgxxxxxxxext.
2. join filters pushed down to secondary index scan, and reduce a lot of shared blks access.
-------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------
Limit (cost=5.61..5.62 rows=1 width=1245) (actual time=12.371..12.380 rows=51 loops=1)
Buffers: shared hit=12865
-> Sort (cost=5.61..5.61 rows=1 width=1245) (actual time=12.333..12.364 rows=351 loops=1)
Sort Key: a.email
Sort Method: top-N heapsort Memory: 174kB
Buffers: shared hit=12865
-> Nested Loop Left Join (cost=1.12..5.60 rows=1 width=1245) (actual time=0.042..10.819 rows=2142 loops=1)
Buffers: shared hit=12865
-> Index Scan using idx_mmmcfattlist_mmmuuid_f on mtgxxxxxxxx a (cost=0.56..2.79 rows=1 width=1169) (actual time=0.025..2.492
rows=2142 loops=1)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Filter: ((joinstatus = ANY (ARRAY[$3, $4])) OR ((usertype)::text = 'Testlist'::text))
Buffers: shared hit=2891
-> Index Scan using idx_mtgattndlstext_mmmuuid_uid on mtgxxxxxxxext b (cost=0.56..2.79 rows=1 width=133) (actual time=0.003..0
.003 rows=1 loops=2142)
Index Cond: ((sssid = a.sssid) AND (sssid = $1) AND ((mmmuuid)::text = (a.mmmuuid)::text) AND ((mmmuuid)::text = ($2)::text) AND (uui
d = a.uuid))
Buffers: shared hit=10710 <<< here much less shared hits , and Index Cond automatically added sssid = a.sssid ,mmuuid=a.mmmuuid.
Planning Time: 0.021 ms
Execution Time: 12.451 ms
(17 rows)
FROM
mtgxxxxxxxx a LEFT OUTER JOIN mtgxxxxxxxext b ON a.sssid = b.sssid and a.MMMUUID = b.MMMUUID and a.uuid = b.uuid
WHERE
a.SSSID=$1
AND a.MMMUID=$2
ORDER BY a.XXXX asc
offset 300 rows
FETCH FIRST 51 ROWS ONLY
explain (analyze,buffers) slowsql1(...)
1. with default, join filters just after nestloop,
Limit (cost=5.61..5.62 rows=1 width=1169) (actual time=2249.443..2249.454 rows=51 loops=1)
Buffers: shared hit=3864917
-> Sort (cost=5.61..5.61 rows=1 width=1169) (actual time=2249.404..2249.438 rows=351 loops=1)
Sort Key: a.email
Sort Method: top-N heapsort Memory: 174kB
Buffers: shared hit=3864917
-> Nested Loop Left Join (cost=1.12..5.60 rows=1 width=1169) (actual time=1.335..2246.971 rows=2142 loops=1)
Join Filter: ((a.sssid = b.sssid) AND ((a.mmmuuid)::text = (b.mmmuuid)::text) AND (a.uuid = b.uuid))
Rows Removed by Join Filter: 4586022
Buffers: shared hit=3864917
-> Index Scan using idx_mmmcfattlist_mmmuuid_f on mtgxxxxxxxx a (cost=0.56..2.79 rows=1 width=1093) (actual time=0.026..5.318 rows
=2142 loops=1)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Filter: ((joinstatus = ANY (ARRAY[$3, $4])) OR ((usertype)::text = 'Testlist'::text))
Buffers: shared hit=2891
-> Index Scan using idx_mtgattndlstext_mmmuuid_uid on mtgxxxxxxxext b (cost=0.56..2.78 rows=1 width=133) (actual time=0.016..0.698
rows=2142 loops=2142)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Buffers: shared hit=3862026 <<< here huge shared hits.
Planning Time: 0.033 ms
Execution Time: 2249.527 ms
create statistics mtgxxxxxxext_sssid_mmmuuid(dependencies,ndistinct) on sssid, mmmuuid from mtgxxxxxxxext.
analyze mtgxxxxxxxext.
2. join filters pushed down to secondary index scan, and reduce a lot of shared blks access.
-------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------
Limit (cost=5.61..5.62 rows=1 width=1245) (actual time=12.371..12.380 rows=51 loops=1)
Buffers: shared hit=12865
-> Sort (cost=5.61..5.61 rows=1 width=1245) (actual time=12.333..12.364 rows=351 loops=1)
Sort Key: a.email
Sort Method: top-N heapsort Memory: 174kB
Buffers: shared hit=12865
-> Nested Loop Left Join (cost=1.12..5.60 rows=1 width=1245) (actual time=0.042..10.819 rows=2142 loops=1)
Buffers: shared hit=12865
-> Index Scan using idx_mmmcfattlist_mmmuuid_f on mtgxxxxxxxx a (cost=0.56..2.79 rows=1 width=1169) (actual time=0.025..2.492
rows=2142 loops=1)
Index Cond: ((sssid = $1) AND ((mmmuuid)::text = ($2)::text))
Filter: ((joinstatus = ANY (ARRAY[$3, $4])) OR ((usertype)::text = 'Testlist'::text))
Buffers: shared hit=2891
-> Index Scan using idx_mtgattndlstext_mmmuuid_uid on mtgxxxxxxxext b (cost=0.56..2.79 rows=1 width=133) (actual time=0.003..0
.003 rows=1 loops=2142)
Index Cond: ((sssid = a.sssid) AND (sssid = $1) AND ((mmmuuid)::text = (a.mmmuuid)::text) AND ((mmmuuid)::text = ($2)::text) AND (uui
d = a.uuid))
Buffers: shared hit=10710 <<< here much less shared hits , and Index Cond automatically added sssid = a.sssid ,mmuuid=a.mmmuuid.
Planning Time: 0.021 ms
Execution Time: 12.451 ms
(17 rows)
Thanks,
James
Re: Fwd: extend statistics help reduce index scan a lot of shared buffer hits.
От
Tomas Vondra
Дата:
On 2/27/24 14:58, James Pang wrote: > Postgresql 14.8, Redhat8. looks like have to create extend statistics > on indexed and joined columns to make join filters pushed down to secondary > index scan in nestloop, and the shared buffer hits show big difference. > > is it expected ? > It's hard to say what exactly is happening in the example query (I'd have to do some debugging, but that's impossible without a reproducer), but I think it's mostly expected. My guess is that without the stats the optimizer sees this: -> Index Scan using idx_mtgattndlstext_mmmuuid_uid on mtgxxxxxxxext b (cost=0.56..2.78 rows=1 width=133) (actual time=0.016..0.698 rows=2142 loops=2142) and so decides there's no point in pushing down more conditions to the index scan (because it already returns just 1 row). But there's some sort of correlation / skew, and it returns 2142 rows. With the extended stats it realizes pushing down more conditions makes sense, because doing that in index scan is cheaper than having to read the heap pages etc. So it does that. So yeah, this seems exactly the improvement you'd expect from extended stats. Why do you think this would not be expected? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company