H. William Connors II wrote:
> fa_assignment has 44184945 records
> fa_assignment_detail has 82196027 records
>
> explain select * from fa_assignment fa JOIN fa_assignment_detail fad ON
> (fad.assignment_id = fa.assignment_id) where fa.scenario_id = 0;
>
> QUERY
> PLAN
> -------------------------------------------------------------------------------------------------------
>
> Hash Join (cost=581289.72..4940729.76 rows=9283104 width=91)
Are you really expecting 9 million rows in the result? If so, this is
probably a reasonable plan.
> Hash Cond: (fad.assignment_id = fa.assignment_id)
> -> Seq Scan on fa_assignment_detail fad (cost=0.00..1748663.60
> rows=82151360 width=61)
> -> Hash (cost=484697.74..484697.74 rows=4995439 width=30)
> -> Bitmap Heap Scan on fa_assignment fa
> (cost=93483.75..484697.74 rows=4995439 width=30)
> Recheck Cond: (scenario_id = 0)
> -> Bitmap Index Scan on fa_assignment_idx2
> (cost=0.00..92234.89 rows=4995439 width=0)
> Index Cond: (scenario_id = 0)
It's restricting on scenario_id, building a bitmap to identify which
disk-blocks will contain one or more matching rows and then scanning
those. If those 5 million scenario_id=0 rows are spread over 10% of the
blocks then that's a good idea.
If it was expecting only a handful of rows with scenario_id=0 then I'd
expect it to switch to a "standard" index scan.
If your work_mem is small try something like:
set work_mem = '50MB';
before running the query - maybe even larger.
--
Richard Huxton
Archonet Ltd