Re: abnormally long time in performing a two-table join
От | Tom Lane |
---|---|
Тема | Re: abnormally long time in performing a two-table join |
Дата | |
Msg-id | 25080.1029109707@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: abnormally long time in performing a two-table join (Chris Mungall <cjm@fruitfly.org>) |
Ответы |
Re: abnormally long time in performing a two-table join
|
Список | pgsql-admin |
Chris Mungall <cjm@fruitfly.org> writes: > On Sun, 11 Aug 2002, Tom Lane wrote: >> Odd ... I'm surprised it doesn't choose a hash join. What do you get if >> you try it with "set enable_nestloop = off" ? > Much better! > omicia29=# set enable_nestloop = off; > SET VARIABLE > omicia29=# explain analyze select seqfeature_id from seqfeature NATURAL > JOIN sfqv where qualifier_value = 'BRCA1' and seqfeature.seqfeature_key_id > = 15; > NOTICE: QUERY PLAN: > Hash Join (cost=55921.31..223860.67 rows=778 width=8) (actual > time=4249.63..4259.47 rows=15 loops=1) > -> Index Scan using sfqv_idx3 on sfqv (cost=0.00..167411.84 rows=41423 > width=4) (actual time=38.05..44.50 rows=110 loops=1) > -> Hash (cost=51056.43..51056.43 rows=49453 width=4) (actual > time=4211.15..4211.15 rows=0 loops=1) > -> Seq Scan on seqfeature (cost=0.00..51056.43 rows=49453 > width=4) (actual time=0.14..3974.12 rows=100030 loops=1) > Total runtime: 4259.67 msec Hmm. It looks like the reason the planner doesn't like this plan is that it's vastly overestimating the number of rows it will get from indexscanning sfqv_idx3 for qualifier_value = 'BRCA1' (viz, 41423 rather than the true 110). This is a statistical failure. What do you get from select * from pg_stats where tablename = 'sfqv' ? You might try increasing the statistics target for the qualifier_value column --- I'm guessing that you need more resolution in the stats to deal correctly with low-probability data. Try ALTER TABLE sfqv ALTER COLUMN qualifier_value SET STATISTICS 100; ANALYZE sfqv; -- to recompute stats then see how the EXPLAIN results change. (The default stats target is 10; 100 might be more than you need, or perhaps not.) > however I'm not sure what the implications of turning nestloop off > altogether are - maybe i can hardcode it just for this query It'd be best not to. I'd counsel seeing if more stats help, first. regards, tom lane
В списке pgsql-admin по дате отправления: