Re: Performance tips
От | Tom Lane |
---|---|
Тема | Re: Performance tips |
Дата | |
Msg-id | 21703.1010688265@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Performance tips (Andrew Perrin <andrew_perrin@unc.edu>) |
Ответы |
Re: Performance tips
|
Список | pgsql-general |
Andrew Perrin <andrew_perrin@unc.edu> writes: > auth=# EXPLAIN select count(patternid) from patterns where patternid in > (select > auth(# o_patternid from letters, pattern_occurrences where > letters.letterid = > auth(# pattern_occurrences.o_letterid and letters.datecat in (1,2)); > NOTICE: QUERY PLAN: > Aggregate (cost=10770432787318.88..10770432787318.88 rows=1 width=4) > -> Seq Scan on patterns (cost=0.00..10770432756138.14 rows=12472297 > width=4) > SubPlan > -> Materialize (cost=863548.43..863548.43 rows=5749731 > width=12) > -> Hash Join (cost=1741.00..863548.43 rows=5749731 > width=12) > -> Seq Scan on pattern_occurrences > (cost=0.00..250248.56 rows=15287556 width=8) > -> Hash (cost=1729.67..1729.67 rows=4530 width=4) > -> Seq Scan on letters (cost=0.00..1729.67 > rows=4530 width=4) Well, it's materializing the subquery result, which is good, but are there really going to be 5.7M rows in the subquery result? If so, no wonder you're hurting: the IN is going to be scanning through that result for each row from the outer query, until it either finds a match or reaches the end. Can you reduce the size of the subquery result at all? (If the subquery as written produces a lot of duplicate o_patternids, then making it be a SELECT DISTINCT might help.) The long-term answer is probably that you need to convert the IN to some smarter form of join. One idea that comes to mind is select count(patternid) from patterns, (select distinct o_patternid from letters, pattern_occurrences where letters.letterid = pattern_occurrences.o_letterid and letters.datecat in (1,2)) AS ss where patternid = ss.o_patternid; Given the "select distinct" to ensure there are no duplicates in the subselect output, this should produce the same output as the original, I think, and it would give the planner a shot at using a merge or hash join to match up the pattern id values. Oh, BTW: you might also try kicking up sort_mem if you didn't already. regards, tom lane
В списке pgsql-general по дате отправления: