Re: [SQL] Search optimisation
От | Tom Lane |
---|---|
Тема | Re: [SQL] Search optimisation |
Дата | |
Msg-id | 17438.945241027@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Search optimisation (Olwen Williams <olwen@ihug.co.nz>) |
Список | pgsql-sql |
Olwen Williams <olwen@ihug.co.nz> writes: > [ why is query A so much faster than query B? ] Hmm. The system is correctly estimating that query A is much faster than B: > Nested Loop (cost=54.91 rows=657 width=340) ^^^^^^^^^^ vs > Hash Join (cost=11830.17 rows=981357 width=340) ^^^^^^^^^^^^^ so the question is why it thought this hashjoin plan was the best available option for query B. It looks to me like the problem has to do with the estimated selectivities of the two restrictions on the biblioitems table. Query A recognizes that biblionumber='109' selects just one item from biblioitems: -> Index Scan using bibitbnoidx on biblioitems (cost=2.03 rows=1 width=102) ^^^^^^ so it produces a good plan, even though a doubly nested loop would be an extremely horrible plan if there were a lot of rows involved. (Actually, this is the first complaint about bad plans I've ever seen in which doubly-nested-loop was the *right* choice ;-) ...) Meanwhile, query B thinks that isbn='031051911X' is going to accept several thousand rows from biblioitems: -> Index Scan using isbnidx on biblioitems (cost=203.35 rows=2988 width=102) ^^^^^^^^^ and because of that, it's choosing a plan that would be well adapted for a lot of matching rows, yet is overkill (read inefficient) for a few rows. Now, I'm going to read between the lines and guess that the restriction on isbn should yield only one biblioitems row; if this guess is wrong, you can ignore all that follows. If my guess is correct, then the blame lies with misestimation of the selectivity of the isbn='XXX' clause. You can double-check this by doing explain select * from biblioitems where isbn='031051911X'; and seeing whether the estimated row count is close to what you actually get from doing this query. The first question has to be "have you done a VACUUM ANALYZE lately?". If not, the planner is working in the dark about the distribution of isbn values, and you can't fairly blame it for guessing conservatively that it's going to have to deal with a lot of rows. However, if you have done a VACUUM ANALYZE and still get a ridiculous estimate, then that's a bug that I'd like to try to fix. Please contact me off-list and we can pursue the details of why this is happening. regards, tom lane
В списке pgsql-sql по дате отправления: