Re: Incorrect choice of Nested Loop for a skewed distribution
От | Oleg Kharin |
---|---|
Тема | Re: Incorrect choice of Nested Loop for a skewed distribution |
Дата | |
Msg-id | a5fb2f9d-a8ac-2c8a-22fc-6d5f62f6d6b3@uvadrev.ru обсуждение исходный текст |
Ответ на | Re: Incorrect choice of Nested Loop for a skewed distribution (Justin Pryzby <pryzby@telsasoft.com>) |
Список | pgsql-performance |
Thank you Justin! On Wed, 04 Sep 2019 17:18:36 -0700 (PDT), Justin Pryzby wrote: > When you dropped the index, you necessarily refused the plan involving > index scan. > So it did merge join instead (which it thinks of as expensive because it has to > sort both sides). > > As you saw, the rowcount estimate of the join result is badly off. But that's > true of both plans. > > Choice of join type is affected by the size of its *inputs*, but doesn't and > shouldn't have any effect on the result rowcount (or its) estimate. The > rowcount *output* of the join would only affect any *following* plan nodes (of > which there are none in this case). > > You suggested using the MCV list, but I don't think that's possible, since the > nested loop is evaluating its "inner" table multiple times, in a "loop": > >> -> Index Scan using t_f on t (cost=0.29..0.36 rows=1 width=8) (actual time=0.565..1.125 rows=1 loops=20020) > Hypothetically, one could plan the query 20k times, for each value of u.f and > u.g, but that's tantamount to actually executing the query, which is why it > uses (I gather) var_eq_non_const. In fact the planner has information about the outer loop relation when it is estimating the number of inner loop rows for a nested-loop join. It ignores this information and considers the outer/inner loop relations as independent. So it uses for the rowcount estimate the number of distinct values only, not MCV lists of both tables. For the test case above, the planner estimates the selectivity of the clause "t.f=u.f" as 1/10011. It hopes to scan 1/10011*20020=2 rows in a inner loop for each row of the outer loop. In fact it scans 10010 rows 10010 times and 1 row 10010 times. The average number of rows scanned in a inner loop is (10010*10010+10010)/20020=5005. That is badly off from 2 rows expected. The planner should use 5005/20020 as a more accurate value for the effective selectivity of "t.f=u.f". I tried to make improvements to the function var_eq_non_const() so that it calculates the effective selectivity using MCV lists if possible. The patch for PostgreSQL 11.5 is attached to the letter. The patched PostgreSQL chooses an optimal plan for the test case. As a result the query execution time is reduced by more than 600 times. Now if we force the planner to choose the nested-loop join set enable_mergejoin=off; set enable_hashjoin=off; explain analyze select * from t,u where t.f=u.f and t.g=u.g; Nested Loop (cost=0.29..2261681.75 rows=25055030 width=16) (actual time=0.048..22519.232 rows=20020 loops=1) -> Seq Scan on u (cost=0.00..289.20 rows=20020 width=8) (actual time=0.012..2.970 rows=20020 loops=1) -> Index Scan using t_f on t (cost=0.29..100.44 rows=1252 width=8) (actual time=0.565..1.124 rows=1 loops=20020) Index Cond: (f = u.f) Filter: (u.g = g) Rows Removed by Filter: 5004 Planning Time: 0.188 ms Execution Time: 22521.248 ms we will see that the cost of the index scan is increased from 0.36 to 100.44, which is much more realistic and reflects 2 to 5005 rows ratio. Regards, Oleg
Вложения
В списке pgsql-performance по дате отправления: