Re: improving GROUP BY estimation
From | Alexander Korotkov |
---|---|
Subject | Re: improving GROUP BY estimation |
Date | |
Msg-id | CAPpHfdtEiXqj52kuGMEnw6EShrzj-0tFYwwxPqzdFvxYfCr+tQ@mail.gmail.com |
In reply to | Re: improving GROUP BY estimation (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses | Re: improving GROUP BY estimation |
List | pgsql-hackers |
Hi, Tomas!
I've been assigned to review this patch.
I've checked the estimate-num-groups-v2.txt version by Mark Dilger.
It applies cleanly to head and passes the corrected regression tests.
About the correlated/uncorrelated cases: I think the optimizer currently assumes distributions to be independent in most places.
I think we should follow this assumption here as well, until we have a fundamentally better option (like your multivariate statistics).
@@ -3438,9 +3438,9 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
reldistinct = clamp;
/*
- * Multiply by restriction selectivity.
+ * Estimate number of distinct values expected in given number of rows.
*/
- reldistinct *= rel->rows / rel->tuples;
+ reldistinct *= (1 - powl((reldistinct - 1) / reldistinct, rel->rows));
/*
* Update estimate of total distinct groups.
I think we need much more explanation in the comments here (possibly with external links). For now, it looks like a formula that appears out of nowhere.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company