Re: multivariate statistics v14
From | Tatsuo Ishii |
---|---|
Subject | Re: multivariate statistics v14 |
Date | |
Msg-id | 20160316.112907.1269707811749756579.t-ishii@sraoss.co.jp |
In reply to | Re: multivariate statistics v14 (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
List | pgsql-hackers |
> Instead of simply multiplying the ndistinct estimate with selecticity,
> we instead use the formula for the expected number of distinct values
> observed in 'k' rows when there are 'd' distinct values in the bin
>
>     d * (1 - ((d - 1) / d)^k)
>
> This is 'with replacements' which seems appropriate for the use, and it
> mostly assumes uniform distribution of the distinct values. So if the
> distribution is not uniform (e.g. there are very frequent groups) this
> may be less accurate than the current algorithm in some cases, giving
> over-estimates. But that's probably better than OOM.
> ---
>  src/backend/utils/adt/selfuncs.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
> index f8d39aa..6eceedf 100644
> --- a/src/backend/utils/adt/selfuncs.c
> +++ b/src/backend/utils/adt/selfuncs.c
> @@ -3466,7 +3466,7 @@ estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows,
>  	/*
>  	 * Multiply by restriction selectivity.
>  	 */
> -	reldistinct *= rel->rows / rel->tuples;
> +	reldistinct = reldistinct * (1 - powl((reldistinct - 1) / reldistinct,rel->rows));

Why do you change "*=" style? I see no reason to change this.

	reldistinct *= 1 - powl((reldistinct - 1) / reldistinct, rel->rows);

Looks better to me because it's shorter and cleaner.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp