Re: PATCH: adaptive ndistinct estimator v3 (WAS: Re: [PERFORM] Yet another abort-early plan disaster on 9.3)
From | Heikki Linnakangas |
---|---|
Subject | Re: PATCH: adaptive ndistinct estimator v3 (WAS: Re: [PERFORM] Yet another abort-early plan disaster on 9.3) |
Date | |
Msg-id | 549943DA.4090603@vmware.com |
In reply to | PATCH: adaptive ndistinct estimator v3 (WAS: Re: [PERFORM] Yet another abort-early plan disaster on 9.3) (Tomas Vondra <tv@fuzzy.cz>) |
Responses | Re: PATCH: adaptive ndistinct estimator v3 (WAS: Re: [PERFORM] Yet another abort-early plan disaster on 9.3) |
List | pgsql-hackers |
On 12/07/2014 03:54 AM, Tomas Vondra wrote:
> The one interesting case is the 'step skew' with statistics_target=10,
> i.e. estimates based on mere 3000 rows. In that case, the adaptive
> estimator significantly overestimates:
>
>     values   current    adaptive
>     ------------------------------
>        106        99         107
>        106         8     6449190
>       1006        38     6449190
>      10006       327       42441
>
> I don't know why I didn't get these errors in the previous runs, because
> when I repeat the tests with the old patches I get similar results with
> a 'good' result from time to time. Apparently I had a lucky day back
> then :-/
>
> I've been messing with the code for a few hours, and I haven't found any
> significant error in the implementation, so it seems that the estimator
> does not perform terribly well for very small samples (in this case it's
> 3000 rows out of 10.000.000 (i.e. ~0.03%).

The paper [1] gives an equation for an upper bound of the error of this GEE estimator. How do the above numbers compare with that bound?

[1] http://ftp.cse.buffalo.edu/users/azhang/disc/disc01/cd1/out/papers/pods/towardsestimatimosur.pdf

- Heikki
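
For anyone who wants to check the numbers, here is a rough sketch of that comparison in Python. It rests on a few assumptions that are not stated in the thread: that the "values" column is the true number of distinct values, that error is measured as the ratio error max(estimate/actual, actual/estimate), and that the guarantee in [1] is on the order of sqrt(n/r) for a table of n rows and a sample of r rows, with constant factors ignored. The helper names are made up for illustration, so please check the exact form of the bound against the paper before relying on this.

```python
import math


def ratio_error(actual, estimate):
    """Ratio error: max(estimate/actual, actual/estimate), always >= 1."""
    return max(estimate / actual, actual / estimate)


n = 10_000_000  # table size quoted in the message above
r = 3_000       # sample size quoted in the message above

# Assumption: the bound is on the order of sqrt(n/r); constants are ignored.
scale = math.sqrt(n / r)

# (values, current estimate, adaptive estimate) copied from the quoted table.
# Assumption: "values" is the true number of distinct values.
rows = [
    (106, 99, 107),
    (106, 8, 6449190),
    (1006, 38, 6449190),
    (10006, 327, 42441),
]


def compare(rows):
    """Return (actual, current ratio error, adaptive ratio error) per row."""
    return [
        (actual, ratio_error(actual, current), ratio_error(actual, adaptive))
        for actual, current, adaptive in rows
    ]


print(f"sqrt(n/r) = {scale:.1f}")
for actual, cur_err, ada_err in compare(rows):
    print(
        f"{actual:>6}  current={cur_err:10.1f}  adaptive={ada_err:10.1f}  "
        f"adaptive/sqrt(n/r)={ada_err / scale:8.2f}"
    )
```

With these inputs sqrt(n/r) is about 57.7, while the adaptive estimates in the second and third rows are off by factors in the thousands or more. A guarantee of this kind is typically stated in expectation and up to a constant, so a single run exceeding it is a hint to look closer rather than proof of a bug.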