Re: CPU costs of random_zipfian in pgbench
От | Tomas Vondra |
---|---|
Тема | Re: CPU costs of random_zipfian in pgbench |
Дата | |
Msg-id | 1d1f5a64-82c1-5b89-dbb7-3d621198f01d@2ndquadrant.com обсуждение исходный текст |
Ответ на | Re: CPU costs of random_zipfian in pgbench (Ants Aasma <ants.aasma@eesti.ee>) |
Ответы |
Re: CPU costs of random_zipfian in pgbench
|
Список | pgsql-hackers |
On 2/22/19 11:22 AM, Ants Aasma wrote: > On Sun, Feb 17, 2019 at 10:52 AM Fabien COELHO <coelho@cri.ensmp.fr > <mailto:coelho@cri.ensmp.fr>> wrote: > > > I'm trying to use random_zipfian() for benchmarking of skewed data > sets, > > and I ran head-first into an issue with rather excessive CPU costs. > > [...] This happens because generalizedHarmonicNumber() does this: > > > > for (i = n; i > 1; i--) > > ans += pow(i, -s); > > > > where n happens to be 1000000000 (range passed to random_zipfian), so > > the loop takes quite a bit of time. > > If you find a better formula for the harmonic number, you are welcome > and probably get your name on it:-) > > > There are pretty good approximations for s > 1.0 using Riemann zeta > function and Euler derived a formula for the s = 1 case. > I believe that's what random_zipfian() already uses, because for s > 1.0 it refers to "Non-Uniform Random Variate Generation" by Luc Devroye, and the text references the zeta function. Also, I have not observed serious issues with the s > 1.0 case (despite the docs seem to suggest there may be some). > I also noticed that i is int in this function, but n is int64. That > seems like an oversight. > Indeed. > Regards, > Ants Aasma > > cheers -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
В списке pgsql-hackers по дате отправления: