Re: random() generates collisions too early
От | Honza Horak |
---|---|
Тема | Re: random() generates collisions too early |
Дата | |
Msg-id | 5266726E.5030300@redhat.com обсуждение исходный текст |
Ответ на | Re: random() generates collisions too early (Tom Lane <tgl@sss.pgh.pa.us>) |
Ответы |
Re: random() generates collisions too early
|
Список | pgsql-bugs |
On 10/21/2013 05:14 PM, Tom Lane wrote: > Heikki Linnakangas <hlinnakangas@vmware.com> writes: >> On 18.10.2013 14:55, Honza Horak wrote: >>> The results covered only 181383 distinct values, and 68 values >>> repeated four or five times each. We should at least consider using a >>> higher-entropy seed. > >> Interesting. PostgreSQL's random() function just calls the underlying >> libc random() function. I assume you tested this on with Linux and glibc. > > I agree with the theory that this probably isn't the fault of the random() > function as such, but with our code to reset the random seed when forking > a postmaster child process. Note that the test case is only examining the > first random value created in each process. So basically what this is > measuring is the number of different seed values we use. > > regards, tom lane > After playing a bit more with random() value and the seed defined in src/backend/postmaster/postmaster.c:4038 it seems really like the seed doesn't have very good characteristic. The PID number used there ensures the value to be different in child processes, but when only xor-ed with usecs, it doesn't enlarge the total data domain of the seed, which stays 1M values. Then having in mind the birthday paradox and probably not ideal PRNG, it doesn't seem unrealistic to get two same random numbers in only hundreds/thousands of tries. In order to enlarge possible data domain of the seed we can include sec count as well and shift some xor-ed variables to use the whole range or unsigned int. The attached patch uses a way that gave me much better results (collision found approx. after a hundred thousands of values): - srandom((unsigned int) (MyProcPid ^ usecs)); + srandom((unsigned int) (MyProcPid ^ (usecs << 12) ^ secs)); I'm also thinking of a reproducer -- does the test-suite allow to "reconnect" the client during a test? Regards, Honza
Вложения
В списке pgsql-bugs по дате отправления: