Re: surprising results with random()
От | Steve Atkins |
---|---|
Тема | Re: surprising results with random() |
Дата | |
Msg-id | 5C99FB52-A5C2-47DF-864E-6A2800F06F9E@blighty.com обсуждение исходный текст |
Ответ на | surprising results with random() (Jessi Berkelhammer <jberkelhammer@desc.org>) |
Ответы |
Re: surprising results with random()
|
Список | pgsql-general |
On Feb 23, 2009, at 2:09 PM, Jessi Berkelhammer wrote: > Hi, > > I have a view in which I want to randomly assign values if certain > conditions hold. I was getting surprising results. Here is a (very) > simplified version of the view, which seems to indicate the problem: > > CREATE OR REPLACE VIEW test_view AS > SELECT > CASE > WHEN random() < .3333 THEN '1' > WHEN random() < .3333 THEN '2' > ELSE '3' > END AS test_value > > FROM client ; > > It seems this should generate a random number between 0 and 1, and set > test_value to '1' if this first generated number is less than .3333. > Otherwise, it should generate another random number, and set > test_value > to '2' if this is less than .3333. And if neither of the random > numbers > are less than .3333, it should set test_value to '3'. It seems to me > that there should be a relative even distribution of the 3 values. > > > However when I run this, the values are always similar to what is > below: > > X_test=> select test_value, count(*) from test_view group by 1 > order by 1; > test_value | count > ------------+------- > 1 | 23947 > 2 | 16061 > 3 | 32443 > > Why are there significantly fewer 2s? I understand that random() is > not > truly random, and that the seed affects this value. But it still > confuses me that, no matter how many times I run this, there are > always > so few 2s. If it is generating an independent random number in the > second call to random(), then I don't know why there are more so many > more 1s than 2s. Nope, it's nothing to do with random(), it's that your maths is wrong. There are 9 possible cases. In 3 of them you return 1. In 2 of them you return 2. In the remaining 4 cases you return 3. If you were to run this 72451 times I'd expect to see 1: 24150 = 72451 * 3/9 2: 16100 = 72451 * 2/9 3: 32200 = 72451 * 4/9 Which, unsurprisingly, is fairly close to what you get. Cheers, Steve
В списке pgsql-general по дате отправления: