Re: ORDER BY random() LIMIT 1 slowness
From:        Tom Lane
Subject:     Re: ORDER BY random() LIMIT 1 slowness
Msg-id:      4144.1040103853@sss.pgh.pa.us
In reply to: ORDER BY random() LIMIT 1 slowness  ("Gavin M. Roy" <gmr@justsportsusa.com>)
Responses:   Re: ORDER BY random() LIMIT 1 slowness
List:        pgsql-general
"Gavin M. Roy" <gmr@justsportsusa.com> writes: > SELECT * FROM poetry ORDER BY random() LIMIT 1; > [ is slow for 35000 rows ] Yeah. Basically this query is implemented as (a) select all 35000 rows of "poetry"; (b) compute a random() value for each row; (c) sort by the random() values; (d) take the first row, discard the rest. The good news: this gives you a pretty-durn-random selection. The bad news: you didn't really care about choosing a random ordering of the other 34999 rows, but it computed one anyway. This problem's been discussed before, but I've not seen any really ideal answer. SQL is not designed to offer unpredictable results ;-) If you have a reasonably-uniformly-distributed index key on the rows, you can try something like select * from poetry where key > (scale * random() + min) order by key limit 1; (where scale and min are chosen to make the result cover the range of index keys). Given an index on "key" this should pick a result quickly via an indexscan+limit plan. However, if the distribution of keys isn't real uniform then you won't get real random results --- keys just above larger gaps will be selected with unfairly high probability. You also have to know the min and max keys accurately. I recall some folks have suggested generating an actual random column (ie, something initialized with "default random()" and suitably indexed) but this seems like a lot of overhead ... you'd have to be mighty concerned with the exactness of your random selection to put up with that, I think. There are some other suggestions in the archives, IIRC. regards, tom lane