Re: selecting random row values in postgres
От | Geoff Tolley |
---|---|
Тема | Re: selecting random row values in postgres |
Дата | |
Msg-id | 45DFB208.1010201@polimetrix.com обсуждение исходный текст |
Ответ на | Re: selecting random row values in postgres (Tommy Gildseth <tommy@gildseth.com>) |
Список | pgsql-sql |
Tommy Gildseth wrote: > Sumeet wrote: >> Thanks Buddy, really appreciate ur help on this >> >> ....problem solved... >> >> Is there any way this query can be optimized...i'm running it on a >> huge table with joins > > ORDER BY rand() is rather slow on large datasets, since the db has to > actually generate a random value for each row in the table, before being > able use it to sort by. Preferable ways to do this include f.ex: > SELECT max(id) FROM table; > SELECT * FROM table WHERE id > [rand value between 0 and max(id)] ORDER > BY id LIMIT 1; I'd have thought that in most cases the slowness would be due more to the sort than the cost of generating the random numbers. Your pair of queries neatly avoid doing the sort, but it does raise the question, does it matter if not all rows have the same chance of being picked? As an extreme example in the above, if there are id's 1-100 and id = 100000 in the table, the last one will almost always be picked. I wonder if the random aggregate that Josh Berkus wrote could be adapted to return a record type rather than a random value of a single column? Its big advantage is that it requires no sorting and all entries are equiprobable. Unfortunately I'm a bit short of time at this end of a Friday to do much but give you the URL: http://www.powerpostgresql.com/Random_Aggregate HTH, Geoff
В списке pgsql-sql по дате отправления: