Re: Selecting random rows efficiently
От | Tom Lane |
---|---|
Тема | Re: Selecting random rows efficiently |
Дата | |
Msg-id | 25757.1062256444@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: Selecting random rows efficiently (Tom Lane <tgl@sss.pgh.pa.us>) |
Список | pgsql-performance |
I said: > 3. Your query now looks like > SELECT * FROM table WHERE random_id >= random() > ORDER BY random_id LIMIT 1; Correction: the above won't give quite the right query because random() is marked as a volatile function. You can hide the random() call inside a user-defined function that you (misleadingly) mark stable, or you can just stick it into a sub-select: regression=# explain select * from foo WHERE random_id >= (select random()) regression-# ORDER BY random_id LIMIT 1; QUERY PLAN ------------------------------------------------------------------------- Limit (cost=0.01..0.15 rows=1 width=8) InitPlan -> Result (cost=0.00..0.01 rows=1 width=0) -> Index Scan using fooi on foo (cost=0.00..45.50 rows=334 width=8) Index Cond: (random_id >= $0) (5 rows) This technique is probably safer against future planner changes, however: regression=# create function oneshot_random() returns float8 as regression-# 'select random()' language sql stable; CREATE FUNCTION regression=# explain select * from foo WHERE random_id >= oneshot_random() regression-# ORDER BY random_id LIMIT 1; QUERY PLAN ------------------------------------------------------------------------- Limit (cost=0.00..0.14 rows=1 width=8) -> Index Scan using fooi on foo (cost=0.00..46.33 rows=334 width=8) Index Cond: (random_id >= oneshot_random()) (3 rows) The point here is that an indexscan boundary condition has to use stable or immutable functions. By marking oneshot_random() stable, you essentially say that it's okay to evaluate it only once per query, rather than once at each row. regards, tom lane
В списке pgsql-performance по дате отправления: