Fw: Selecting random rows using weights
От | Masaru Sugawara |
---|---|
Тема | Fw: Selecting random rows using weights |
Дата | |
Msg-id | 20020519132333.6BB4.RK73@sea.plala.or.jp обсуждение исходный текст |
Список | pgsql-novice |
On Sun, 12 May 2002 13:24:17 -0600 "G" <gerard@interfold.com> wrote: > > Is there a way to select rows randomly using weight columns? (weighted > random values) > I find out how to get what you want at random; however it's for only a row. If you need ROWS, it's impossible to select them by using these functions at least. Probably, another approach will be need. -- First -- -- An id column must be unique. create table tbl_random(id int4 unique not null, weight int4); insert into tbl_random values(1, 1); insert into tbl_random values(2, 10); insert into tbl_random values(3, 0); insert into tbl_random values(4, 3); ALTER TABLE tbl_random ADD COLUMN r_start int4; ALTER TABLE tbl_random ADD COLUMN r_end int4; CREATE INDEX idx_random_r_start ON tbl_random(r_start); CREATE INDEX idx_random_r_end ON tbl_random(r_end); -- Second -- CREATE OR REPLACE FUNCTION fn_update_random() RETURNS boolean AS ' DECLARE rec RECORD; range int4 :=1; BEGIN FOR rec IN SELECT * FROM tbl_random WHERE weight > 0 LOOP UPDATE tbl_random SET r_start = range, r_end = weight + range - 1 WHERE id = rec.id; SELECT INTO range (r_end + 1) FROM tbl_random WHERE id = rec.id; END LOOP; FOR rec IN SELECT * FROM tbl_random WHERE weight = 0 LOOP UPDATE tbl_random SET r_start = 0, r_end = 0 WHERE id = rec.id; END LOOP; RETURN true; END; ' LANGUAGE 'plpgsql'; -- Third -- -- This query for updating the weighted range of r_start to r_end -- needs to execute after rows are inserted into a target table, -- ones are deleted from it, or the "weight" column in it is updated. SELECT fn_update_random(); -- 4th -- -- A random number(0 to 1) is scaled by the maximum number of r_end, which -- must be put into the subselect not to internally execute the random() -- function twice. And if its scaled number is within the weighted range, -- the unique row having the range will be selected. SELECT t1.*, t2.r FROM tbl_random AS t1, (SELECT ceil(max(r_end) * (SELECT random())) AS r FROM tbl_random) AS t2 WHERE t1.r_start <= t2.r AND t2.r <= t1.r_end; -- Here is a bad example. SELECT t1.* FROM tbl_random AS t1 WHERE (SELECT ceil(max(r_end) * (SELECT random())) FROM tbl_random) BETWEEN t1.r_start AND t1.r_end; Regards, Masaru Sugawara
В списке pgsql-novice по дате отправления: