multiple sampling from tables and saving output
От | David Orme |
---|---|
Тема | multiple sampling from tables and saving output |
Дата | |
Msg-id | 4b417e094092311ce074dffc62e865ad@ic.ac.uk обсуждение исходный текст |
Ответы |
Re: multiple sampling from tables and saving output
|
Список | pgsql-novice |
Hi, I need to perform a statistical bootstrap on data held in a postgresql database and I was wondering if anyone could recommend strategies. The process I need to do is a loop of 1000 repetitions of the following: 1) select a random subset of the data from a table 2) save various summaries of the randomly selected data I can think of various external ways of doing this - my current plan is to use a shell script to resend the same set of instructions repeated times using 'psql -f instruction_set.sql' - but I was wondering if there was a canonical way of doing this within pgsql. I've had a bit of a look at procedural languages but I'm not sure which is best for handling this kind of process. I've included an example of the set of instructions I want to repeat. Cheers, David Orme [running psql 7.3.4 on RHEL 3] -- Select 1096 species subsets select grid_id, species_id into temp random_locs from possible_locations where species_id in (select rand_pick.species_id from (select species_id, random() as random_id from species order by random_id limit 1096) as rand_pick); -- set up tab delimited unaligned \a \f '\t' -- export summary table by grid_id \o curr_gridid.txt -- get a count by grid id including nulls select grid_id, cnt from behr_grid left join ( select grid_id, count(distinct(species_id)) as cnt from random_locs group by grid_id) as loc_count using (grid_id) order by grid_id;
В списке pgsql-novice по дате отправления: