Re: Select random lines of a table using a probability distribution
From | ktm@rice.edu |
---|---|
Subject | Re: Select random lines of a table using a probability distribution |
Date | |
Msg-id | 20110713135810.GA1874@staff-mud-56-27.rice.edu |
In reply to | Select random lines of a table using a probability distribution ("Jira, Marcel" <Marcel.Jira@wu.ac.at>) |
List | pgsql-sql |
On Wed, Jul 13, 2011 at 03:27:10PM +0200, Jira, Marcel wrote:
> Hi!
>
> Let's consider I have a table like this
>
> id qualification gender age income
>
> I'd like to select (for example 100) lines of this table at random, but the random mechanism has to follow a certain probability distribution.
>
> I want to use this procedure to construct a test group for another selection.
>
> Example:
>
> I filter all lines having the qualification "plumber".
> I get 50 different ids consisting of 40 males, 10 females and a certain age distribution.
>
> I also get some information concerning the income of the plumbers.
>
> Now I want to know if the income is more influenced by the gender and age distribution or by the qualification "plumber".
>
> Therefore I would like to select a test group (of 50 or more) without any plumbers. This test group has to follow the same age and gender distribution.
>
> Then I would be able to compare this group's income statistics with the plumbers' income statistics.
>
> Is this possible (and doable with reasonable effort) in PostgreSQL?
>
> Thank you in advance.
>
> Best regards,
>
> Marcel Jira
>

You may want to take a look at PL/R, which makes the R system available to PostgreSQL as a function language.

Regards,
Ken
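
For readers who want to see the sampling idea itself, here is a minimal Python sketch of a matched control group. It is not the PL/R approach suggested above and not anything from the original thread; it only illustrates drawing non-plumbers so that the gender and age-band counts mirror those of the plumbers. The function names, the 10-year age band, and the dict-per-row layout are assumptions made for illustration; the column names follow the table described in the question.

```python
import random
from collections import Counter, defaultdict


def stratum(row, age_band=10):
    """Hypothetical stratum key: gender plus age rounded down to a band."""
    return (row["gender"], row["age"] // age_band * age_band)


def matched_control_sample(rows, qualification, seed=None):
    """Draw rows WITHOUT the given qualification whose gender/age-band
    counts follow those of the rows WITH it (assumed stratification)."""
    rng = random.Random(seed)

    # Target group, e.g. all plumbers.
    target = [r for r in rows if r["qualification"] == qualification]

    # Candidate pool: everyone else, grouped by stratum.
    pool = defaultdict(list)
    for r in rows:
        if r["qualification"] != qualification:
            pool[stratum(r)].append(r)

    # How many rows the target group has in each stratum.
    wanted = Counter(stratum(r) for r in target)

    sample = []
    for key, n in wanted.items():
        candidates = pool.get(key, [])
        # If a stratum has too few candidates, take all of them
        # instead of failing; the match is then only approximate.
        sample.extend(rng.sample(candidates, min(n, len(candidates))))
    return sample
```

Calling `matched_control_sample(rows, "plumber")` on rows loaded from the table would return the test group, whose income statistics can then be compared with the plumbers'. To get a larger test group with the same proportions, one could multiply each stratum count by a constant factor, provided the pool is large enough.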