Обсуждение: repeatet summary querys per month over 5 years

Поиск
Список
Период
Сортировка

repeatet summary querys per month over 5 years

От
Дата:
Hi,
I'm using PostgreSQL for a very short time now in combination with Squirrel and I'm very happy as the performance is very good (I use huge Datasets). But now I have a problem:

Background: I have to summarize (count) the number of animals for the categories production type, ageclass and sex per month (1. of every month) over 5 years.

I could produce the query for one month:

SELECT
"AUFENTHALTE"."nuar",
CASE WHEN DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") < 365 THEN '1' WHEN DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") > 730 THEN 3 ELSE 2 END AS AK,
CASE WHEN "AUFENTHALTE"."isweiblich" = 'T' THEN 'female' ELSE 'male' END AS sex,
COUNT("AUFENTHALTE"."tierid")
FROM "TVD_db"."AUFENTHALTE"
WHERE DATE("AUFENTHALTE"."gueltigvon") <= DATE('2007-01-01')
AND DATE("AUFENTHALTE"."gueltigbis") >= DATE('2007-01-01')
GROUP BY "AUFENTHALTE"."nuar",
CASE WHEN DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") < 365 THEN '1' WHEN DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") > 730 THEN 3 ELSE 2 END,
CASE WHEN "AUFENTHALTE"."isweiblich" = 'T' THEN 'female' ELSE 'male' END
ORDER BY "AUFENTHALTE"."nuar",
CASE WHEN DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") < 365 THEN '1' WHEN DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") > 730 THEN 3 ELSE 2 END,
CASE WHEN "AUFENTHALTE"."isweiblich" = 'T' THEN 'female' ELSE 'male' END;


now I should repeat this for every month from january 07 to dez 11. Ideally I'd generate a table with a column for every month:

nuar    ak    sex jan07     feb07    ...     dez11

but I failed to find a way to do it. generating a table with the named columns was no problem, also fill in the first rows, but after the insert statement (e.g for feb07) attached the values below the values generated with the query for january07. now i tried to pack the above statement in a subquery with the intention to do so for every month:

INSERT INTO "TVD_db"."lebendetiere" (nuar,ak ,sex,jan07,feb07)
SELECT
"AUFENTHALTE"."nuar",
case when DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") < 365 then '1' when DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") > 730 then 3 else 2 end as AK,
case when "AUFENTHALTE"."isweiblich" = 'T' then 'female' else 'male' end as sex,
(SELECT COUNT("AUFENTHALTE"."tierid") FROM "TVD_db"."AUFENTHALTE" where DATE("AUFENTHALTE"."gueltigvon") <= DATE('2007-01-01') and DATE("AUFENTHALTE"."gueltigbis") >= DATE('2007-01-01') GROUP BY "AUFENTHALTE"."nuar", case when DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") < 365 then '1' when DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") > 730 then 3 else 2 end, case when "AUFENTHALTE"."isweiblich" = 'T' then 'female' else 'male' end order by "AUFENTHALTE"."nuar",
case when DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") < 365 then '1' when DATE ('2007-01-01')- DATE ("AUFENTHALTE"."gebdat") > 730 then 3 else 2 end,
case when "AUFENTHALTE"."isweiblich" = 'T' then 'female' else 'male' end) as jan07,
(SELECT COUNT("AUFENTHALTE"."tierid") FROM "TVD_db"."AUFENTHALTE" where DATE("AUFENTHALTE"."gueltigvon") <= DATE('2007-02-01') and DATE("AUFENTHALTE"."gueltigbis") >= DATE('2007-02-01') GROUP BY "AUFENTHALTE"."nuar", case when DATE ('2007-02-01')- DATE ("AUFENTHALTE"."gebdat") < 365 then '1' when DATE ('2007-02-01')- DATE ("AUFENTHALTE"."gebdat") > 730 then 3 else 2 end, case when "AUFENTHALTE"."isweiblich" = 'T' then 'female' else 'male' end order by "AUFENTHALTE"."nuar",
case when DATE ('2007-02-01')- DATE ("AUFENTHALTE"."gebdat") < 365 then '1' when DATE ('2007-02-01')- DATE ("AUFENTHALTE"."gebdat") > 730 then 3 else 2 end,
case when "AUFENTHALTE"."isweiblich" = 'T' then 'female' else 'male' end) as feb07
FROM "TVD_db"."AUFENTHALTE";

but it gives me an error: subquery generates more than one line... I want it to generate more than one line but I don't know how to write the code so PostgreSQL accepts it.

Obviously it would be nicer to generate a loop, but I couldn't find out how the syntax works and how to set a date to beginn and ask him to move in steps of 1 month... is that even possible?

If there is no easy solution I'll put the results/month together by hand, but it would be nice to save me some work...

Thank you very much in advance for your answer!

Sincerely
Sara

Re: repeatet summary querys per month over 5 years

От
Rory Campbell-Lange
Дата:
On 15/03/12, sara.schaerrer@vetsuisse.unibe.ch (sara.schaerrer@vetsuisse.unibe.ch) wrote:
> Background: I have to summarize (count) the number of animals for the
> categories production type, ageclass and sex per month (1. of every
> month) over 5 years.

...

> Ideally I'd generate a table with a column for every month:
>
> nuar    ak    sex jan07     feb07    ...     dez11

You may need a query with an outer and inner part. The inner part is a
query that provides you with the base data which is summarised in the
outer part.

I couldn't read your query but if something like this worked:

    SELECT
        animal
        ,sex
        ,tochar(dater,'YYDD') as month
    FROM
        sourcetable;

you can nest the query to get the sort of result you seem to need, as
follows:

   SELECT
       x.animal
       ,x.sex
       ,sum(case when x.month='012001' then 1 else 0 end) as '012001'
       ,sum(case when x.month='022001' then 1 else 0 end) as '022001'
       ,sum(case when x.month='032001' then 1 else 0 end) as '032001'
       ...
   FROM (
        SELECT
            animal
            ,sex
            ,tochar(dater,'MMYY') as month
        FROM
            sourcetable;
        ) x
   GROUP BY
       x.animal
       ,x.sex;

--
Rory Campbell-Lange
rory@campbell-lange.net

Campbell-Lange Workshop
www.campbell-lange.net
0207 6311 555
3 Tottenham Street London W1T 2AF
Registered in England No. 04551928

Re: repeatet summary querys per month over 5 years

От
Bartosz Dmytrak
Дата:
Hi,
it looks like tablefunc module (extension) could help

this could generate crosstab to summarize Your data. There are few good examples of corsstab function which I think is a solution.

Regards,
Bartek


2012/3/15 Rory Campbell-Lange <rory@campbell-lange.net>
On 15/03/12, sara.schaerrer@vetsuisse.unibe.ch (sara.schaerrer@vetsuisse.unibe.ch) wrote:
> Background: I have to summarize (count) the number of animals for the
> categories production type, ageclass and sex per month (1. of every
> month) over 5 years.

...

> Ideally I'd generate a table with a column for every month:
>
> nuar    ak    sex jan07     feb07    ...     dez11

You may need a query with an outer and inner part. The inner part is a
query that provides you with the base data which is summarised in the
outer part.

I couldn't read your query but if something like this worked:

   SELECT
       animal
       ,sex
       ,tochar(dater,'YYDD') as month
   FROM
       sourcetable;

you can nest the query to get the sort of result you seem to need, as
follows:

  SELECT
      x.animal
      ,x.sex
      ,sum(case when x.month='012001' then 1 else 0 end) as '012001'
      ,sum(case when x.month='022001' then 1 else 0 end) as '022001'
      ,sum(case when x.month='032001' then 1 else 0 end) as '032001'
      ...
  FROM (
       SELECT
           animal
           ,sex
           ,tochar(dater,'MMYY') as month
       FROM
           sourcetable;
       ) x
  GROUP BY
      x.animal
      ,x.sex;

--
Rory Campbell-Lange
rory@campbell-lange.net

Campbell-Lange Workshop
www.campbell-lange.net
0207 6311 555
3 Tottenham Street London W1T 2AF
Registered in England No. 04551928

--
Sent via pgsql-novice mailing list (pgsql-novice@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-novice

Re: repeatet summary querys per month over 5 years

От
Дата:
Thanks a lot for your answer, it looks promising. But I still have a problem. I'll try to give you more details in amore comprehensive way.

The sourcetable contains informations about every stay of each animal on a specific farm:

animalID animalbirthday animalsex farmID farmproductiontype  beginn(Date) end(date)

to get the number of animals in different ageclasses per month, I have to calculate the age of the animals for every month (e.g. 2001-01-01 - birthday) and categorize the ageclasses and select the animals that are alive (i.e. stay on a farm) on the first for every month  (begin <= 2001-01-01 and end >= 2001-01-01).

To do so the query I got looks like

SELECT
prductiontype,
animalsex,
CASE WHEN (2007-01-01)-animalbirhtday < 365 THEN '1' WHEN (2007-01-01)-animalbirhtday > 730 THEN '3' ELSE '2' END AS ageclass,
COUNT(animalID),
FROM sourcetable
WHERE begin <= 2007-01-01 and end >= 2007-01-01
GROUP BY farmproductiontype, animalsex, ageclass
ORDER BY farmproductiontype, animalsex, ageclass

wich produces me a table like

farmrpoductiontype animalsex ageclasse count
1                           female       1            10000
1                           male          1            10000


etc.... now I have to do this 60 times with reference date (here 2007-01-01) as the only thing changing. I'd like to automatize that and just ad each time a column to the table above

farmrpoductiontype animalsex ageclasse jan07      feb07  ....
1                           female       1            10000    10003
1                           male          1            10000    9000


In R I'd use a loop with for i=1:60  with begindat = 2011-07-01  and steps=month etc but I don't know if one can generate loops in postgresql with a variable in the query changing (I think not, right?)

I see the way I could use your solution assigning categories per month (e.g. CASE WHEN begin <= 2007-01-01 and end >= 2007-01-01 THEN Jan07 WHEN begin <= 2007-02-01 and end >= 2007-02-01 THEN feb07 etc)  in the inner query but this would multiplie the table as the animals stay usualy longer than one month on a farm. As also the ageclasses change for every month I think it gets to complicatet...

In a book about SQL (Alan Beaulieu, Introduction to SQL in German) I found, that its possible to use subqueries to generate expressions like

SELECT
x
(SELECT y FROM tabley) y

FROM table x;

but in postgress I get the error statement that the subquery produces more than 1 row... (this would be the other obtion, create 60 subqueries to produce the final table in one step).

In the end I guess I loose less time doing a lot by hand than trying to understand the possibilities of SQL by learning by doing... but oviously if there is a solution I'm very happy to learn, especially as I have to generate other summaries over the 60 months, this is just the first one...

I hope I could explain better and again thanks a lot!

Sara



________________________________________
Von: Rory Campbell-Lange [rory@campbell-lange.net]
Gesendet: Donnerstag, 15. März 2012 19:17
An: Schaerrer, Sara (VETSUISSE)
Cc: pgsql-novice@postgresql.org
Betreff: Re: [NOVICE] repeatet summary querys per month over 5 years

On 15/03/12, sara.schaerrer@vetsuisse.unibe.ch (sara.schaerrer@vetsuisse.unibe.ch) wrote:
> Background: I have to summarize (count) the number of animals for the
> categories production type, ageclass and sex per month (1. of every
> month) over 5 years.

...

> Ideally I'd generate a table with a column for every month:
>
> nuar ak sex jan07 feb07 ... dez11

You may need a query with an outer and inner part. The inner part is a
query that provides you with the base data which is summarised in the
outer part.

I couldn't read your query but if something like this worked:

SELECT
animal
,sex
,tochar(dater,'YYDD') as month
FROM
sourcetable;

you can nest the query to get the sort of result you seem to need, as
follows:

SELECT
x.animal
,x.sex
,sum(case when x.month='012001' then 1 else 0 end) as '012001'
,sum(case when x.month='022001' then 1 else 0 end) as '022001'
,sum(case when x.month='032001' then 1 else 0 end) as '032001'
...
FROM (
SELECT
animal
,sex
,tochar(dater,'MMYY') as month
FROM
sourcetable;
) x
GROUP BY
x.animal
,x.sex;

--
Rory Campbell-Lange
rory@campbell-lange.net

Campbell-Lange Workshop
www.campbell-lange.net
0207 6311 555
3 Tottenham Street London W1T 2AF
Registered in England No. 04551928