Обсуждение: Conditional return of aggregated data
Hi,
I have some aggregated statistics which are currently being queried as follows:
create view stats_age as
SELECT a.category,
a.statcount,
b.a_desc
FROM reg_aggregated_stats a,regs_select_age b where a.category='age' and a.datapoint::smallint=b.a_val order by
a.datapointasc;
However, as these relate to event registrations, a suggestion has been made that the database really should be
returningnothing until a certain number of registrations has been met (in order to avoid privacy infringing inferrence
fromwhat should be an otherwise fully irreversibly aggregated dataset).
Now, the queries are wrapped in PL/PGSQL functions anyway, so I could make a second call to Postgres to find out
sum(statcount)and then conditionally return based on that.
But is there a smarter way to do this out of a single SQL query ?
My initial idea was something along the lines of :
select (select sum(statcount) from stats_residence) as aggstat,statcount,short_name_en from stats_residence where
aggstat>some_number;
But as I soon discovered that's not valid syntax! Hence ideas welcome from those smarter than me.
Thanks !
Laura
Hallo Laura,
i don't know if i understand your question fully,
but this might be helpfull?: FILTER
SELECT
count(*) AS unfiltered,
count(*) FILTER (WHERE i < 5) AS filtered
FROM generate_series(1,10) AS s(i);
unfiltered | filtered
------------+----------
10 | 4
(1 row)
https://www.postgresql.org/docs/current/sql-expressions.html
hth,
Wim
Laura Smith schreef op ma 02-12-2019 om 11:10 [+0000]:
> Hi,
>
> I have some aggregated statistics which are currently being queried
> as follows:
>
> create view stats_age as
> SELECT a.category,
> a.statcount,
> b.a_desc
> FROM reg_aggregated_stats a,regs_select_age b where a.category='age'
> and a.datapoint::smallint=b.a_val order by a.datapoint asc;
>
> However, as these relate to event registrations, a suggestion has
> been made that the database really should be returning nothing until
> a certain number of registrations has been met (in order to avoid
> privacy infringing inferrence from what should be an otherwise fully
> irreversibly aggregated dataset).
>
> Now, the queries are wrapped in PL/PGSQL functions anyway, so I could
> make a second call to Postgres to find out sum(statcount) and then
> conditionally return based on that.
>
> But is there a smarter way to do this out of a single SQL query ?
>
> My initial idea was something along the lines of :
> select (select sum(statcount) from stats_residence) as
> aggstat,statcount,short_name_en from stats_residence where
> aggstat>some_number;
>
> But as I soon discovered that's not valid syntax! Hence ideas welcome
> from those smarter than me.
>
> Thanks !
>
> Laura
>
>
--
mvg,
Wim Bertels
--
Lector
UC Leuven-Limburg
--
My only love sprung from my only hate!
Too early seen unknown, and known too late!
-- William Shakespeare, "Romeo and Juliet"
> My initial idea was something along the lines of : > select (select sum(statcount) from stats_residence) as aggstat,statcount,short_name_en from stats_residence where aggstat>some_number; Am I missing something basic. The above can be done using GROUP BY and HAVING clause.
>
> > My initial idea was something along the lines of :
> > select (select sum(statcount) from stats_residence) as aggstat,statcount,short_name_en from stats_residence where
aggstat>some_number;
>
> Am I missing something basic. The above can be done using
> GROUP BY and HAVING clause.
or this
with t as
(select (select sum(statcount) from stats_residence) as aggstat,
statcount,short_name_en from stats_residence
)
select * from t where aggstat > some_number
Apology if I did not understand the question correctly.
> > or this > with t as > (select (select sum(statcount) from stats_residence) as aggstat, > statcount,short_name_en from stats_residence > ) > select * from t where aggstat > some_number > > Apology if I did not understand the question correctly. Hi Ravi, Thanks for helping show me the way. You're quite right, a CTE did the trick Laura
On Mon, 2 Dec 2019 at 12:11, Laura Smith <n5d9xq3ti233xiyif2vp@protonmail.ch> wrote:
My initial idea was something along the lines of :
select (select sum(statcount) from stats_residence) as aggstat,statcount,short_name_en from stats_residence where aggstat>some_number;
One option is to move the aggregate to the where-clause. If you also need the value in your select-list, you can just repeat the subselect there, usually the planner is smart enough to figure out that it can just re-use the result.
--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.
Cut the trees and you'll see there is no forest.