Re: Strategy for doing number-crunching
От | János Löbb |
---|---|
Тема | Re: Strategy for doing number-crunching |
Дата | |
Msg-id | 8B6378B2-E819-421B-BC46-2A2AA3DC9CB3@yale.edu обсуждение исходный текст |
Ответ на | Strategy for doing number-crunching (Matthew Foster <matthew.foster@noaa.gov>) |
Список | pgsql-novice |
On Jan 4, 2012, at 11:36 AM, Matthew Foster wrote: > We have a database with approximately 130M rows, and we need to produce statistics (e.g. mean, standard deviation, etc.)on the data. Right now, we're generating these stats via a single SELECT, and it is extremely slow...like it can takehours to return results. The queries are variable, so there is no way I can think of to "pre-calculate" the stats. > > Here is an example query... > > SELECT count(mae) AS count, to_char(avg(mae), 'FM990.999') AS mae, > to_char(avg(sqrt(rms)), 'FM990.999') AS rms, to_char(avg(bias), 'FM990.999') AS bias, > to_char(max(mae), 'FM990.999') AS mae_max, to_char(min(mae), 'FM990.999') AS mae_min, > to_char(max(bias), 'FM990.999') AS bias_max, to_char(min(bias), 'FM990.999') AS bias_min, > to_char(max(sqrt(rms)), 'FM990.999') AS rms_max, to_char(min(sqrt(rms)), 'FM990.999') AS rms_min, > to_char(stddev(sqrt(rms)), 'FM990.999') AS rms_stddev, > to_char(stddev(mae), 'FM990.999') AS mae_stddev, > to_char(stddev(bias), 'FM990.999') AS bias_stddev > FROM verify_data.verification_data > WHERE model_name='foo' > AND...several other conditions... > > As one might imagine, this query pounds the CPU relentlessly for hours. > > I'm thinking there must be a better approach for doing large amounts of calculations on the data. Any pointers would begreatly appreciated! > > Matt > Make sure you have to right indexes.
В списке pgsql-novice по дате отправления: