Re: index for group by
От | Steve Midgley |
---|---|
Тема | Re: index for group by |
Дата | |
Msg-id | 20080722211730.865606502BA@postgresql.org обсуждение исходный текст |
Ответ на | index for group by (Patrick Scharrenberg <pittipatti@web.de>) |
Список | pgsql-sql |
At 09:20 AM 7/22/2008, pgsql-sql-owner@postgresql.org wrote: >Date: Tue, 22 Jul 2008 13:27:24 +0200 >From: "A. Kretschmer" <andreas.kretschmer@schollglas.com> >To: pgsql-sql@postgresql.org >Subject: Re: index for group by >Message-ID: <20080722112724.GD2742@a-kretschmer.de> > >am Tue, dem 22.07.2008, um 13:18:30 +0200 mailte Patrick Scharrenberg >folgendes: > > Hi, > > > > is there a way to speedup "group by" queries with an index? > > > > In particular if I have a table like this: > > > > CREATE TABLE data > > ( > > id1 integer, > > id2 integer, > > somedata character varying, > > ts timestamp with time zone > > ); > > > > where continously data is logged about "id1" and "id2" into > "somedata", > > together with the timestamp when it was logged. > > > > So I have multiple rows with the same id1 and id2 but different > > timestamp (and data maybe). > > > > At the moment I have ~40.000.000 rows in that table so doing a > > > > SELECT id1, id2 FROM data GROUP BY id1, id2; > > >without a where-clause every select forces a seq-scan. First, why are you doing a group by when you aren't doing an aggregation (like COUNT, SUM, etc)? It seems like you can get way better performance by doing this: SELECT DISTINCT ON (id1, id2) id1, id2 FROM data ORDER BY id1, id2 (Assuming your compound index is in "id1,id2" order). Am I missing something? A different more cumbersome idea I have for you (if you really do need a GROUP BY) is to build a warehouse table that precalculates the data you want. You can build some recurring process that runs every NN minutes or hours and fires off a stored procedure which grabs all the data from this "data" table, aggregates it and saves it to warehouse table. You could aggregate against your datetime stamp by N hours or days as well. If this idea is of interest you can write back to the list or off-list to me for more info. Steve
В списке pgsql-sql по дате отправления: