query optimization: aggregate and distinct
От | Jeff Davis |
---|---|
Тема | query optimization: aggregate and distinct |
Дата | |
Msg-id | 200308201626.26310.jdavis-pgsql@empires.org обсуждение исходный текст |
Ответы |
Re: query optimization: aggregate and distinct
Re: query optimization: aggregate and distinct |
Список | pgsql-general |
I have below a simplified version of what I'm trying to do. Basically, I am trying to get both an aggregate (an average) and "most recent" value. g | v | ts ---+----+---------------------------- 1 | 10 | 2003-08-20 16:00:27.010769 1 | 20 | 2003-08-20 16:00:30.380476 2 | 40 | 2003-08-20 16:00:37.399717 2 | 80 | 2003-08-20 16:00:40.265717 I would like, as output, something like this: g | v | avg | ts ---+----+--------------------+---------------------------- 1 | 20 | 15.000000000000000 | 2003-08-20 16:00:30.380476 2 | 80 | 60.000000000000000 | 2003-08-20 16:00:40.265717 which I got by a query like: SELECT t2.g,t2.v,t1.avg,t2.ts FROM (SELECT g,avg(v) FROM t GROUP BY g ) t1, (SELECT DISTINCT ON (g) * FROM t ORDER BY g,ts DESC ) t2 WHERE t1.g = t2.g; That produces the results that I need, but it seems inefficient to join a table with itself like that. My real query (not this simplified example) takes 5+ seconds and I suspect this join is why. Is there a better way? For my real query, it's using index scans where I'd expect, and I frequently VACUUM ANALYZE the big table and I have all the stats turned on. Also, I have more shared buffers than needed to put everything in RAM. Right now I'm using 7.2.1. Any improvements in 7.3 or 7.4 that would help this issue? Regards, Jeff Davis
В списке pgsql-general по дате отправления: