I was wondering if there was any way to improve the performance of this query:
SELECT station,array_agg(distinct(channel)) as channels FROM data GROUP BY station;
If you have tables of possible stations and channels (and if not, why not?), then an EXISTS query, something like
So that one ran in about 5 minutes as well - apparently the time it takes to scan the index, given the similarity of run times for each of the different queries:
https://explain.depesz.com/s/w46h
SELECT
stations.name,
array_agg(channels.channel)
FROM stations,channels
WHERE EXISTS (SELECT
FROM data
GROUP BY stations.name
It’s only when I add in the AND data.channels=channels.channel that the query time blows up to 5+ minutes. I personally don’t understand why there would be such a large difference between the two queries - something wrong with my indexes?
---
Israel Brewster
Software Engineer
Alaska Volcano Observatory
Geophysical Institute - UAF
2156 Koyukuk Drive
Fairbanks AK 99775-7320
Work: 907-474-5172
cell: 907-328-9145