Re: Slow query with a lot of data
From | Moritz Onken
---|---
Subject | Re: Slow query with a lot of data
Date |
Msg-id | DDB96E10-947A-4AA8-90D4-770567E1F236@houseofdesign.de
In reply to | Re: Slow query with a lot of data (Moritz Onken <onken@houseofdesign.de>)
List | pgsql-performance
On 21.08.2008, at 09:04, Moritz Onken wrote:
>
> On 20.08.2008, at 20:28, Tom Lane wrote:
>
>> "Scott Carey" <scott@richrelevance.com> writes:
>>> The planner actually thinks there will only be 28704 rows returned
>>> of width 12. But it chooses to sort 53 million rows before
>>> aggregating. That's either a bug or there's something else wrong
>>> here. That is the wrong way to aggregate those results no matter
>>> how much work_mem you have unless I'm completely missing something...
>>
>> That does look weird. What are the datatypes of the columns being
>> grouped by? Maybe they're not hashable?
>>
>> Another forcing function that prevents use of HashAgg is DISTINCT
>> aggregates, but you don't seem to have any in this query...
>>
>> regards, tom lane
>
> The datatypes are both integers. There is no DISTINCT in this query.
> Thanks anyway!

insert into setup1 (
  select a."user", b.category, sum(1.0/b.cat_count)::float
    from result a, domain_categories b
   where a."domain" = b."domain"
     and b.depth < 4
     and a.results > 100
   group by a."user", b.category
);

This query inserted a total of 16,000,000 rows and, with work_mem set to
3000 MB, took about 24 hours. Any more ideas to speed this up?
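As a sketch (reusing the table and column names from the query above, and only standard PostgreSQL planner settings), one way to see which aggregation strategy the planner picks, and whether a hash-based plan is available at all, is to EXPLAIN the bare SELECT with and without sorting enabled:

    -- Show the plan the planner currently chooses for the aggregation.
    explain
    select a."user", b.category, sum(1.0/b.cat_count)::float
      from result a, domain_categories b
     where a."domain" = b."domain"
       and b.depth < 4
       and a.results > 100
     group by a."user", b.category;

    -- If that plan is a GroupAggregate sitting on a huge Sort, check whether
    -- a hash-based plan exists at all (session-local setting, for diagnosis only).
    set enable_sort = off;
    explain
    select a."user", b.category, sum(1.0/b.cat_count)::float
      from result a, domain_categories b
     where a."domain" = b."domain"
       and b.depth < 4
       and a.results > 100
     group by a."user", b.category;
    reset enable_sort;

If the second plan switches to HashAggregate at a plausible cost, the row-count or cost mis-estimate that makes the Sort/GroupAggregate path look cheaper is the thing to chase; enable_sort is only meant to be toggled per session for testing, not as a production fix.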