Alvaro Herrera <alvherre@2ndquadrant.com> writes:
> Tom Lane wrote:
>> Presumably what is happening is that the planner is switching from hash
>> to sort aggregation.
> I can't imagine that the server is avoiding hash aggregation on a 1MB
> work_mem limit for data that's a few dozen bytes. Is it really doing
> that?
Yup:
regression=# explain SELECT v,h, string_agg(i::text, E'\n') AS i FROM ctv_data
GROUP BY v, h ORDER BY h,v;
                               QUERY PLAN
------------------------------------------------------------------------
 Sort  (cost=33.87..34.37 rows=200 width=96)
   Sort Key: h, v
   ->  HashAggregate  (cost=23.73..26.23 rows=200 width=96)
         Group Key: h, v
         ->  Seq Scan on ctv_data  (cost=0.00..16.10 rows=610 width=68)
(5 rows)

regression=# set work_mem = '1MB';
SET
regression=# explain SELECT v,h, string_agg(i::text, E'\n') AS i FROM ctv_data
GROUP BY v, h ORDER BY h,v;
                               QUERY PLAN
------------------------------------------------------------------------
 GroupAggregate  (cost=44.32..55.97 rows=200 width=96)
   Group Key: h, v
   ->  Sort  (cost=44.32..45.85 rows=610 width=68)
         Sort Key: h, v
         ->  Seq Scan on ctv_data  (cost=0.00..16.10 rows=610 width=68)
(5 rows)

Now that you mention it, this does seem a bit odd, although I remember
that there's a pretty substantial fudge factor in there when we have
no statistics (which we don't in this example). If I ANALYZE ctv_data
then it sticks to the hashagg plan all the way down to 64kB work_mem.
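
For anyone wanting to repeat that last check, the sequence would be roughly
as follows. This is only a sketch: it assumes the same ctv_data table is
present in the current database, and the resulting plan is not shown here
because it was not pasted above.

```sql
-- Collect statistics so the planner no longer relies on default estimates.
ANALYZE ctv_data;

-- Drop work_mem to the lowest setting mentioned above.
SET work_mem = '64kB';

-- Per the observation above, this should still choose HashAggregate.
EXPLAIN SELECT v, h, string_agg(i::text, E'\n') AS i FROM ctv_data
GROUP BY v, h ORDER BY h, v;
```
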
regards, tom lane