Re: Ordering of data on calls to user defined aggregate.
От | Tom Lane |
---|---|
Тема | Re: Ordering of data on calls to user defined aggregate. |
Дата | |
Msg-id | 24339.1021758973@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Ordering of data on calls to user defined aggregate. (Tim Hart <timjhart@shaw.ca>) |
Ответы |
Re: Ordering of data on calls to user defined aggregate.
|
Список | pgsql-general |
Tim Hart <timjhart@shaw.ca> writes: > Short version ( for the busy folk). > Is there any guarantee of the ordering of data on calls to a user > defined aggregate. (postgresql 7.2.1) None whatever. > So I tried a query like this: > select fk, concat_with_and(name) from ( select fk, name from foo order > by fk, name) sub_select group by fk; The reason this isn't very reliable can be seen by looking at what it does. In existing releases you have to grovel through EXPLAIN VERBOSE output to see what the sort keys are, but CVS tip is friendlier: srf=# explain srf-# select fk, my_agg(name) from ( select fk, name from foo order srf(# by fk, name) sub_select group by fk; QUERY PLAN -------------------------------------------------------------------------------------- Aggregate (cost=122.16..129.66 rows=100 width=40) -> Group (cost=122.16..127.16 rows=1000 width=40) -> Sort (cost=122.16..124.66 rows=1000 width=40) Sort Key: fk -> Subquery Scan sub_select (cost=69.83..72.33 rows=1000 width=40) -> Sort (cost=69.83..72.33 rows=1000 width=40) Sort Key: fk, name -> Seq Scan on foo (cost=0.00..20.00 rows=1000 width=40) (8 rows) The outer query has no idea that the inner query's output is already sorted, so it re-sorts ... using only the specified GROUP BY key. Unless your system's qsort is stable (which most are not), this will not preserve the by-name ordering within groups of the same fk value. If you weren't grouping at the outer level then the inner query's sort order would get the job done for you. In the presence of outer grouping I'm not sure there's a clean way to do it. I can think of a hack for the case where fk/name pairs are unique: select fk, my_agg(DISTINCT name) from foo group by fk; This relies on the assumption that the aggregate code will implement DISTINCT by means of sort/unique processing, which seems unlikely to break anytime soon. But it won't help if you want the aggregate to see multiple values of the same name for the same fk. There is some talk of reimplementing grouped aggregation using hash tables, which'd eliminate the upper SORT step and thereby give the behavior you want. I dunno how soon anyone will get around to it though. regards, tom lane
В списке pgsql-general по дате отправления: