Re: PATCH: decreasing memory needlessly consumed by array_agg
| From | Tom Lane |
|---|---|
| Subject | Re: PATCH: decreasing memory needlessly consumed by array_agg |
| Date | |
| Msg-id | 20707.1396372100@sss.pgh.pa.us |
| In reply to | Re: PATCH: decreasing memory needlessly consumed by array_agg (Tomas Vondra <tv@fuzzy.cz>) |
| Responses | Re: PATCH: decreasing memory needlessly consumed by array_agg; Re: PATCH: decreasing memory needlessly consumed by array_agg |
| List | pgsql-hackers |
Tomas Vondra <tv@fuzzy.cz> writes:

> I've been thinking about it a bit more and maybe the doubling is not
> that bad idea, after all.

It is not. There's a reason why that's our standard behavior.

> The "current" array_agg however violates some of the assumptions
> mentioned above, because it
>
> (1) pre-allocates quite large number of items (64) at the beginning,
>     resulting in ~98% of memory being "wasted" initially
>
> (2) allocates one memory context per group, with 8kB initial size, so
>     you're actually wasting ~99.999% of the memory
>
> (3) thanks to the dedicated memory contexts, the doubling is pretty
>     much pointless up until you cross the 8kB boundary
>
> IMNSHO these are the issues we really should fix - by lowering the
> initial element count (64->4) and using a single memory context.

The real issue here is that all those decisions are perfectly reasonable
if you expect that a large number of values will get aggregated --- and
even if you don't expect that, they're cheap insurance in simple cases.
It only gets to be a problem if you have a lot of concurrent executions
of array_agg, such as in a grouped-aggregate query.

You're essentially arguing that in the grouped-aggregate case, it's
better to optimize on the assumption that only a very small number of
values will get aggregated (per hash table entry) --- which is possibly
reasonable, but the argument that it's okay to pessimize the behavior
for other cases seems pretty flimsy from here.

Actually, though, the patch as given outright breaks things for both the
grouped and ungrouped cases, because the aggregate no longer releases
memory when it's done. That's going to result in memory bloat, not
savings, in any situation where the aggregate is executed repeatedly.

I think a patch that stood a chance of getting committed would need to
detect whether the aggregate was being called in simple or grouped
contexts, and apply different behaviors in the two cases. And you can't
just remove the sub-context without providing some substitute cleanup
mechanism. Possibly you could keep the context but give it some
much-more-miserly allocation parameters in the grouped case.

			regards, tom lane
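For readers following along in the source: the per-group context under discussion is the one `accumArrayResult()` creates with PostgreSQL's default allocation-set parameters (8 kB initial block), which is what makes each hash-table entry so expensive. A minimal sketch of the "much-more-miserly allocation parameters" idea might look like the following. The `make_accum_context()` helper and its `is_grouped` flag are hypothetical placeholders for whatever detection mechanism an actual patch would add; the `ALLOCSET_SMALL_*` constants are PostgreSQL's stock small settings (1 kB initial block, 8 kB max block), and this is not the behavior that was ultimately committed.

```c
#include "postgres.h"
#include "utils/memutils.h"

/*
 * Sketch only: keep a per-group memory context, so the existing cleanup
 * path still releases memory when the aggregate finishes, but create it
 * with the small allocation-set parameters when the aggregate is running
 * under a grouped plan.  "is_grouped" stands in for whatever detection a
 * real patch would use.
 */
static MemoryContext
make_accum_context(MemoryContext rcontext, bool is_grouped)
{
	if (is_grouped)
		return AllocSetContextCreate(rcontext,
									 "accumArrayResult",
									 ALLOCSET_SMALL_MINSIZE,
									 ALLOCSET_SMALL_INITSIZE,
									 ALLOCSET_SMALL_MAXSIZE);
	else
		return AllocSetContextCreate(rcontext,
									 "accumArrayResult",
									 ALLOCSET_DEFAULT_MINSIZE,
									 ALLOCSET_DEFAULT_INITSIZE,
									 ALLOCSET_DEFAULT_MAXSIZE);
}
```

With the small parameters the first block is 1 kB rather than 8 kB, so a hash aggregate with many mostly-small groups pays far less per-group overhead, while the context can still be deleted (and all its memory returned) exactly as before.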