Re: [HACKERS] aggregation memory leak and fix
From | Tom Lane |
---|---|
Subject | Re: [HACKERS] aggregation memory leak and fix |
Date | |
Msg-id | 28066.921948538@sss.pgh.pa.us |
In response to | Re: [HACKERS] aggregation memory leak and fix (Bruce Momjian <maillist@candle.pha.pa.us>) |
Responses | Re: [HACKERS] aggregation memory leak and fix |
List | pgsql-hackers |
Bruce Momjian <maillist@candle.pha.pa.us> writes:
> My only quick solution would seem to be to add a new "expression" memory
> context, that can be cleared after every tuple is processed, clearing
> out temporary values allocated inside an expression.

Right, this whole problem of growing backend memory use during a large SELECT (or COPY, or probably a few other things) is one of the things that we were talking about addressing by revising the memory management structure.

I think what we want inside the executor is a distinction between storage that must live to the end of the statement and storage that is only needed while processing the current tuple. The second kind of storage would go into a separate context that gets flushed every so often. (It could be every tuple, or every dozen or hundred tuples depending on what seems the best tradeoff of cycles against memory usage.)

I'm not sure that just two contexts is enough, either. For example in

    SELECT field1, SUM(field2) GROUP BY field1;

the working memory for the SUM aggregate could not be released after each tuple, but perhaps we don't want it to live for the whole statement either --- in that case we'd need a per-group context. (This particular example isn't very convincing, because the same storage for the SUM *could* be recycled from group to group. But I don't know whether it actually *is* reused or not. If fresh storage is palloc'd for each instantiation of SUM then we have a per-group leak in this scenario. In any case, I'm not sure all aggregate functions have constant memory requirements that would let them recycle storage across groups.)

What we need to do is work out what the best set of memory context definitions is, and then decide on a strategy for making sure that lower-level routines allocate their return values in the right context. It'd be nice if the lower-level routines could still call palloc() and not have to worry about this explicitly --- otherwise we'll break not only a lot of our own code but perhaps a lot of user code. (User-specific data types and SPI code all use palloc, no?)

I think it is too late to try to fix this for 6.5, but it ought to be a top priority for 6.6.

regards, tom lane
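The split between per-statement and per-tuple storage described in the message can be sketched with a toy allocator. This is only an illustration under stated assumptions, not PostgreSQL code: `ToyContext`, `toy_switch_to`, `toy_alloc`, `toy_reset`, `toy_add`, and `toy_run_sum` are hypothetical names invented here, and `toy_alloc` merely stands in for a palloc()-style call that allocates in whatever context is current. The point it tries to show is that a lower-level routine can keep calling the allocator without naming a context, while the caller decides how long the result lives.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical chunk header; the union keeps the payload suitably aligned. */
typedef union ToyChunk {
    union ToyChunk *next;
    max_align_t     align;
} ToyChunk;

/* Hypothetical context: a list of chunks that can all be freed at once. */
typedef struct ToyContext {
    ToyChunk *chunks;      /* every live allocation in this context */
    size_t    live_bytes;  /* bytes requested since the last reset */
} ToyContext;

static ToyContext *current_context = NULL;

/* Make cxt the allocation target; return the previous target. */
static ToyContext *toy_switch_to(ToyContext *cxt)
{
    ToyContext *old = current_context;
    current_context = cxt;
    return old;
}

/* palloc-like stand-in: the caller never names a context. */
static void *toy_alloc(size_t size)
{
    ToyChunk *chunk;

    assert(current_context != NULL);
    chunk = malloc(sizeof(ToyChunk) + size);
    if (chunk == NULL)
        abort();
    chunk->next = current_context->chunks;
    current_context->chunks = chunk;
    current_context->live_bytes += size;
    return chunk + 1;      /* payload starts right after the header */
}

/* Free everything allocated in cxt since its last reset. */
static void toy_reset(ToyContext *cxt)
{
    while (cxt->chunks != NULL) {
        ToyChunk *next = cxt->chunks->next;
        free(cxt->chunks);
        cxt->chunks = next;
    }
    cxt->live_bytes = 0;
}

/* A "lower-level routine": returns a freshly allocated temporary. */
static long *toy_add(long a, long b)
{
    long *result = toy_alloc(sizeof(long));
    *result = a + b;
    return result;
}

/* Sum 1..ntuples, flushing the per-tuple context after every tuple. */
static long toy_run_sum(int ntuples, size_t *peak_tuple_bytes)
{
    ToyContext per_statement = {NULL, 0};
    ToyContext per_tuple = {NULL, 0};
    size_t     peak = 0;
    long      *sum;
    long       total;
    int        i;

    toy_switch_to(&per_statement);
    sum = toy_alloc(sizeof(long));         /* must survive all tuples */
    *sum = 0;

    for (i = 1; i <= ntuples; i++) {
        ToyContext *old = toy_switch_to(&per_tuple);
        long       *tmp = toy_add(*sum, i); /* temporary lands in per_tuple */

        *sum = *tmp;                       /* copy the value out before the flush */
        if (per_tuple.live_bytes > peak)
            peak = per_tuple.live_bytes;
        toy_switch_to(old);
        toy_reset(&per_tuple);             /* per-tuple storage is released here */
    }

    total = *sum;
    toy_reset(&per_statement);             /* end of statement */
    toy_switch_to(NULL);
    *peak_tuple_bytes = peak;
    return total;
}
```

In this sketch, calling `toy_run_sum(100, &peak)` returns 5050 and leaves `peak` equal to `sizeof(long)`, since the temporary from each tuple is gone before the next one is processed. A per-group context for the GROUP BY case would presumably be a third `ToyContext` reset at each group boundary; flushing every N tuples instead of every tuple would only move the `toy_reset` call.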