Re: Sharing aggregate states between different aggregate functions
От | Heikki Linnakangas |
---|---|
Тема | Re: Sharing aggregate states between different aggregate functions |
Дата | |
Msg-id | 55B4FBAF.7000408@iki.fi обсуждение исходный текст |
Ответ на | Re: Sharing aggregate states between different aggregate functions (David Rowley <david.rowley@2ndquadrant.com>) |
Ответы |
Re: Sharing aggregate states between different aggregate functions
|
Список | pgsql-hackers |
On 07/09/2015 12:44 PM, David Rowley wrote: > On 15 June 2015 at 12:05, David Rowley <david.rowley@2ndquadrant.com> wrote: > >> >> This basically allows an aggregate's state to be shared between other >> aggregate functions when both aggregate's transition functions (and a few >> other things) match >> There's quite a number of aggregates in our standard set which will >> benefit from this optimisation. >> > After compiling the original patch with another compiler, I noticed a > couple of warnings. > > The attached fixes these. I spent some time reviewing this. I refactored the ExecInitAgg code rather heavily to make it more readable (IMHO); see attached. What do you think? Did I break anything? Some comments: * In aggref_has_compatible_states(), you give up if aggtype or aggcollid differ. But those properties apply to the final function, so you were leaving some money on the table by disallowing state-sharing if they differ. * The filter and input expressions are initialized for every AggRef, before the deduplication logic kicks in. The AggrefExprState.aggfilter, aggdirectargs and args fields really belong to the AggStatePerAggState struct instead. This is not a new issue, but now that we have a convenient per-aggstate struct to put them in, let's do so. * There was a reference-after free bug in aggref_has_compatible_states; you cannot ReleaseSysCache and then continue pointing to the struct. * The code was a bit fuzzy on which parts of the per-aggstate are filled in at what time. Some of the fields were overwritten every time a match was found. With the same values, so no harm done, but I found it confusing. I refactored ExecInitAgg in the attached patch to clear that up. * There API of build_aggregate_fnexprs() was a bit strange now that some callers use it to only fill in the final function, some call it to fill both the transition functions and the final function. I split it to two separate functions. * I wonder if we should do this duplicate elimination at plan time. It's very fast, so I'm not worried about that right now, but you had grand plans to expand this so that an aggregate could optionally use one of many different kinds of state values. At that point, it certainly seems like a planning decision to decide which aggregates share state. I think we can leave it as it is for now, but it's something to perhaps consider later. BTW, the name of the AggStatePerAggStateData struct is pretty horrible. The repeated "AggState" feels awkward. Now that I've stared at the patch for a some time, it doesn't bother me anymore, but it took me quite a while to over that. I'm sure it will for others too. And it's not just that struct, the comments talk about "aggregate state", which could be confused to mean "AggState", but it actually means AggStatePerAggStateData. I don't have any great suggestions, but can you come up a better naming scheme? - Heikki
Вложения
В списке pgsql-hackers по дате отправления: