Re: Combining Aggregates
From | Robert Haas |
---|---|
Subject | Re: Combining Aggregates |
Date | |
Msg-id | CA+TgmobRsLKbVu7Mh6VrPLYO+mV5jfsizBrp_eHqNNb6110_fg@mail.gmail.com |
In reply to | Re: Combining Aggregates (David Rowley <david.rowley@2ndquadrant.com>) |
Responses | Re: Combining Aggregates |
List | pgsql-hackers |
On Sun, Jan 17, 2016 at 9:26 PM, David Rowley <david.rowley@2ndquadrant.com> wrote:
> hmm, so wouldn't that mean that the transition function would need to (for
> each input tuple):
>
> 1. Parse that StringInfo into tokens.
> 2. Create a new aggregate state object.
> 3. Populate the new aggregate state based on the tokenised StringInfo, this
> would perhaps require that various *_in() functions are called on each
> token.
> 4. Add the new tuple to the aggregate state.
> 5. Build a new StringInfo based on the aggregate state modified in 4.
>
> ?

I don't really know what you mean by parse the StringInfo into tokens. The whole point of the expanded-object interface is to be able to keep things in an expanded internal form so that you *don't* have to repeatedly construct and deconstruct internal data structures.

I worked up an example of this approach using string_agg(), which I attach here. This changes the transition type of string_agg() from internal to text. The same code would work for bytea_string_agg(), which would allow removal of some other code, but this patch doesn't do that, because the point of this is to elucidate the approach.

In my tests, this seems to be slightly slower than what we're doing today; worst of all, it adds a handful of cycles to advance_transition_function() even when the aggregate is not an expanded object or, indeed, not even pass-by-reference. Some of this might be able to be fixed by a little massaging - in particular, DatumIsReadWriteExpandedObject() seems like it could be partly or entirely inlined, and maybe there's some other way to improve the coding here.

Generally, I think finding a way to pass expanded objects through nodeAgg.c would be a good thing to pursue, if we can make it work. The immediate impetus for changing things this way would be that we wouldn't need to add a mechanism for serializing and deserializing internal functions just to pass around partial aggregates. But there's another advantage, too: right now, advance_transition_function() does a lot of data copying to move data from per-call context to the per-aggregate context. When an expanded object is in use, this can be skipped.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
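
Editorial illustration (not part of the original message, and not the attached patch): the contrast described above, between keeping the transition state in an expanded working form and rebuilding a flat value on every call, can be modelled in plain C without any PostgreSQL headers. Everything here is hypothetical: `ExpandedText`, `agg_trans_expanded()`, `agg_final_expanded()` and `agg_trans_flat()` are made-up names, and the struct is only a stand-in for the idea of an expanded object, not PostgreSQL's real expanded-datum machinery.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of an "expanded" transition state: a growable
 * buffer that stays in working form between transition calls. */
typedef struct ExpandedText
{
    char   *data;   /* accumulated text, NUL-terminated once non-empty */
    size_t  len;    /* bytes used, excluding the trailing NUL */
    size_t  cap;    /* bytes allocated */
} ExpandedText;

/* Append n bytes to the state, growing the buffer geometrically. */
static void
expanded_append(ExpandedText *st, const char *s, size_t n)
{
    if (st->len + n + 1 > st->cap)
    {
        size_t newcap = st->cap ? st->cap : 16;

        while (newcap < st->len + n + 1)
            newcap *= 2;
        st->data = realloc(st->data, newcap);
        assert(st->data != NULL);
        st->cap = newcap;
    }
    memcpy(st->data + st->len, s, n);
    st->len += n;
    st->data[st->len] = '\0';
}

/* Expanded-style transition: the state is modified in place, so only
 * the new delimiter and value are copied on each call. */
static ExpandedText *
agg_trans_expanded(ExpandedText *st, const char *value, const char *delim)
{
    if (st == NULL)
    {
        st = calloc(1, sizeof(*st));
        assert(st != NULL);
    }
    else
        expanded_append(st, delim, strlen(delim));
    expanded_append(st, value, strlen(value));
    return st;
}

/* Final step: flatten the expanded state exactly once. */
static char *
agg_final_expanded(const ExpandedText *st)
{
    char *flat = malloc(st->len + 1);

    assert(flat != NULL);
    memcpy(flat, st->data, st->len + 1);
    return flat;
}

/* Flat-style transition, for contrast: every call builds a brand-new
 * flat string and copies the whole previous state into it. */
static char *
agg_trans_flat(char *flat, const char *value, const char *delim)
{
    size_t oldlen = flat ? strlen(flat) : 0;
    size_t dlen = flat ? strlen(delim) : 0;   /* no delimiter before first value */
    size_t vlen = strlen(value);
    char  *next = malloc(oldlen + dlen + vlen + 1);

    assert(next != NULL);
    if (flat)
        memcpy(next, flat, oldlen);
    memcpy(next + oldlen, delim, dlen);
    memcpy(next + oldlen + dlen, value, vlen + 1);
    free(flat);
    return next;
}
```

Both transition functions produce the same aggregate result; the difference is only in how much of the existing state is touched per input value. The flat variant recopies everything accumulated so far on each call, which is roughly the kind of per-call copying the message says an expanded object would let nodeAgg.c skip.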