Re: Combining Aggregates
От | David Rowley |
---|---|
Тема | Re: Combining Aggregates |
Дата | |
Msg-id | CAKJS1f8AtOiUaQb1ipz--K5G1aXH4+GUEsttAYS9GpKhk2_9tA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Combining Aggregates (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Ответы |
Re: Combining Aggregates
Re: Combining Aggregates Re: Combining Aggregates |
Список | pgsql-hackers |
On 17 March 2016 at 14:25, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote: > On 03/16/2016 12:03 PM, David Rowley wrote: >> Patch 2: >> This adds the serial/deserial aggregate infrastructure, pg_dump >> support, CREATE AGGREGATE changes, and nodeAgg.c changes to have it >> serialise and deserialise aggregate states when instructed to do so. >> >> Patch 3: >> This adds a boat load of serial/deserial functions, and combine >> functions for most of the built-in numerical aggregate functions. It >> also contains some regression tests which should really be in patch 2, >> but I with patch 2 there's no suitable serialisation or >> de-serialisation functions to test CREATE AGGREGATE with. I think >> having them here is ok, as patch 2 is quite useless without patch 3 >> anyway. > > I don't see how you could move the tests into #2 when the functions are > defined in #3? IMHO this is the right place for the regression tests. Yeah, but the CREATE AGGREGATE changes which are being tested in 0003 were actually added in 0002. I think 0002 is the right place to test the changes to CREATE AGGREGATE, but since there's a complete lack of functions to use, then I've just delayed until 0003. >> Another thing to note about this patch is that I've gone and created >> serial/de-serial functions for when PolyNumAggState both require >> sumX2, and don't require sumX2. I had thought about perhaps putting an >> extra byte in the serial format to indicate if a sumX2 is included, >> but I ended up not doing it this way. I don't really want these serial >> formats getting too complex as we might like to do fun things like >> pass them along to sharded servers one day, so it might be nice to >> keep them simple. > > > Hmmm, I've noticed that while eyeballing the diffs, and I'm not sure if it's > worth the additional complexity at this point. I mean, one byte is hardly > going to make a measurable difference - we're probably wasting more than > that due to alignment, for example. I don't think any alignment gets done here. Look at pq_sendint(). Did you mean the complexity of having extra functions, or having the extra byte to check in the deserial func? > As you've mentioned sharded servers - how stable should the serialized > format be? I've assumed it to be transient, i.e. in the extreme case it > might change after restarting a database - for the parallel aggregate that's > clearly sufficient. > > But if we expect this to eventually work in a sharded environment, that's > going to be way more complicated. I guess in these cases we could rely on > implicit format versioning, via the major the version (doubt we'd tweak the > format in a minor version anyway). > > I'm not sure the sharding is particularly useful argument at this point, > considering we don't really know if the current format is actually a good > match for that. Perhaps you're right. At this stage I've no idea if we'd want to support shards on varying major versions. I think probably not, since the node write functions might not be compatible with the node read functions on the other server. I've attached another series of patches: 0001: This is the latest Parallel Aggregate Patch, not intended for review here, but is required for the remaining patches. This patch has changed quite a bit from the previous one that I posted here, and the remaining patches needed re-based due to those changes. 0002: Adds serial/de-serial function support to CREATE AGGREGATE, contains minor fix-ups from last version. 0003: Adds various combine/serial/de-serial functions for the standard set of aggregates, apart from most float8 aggregates. 0004: Adds regression tests for 0003 pg_aggregate.h changes. 0005: Documents to mention which aggregate functions support partial mode. 0006: Re-based version of Haribabu's floating point aggregate support, containing some fixes by me. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Вложения
- 0001-Allow-aggregation-to-happen-in-parallel_2016-03-20.patch
- 0002-Allow-INTERNAL-state-aggregates-to-participate-in-pa_2016-03-20.patch
- 0003-Add-various-aggregate-combine-and-serialize-de-seria_2016-03-20.patch
- 0004-Add-sanity-regression-tests-for-new-aggregate-serial_2016-03-20.patch
- 0005-Add-documents-to-explain-which-aggregates-support-pa_2016-03-20.patch
- 0006-Add-combine-functions-for-various-floating-point-agg_2016-03-20.patch
В списке pgsql-hackers по дате отправления: