Re: Combining Aggregates

Поиск
Список
Период
Сортировка
От David Rowley
Тема Re: Combining Aggregates
Дата
Msg-id CAKJS1f8AtOiUaQb1ipz--K5G1aXH4+GUEsttAYS9GpKhk2_9tA@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Combining Aggregates  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Ответы Re: Combining Aggregates  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Re: Combining Aggregates  (Robert Haas <robertmhaas@gmail.com>)
Re: Combining Aggregates  (David Rowley <david.rowley@2ndquadrant.com>)
Список pgsql-hackers
On 17 March 2016 at 14:25, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> On 03/16/2016 12:03 PM, David Rowley wrote:
>> Patch 2:
>> This adds the serial/deserial aggregate infrastructure, pg_dump
>> support, CREATE AGGREGATE changes, and nodeAgg.c changes to have it
>> serialise and deserialise aggregate states when instructed to do so.
>>
>> Patch 3:
>> This adds a boat load of serial/deserial functions, and combine
>> functions for most of the built-in numerical aggregate functions. It
>> also contains some regression tests which should really be in patch 2,
>> but I with patch 2 there's no suitable serialisation or
>> de-serialisation functions to test CREATE AGGREGATE with. I think
>> having them here is ok, as patch 2 is quite useless without patch 3
>> anyway.
>
> I don't see how you could move the tests into #2 when the functions are
> defined in #3? IMHO this is the right place for the regression tests.

Yeah, but the CREATE AGGREGATE changes which are being tested in 0003
were actually added in 0002. I think 0002 is the right place to test
the changes to CREATE AGGREGATE, but since there's a complete lack of
functions to use, then I've just delayed until 0003.

>> Another thing to note about this patch is that I've gone and created
>> serial/de-serial functions for when PolyNumAggState both require
>> sumX2, and don't require sumX2. I had thought about perhaps putting an
>> extra byte in the serial format to indicate if a sumX2 is included,
>> but I ended up not doing it this way. I don't really want these serial
>> formats getting too complex as we might like to do fun things like
>> pass them along to sharded servers one day, so it might be nice to
>> keep them simple.
>
>
> Hmmm, I've noticed that while eyeballing the diffs, and I'm not sure if it's
> worth the additional complexity at this  point. I mean, one byte is hardly
> going to make a measurable difference - we're probably wasting more than
> that due to alignment, for example.

I don't think any alignment gets done here. Look at pq_sendint().
Did you mean the complexity of having extra functions, or having the
extra byte to check in the deserial func?

> As you've mentioned sharded servers - how stable should the serialized
> format be? I've assumed it to be transient, i.e. in the extreme case it
> might change after restarting a database - for the parallel aggregate that's
> clearly sufficient.
>
> But if we expect this to eventually work in a sharded environment, that's
> going to be way more complicated. I guess in these cases we could rely on
> implicit format versioning, via the major the version (doubt we'd tweak the
> format in a minor version anyway).
>
> I'm not sure the sharding is particularly useful argument at this point,
> considering we don't really know if the current format is actually a good
> match for that.

Perhaps you're right. At this stage I've no idea if we'd want to
support shards on varying major versions. I think probably not, since
the node write functions might not be compatible with the node read
functions on the other server.

I've attached another series of patches:

0001: This is the latest Parallel Aggregate Patch, not intended for
review here, but is required for the remaining patches. This patch has
changed quite a bit from the previous one that I posted here, and the
remaining patches needed re-based due to those changes.

0002: Adds serial/de-serial function support to CREATE AGGREGATE,
contains minor fix-ups from last version.

0003: Adds various combine/serial/de-serial functions for the standard
set of aggregates, apart from most float8 aggregates.

0004: Adds regression tests for 0003 pg_aggregate.h changes.

0005: Documents to mention which aggregate functions support partial mode.

0006: Re-based version of Haribabu's floating point aggregate support,
containing some fixes by me.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Вложения

В списке pgsql-hackers по дате отправления:

Предыдущее
От: David Rowley
Дата:
Сообщение: Re: Combining Aggregates
Следующее
От: Michael Paquier
Дата:
Сообщение: Re: VS 2015 support in src/tools/msvc