Re: [PATCH] Negative Transition Aggregate Functions (WIP)
От | David Rowley |
---|---|
Тема | Re: [PATCH] Negative Transition Aggregate Functions (WIP) |
Дата | |
Msg-id | CAApHDvqsks__ZE47JshFS5PXXpamWuF-_u=s-HwCYuW=LZj5mA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [PATCH] Negative Transition Aggregate Functions (WIP) (Heikki Linnakangas <hlinnakangas@vmware.com>) |
Список | pgsql-hackers |
<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Dec 16, 2013 at 9:39 PM, Heikki Linnakangas <spandir="ltr"><<a href="mailto:hlinnakangas@vmware.com" target="_blank">hlinnakangas@vmware.com</a>></span> wrote:<br/><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div class=""><divclass="h5"><br /></div></div> There's another technique we could use which doesn't need a negative transitionfunction, assuming the order you feed the values to the aggreate function doesn't matter: keep subtotals. For example,if the window first contains values 1, 2, 3, 4, you calculate 3 + 4 = 7, and then 1 + 2 + 7 = 10. Next, 1 leavesthe window, and 5 enters it. Now you calculate 2 + 7 + 5 = 14. By keeping the subtotal (3 + 4 = 7) around, you savedone addition compared to calculating 2 + 3 + 4 + 5 from scratch.<br /><br /> The negative transition function is a lotsimpler and faster for count(*) and integer operations, so we probably should implement that anyway. But the subtotalstechnique could be very useful for other data types.<span class=""><font color="#888888"><br /><br /> - Heikki<br/></font></span></blockquote></div><br /></div><div class="gmail_extra">That's quite interesting. I guess we wouldneed another flag in pg_aggregate to mark if the order of the tuples matters, string_agg would be an example of onethat would have to skip this.</div><div class="gmail_extra"><br /></div><div class="gmail_extra">At least for ROWS BETWEENCURRENT ROW AND UNBOUNDED FOLLOWING if the aggregate order did not matter than it would likely be quite efficientjust to aggregate from the bottom up and materialise the results at each tuple and store it in the tuple store,then just use that materialised value when that tuple is processed. This method won't work when the frame base is notfixed.</div><div class="gmail_extra"><br /></div><div class="gmail_extra">I had been thinking ahead on how to improveMIN and MAX cases too. I came up with something called "tuple store indexes" that could be build as binary searchtrees with a composite index on the tuple position and the aggregate's sort operator... Something similar to how thefollowing query could use an index on (id,value) to calculate max()</div><div class="gmail_extra">select max(value) fromtest where id between 1 and 100;<br /></div><div class="gmail_extra">It's certainly not something for this patch, butit was an idea I came up with which I think would be possible without adding any more columns to pg_aggregate.</div><divclass="gmail_extra"><br /></div><div class="gmail_extra">Regards</div><div class="gmail_extra"><br/></div><div class="gmail_extra">David Rowley</div><div class="gmail_extra"><br /></div></div>
В списке pgsql-hackers по дате отправления: