Re: Minor performance improvement in transition to external sort
От | Jeremy Harris |
---|---|
Тема | Re: Minor performance improvement in transition to external sort |
Дата | |
Msg-id | 52F408B5.7050102@wizmail.org обсуждение исходный текст |
Ответ на | Re: Minor performance improvement in transition to external sort (Jeff Janes <jeff.janes@gmail.com>) |
Ответы |
Re: Minor performance improvement in transition to external
sort
|
Список | pgsql-hackers |
On 06/02/14 18:21, Jeff Janes wrote: > How big of > sets were you sorting in each case? Big enough to go external. The timings and compare-counts given are purely for the heapify stage not the total for the sort, so are constrained by the work_mem not by the sort size per se. I'm limited to working on a small machine, so the work_mem value of 1e+6 is approaching my top limit of sort-time pain unless I rip the heapify out into a test harness. What range of work_mem is it useful to test over? > Was it always just slightly bigger > than would fit in work_mem? I didn't check. > Did you try sorting already-sorted, reverse > sorted, or pipe-organ shaped data sets? Not yet, but I can. What variety of pipe-organ shape is of interest (I can think of several that might be called that)? > We will also need to test it on > strings. I usually use md5(random()::text) to generate strings for such > purposes, at least for a first pass. OK. > > The name of the attached patch is a bit unfortunate. Is there a project standard for this? > And I'm not sure what > you are doing with tupindex, the change there doesn't seem to be necessary > to the rest of your work, so some explanation on that would be helpful. I'll add commentary. > The project coding style usually has comments in blocks before loops on > which they comment, rather than interspersed throughout the clauses of the > "for" statement. I'll adjust that. > Another optimization that is possible is to do only one comparison at each > level (between the two children) all the way down the heap, and then > compare the parent to the chain of smallest children in reverse order. > This can substantially reduce the number of comparisons on average, > because most tuples sift most of the way down the heap (because most of the > tuples are at the bottom of the heap). Sounds interesting; I'll see if I can get that going. > (Also, I think you should make your siftdown code look more like the > existing siftdown code.) A function called by the outer loop? Can do; the existing does that because the function is called from multiple places but this will only be used the once. Thanks for the review. -- Jeremy
В списке pgsql-hackers по дате отправления: