Re: Sort Refinement
От | Simon Riggs |
---|---|
Тема | Re: Sort Refinement |
Дата | |
Msg-id | 1206190867.4285.758.camel@ebony.site обсуждение исходный текст |
Ответ на | Re: Sort Refinement (Sam Mason <sam@samason.me.uk>) |
Список | pgsql-hackers |
On Thu, 2008-03-20 at 21:34 +0000, Sam Mason wrote: > On Thu, Mar 20, 2008 at 05:17:22PM +0000, Simon Riggs wrote: > > Currently, our sort algorithm assumes that its input is unsorted. So if > > your data is sorted on (a) and you would like it to be sorted on (a,b) > > then we need to perform the full sort of (a,b). > > > > For small sorts this doesn't matter much. For larger sorts the heap sort > > algorithm will typically result in just a single run being written to > > disk which must then be read back in. Number of I/Os required is twice > > the total volume of data to be sorted. > > > > If we assume we use heap sort, then if we *know* that the data is > > presorted on (a) then we should be able to emit tuples directly that the > > value of (a) changes and keep emitting them until the heap is empty, > > since they will exit the heap in (a,b) order. > > We also have stats to help decide when this will be a win. For example > if "a" has a small range (i.e. a boolean) and "b" has a large range > (i.e. some sequence) then this probably isn't going to be a win and > you're better off using the existing infrastructure. If it's the other > way around then this is going to be a big win. Yep, sounds sensible. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk
В списке pgsql-hackers по дате отправления: