Re: Timsort performance, quicksort (was: Re: Memory usage during sorting)
От | Peter Geoghegan |
---|---|
Тема | Re: Timsort performance, quicksort (was: Re: Memory usage during sorting) |
Дата | |
Msg-id | CAEYLb_UaXsPcBRq4N=h+Ddq03UX438nPxvOr=6CAhmwQuVJWZQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Timsort performance, quicksort (was: Re: Memory usage during sorting) (Robert Haas <robertmhaas@gmail.com>) |
Ответы |
Re: Timsort performance, quicksort (was: Re: Memory usage
during sorting)
|
Список | pgsql-hackers |
On 24 April 2012 16:17, Robert Haas <robertmhaas@gmail.com> wrote: > If they are in sorted order with an empty string > appended onto the end, it takes about 25 seconds. That's exactly what I'd have expected, but was surprised to have not found with my own test. Perhaps it was same kind of fluke (i.e. a re-creatable one - I'm quite confident that my benchmark was not methodologically flawed, at least in execution). > Based on that, I'm inclined to propose rejiggering things so that the > presorted-input check runs only at the top level, and not during any > recursive steps. The theory is that it won't cost much because if the > data is randomly ordered we won't do many comparisons before falling > out, and that seems to be true. But the only point of doing it in the > recursive steps is to win when the data is partially ordered, and we > actually seem to be losing big-time in that case, perhaps because when > the data is partially ordered, the presort check will frequently to > run through a significant part of the array - wasting cycles - but > fall out before it actually gets to the end. That makes sense to me, but obviously more data is needed here. > Of course, even if we did that, it might not be as good as your > timsort numbers, but that doesn't mean we shouldn't do it... The frustrating thing about my timsort numbers, as I mentioned in reply to Dimitri (He modified the subject a bit, so that might appear to be a different thread to you), is that they appear to be almost consistently winning when you consider the number of comparisons, but in fact lose when you measure the duration of execution or TPS or whatever. I expected a certain amount of this, but not enough to entirely derail the case for replacing quicksort with timsort when sorting a single key of text, which is obviously the really compelling case for optimisation here. This situation is only going to be made "worse" by the work you've done on SortSupport for text, which, incidentally, I agree is worthwhile. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services
В списке pgsql-hackers по дате отправления: