Re: Randomisation for ensuring nlogn complexity in quicksort
От | Claudio Freire |
---|---|
Тема | Re: Randomisation for ensuring nlogn complexity in quicksort |
Дата | |
Msg-id | CAGTBQpbWh9FZcv0wOWuKSeFDoCdzieG7d7M_spQTA5JRU6nhbQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Randomisation for ensuring nlogn complexity in quicksort (Robert Haas <robertmhaas@gmail.com>) |
Ответы |
Re: Randomisation for ensuring nlogn complexity in quicksort
|
Список | pgsql-hackers |
On Mon, Jul 1, 2013 at 5:12 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Jul 1, 2013 at 3:54 PM, Claudio Freire <klaussfreire@gmail.com> wrote: >> On Mon, Jul 1, 2013 at 4:32 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>>> This shouldn't be too complex, and should give us a fixed nlogn complexity even for wild data sets, without affectingexisting normal data sets that are present in every day transactions. I even believe that those data sets will alsobenefit from the above optimisation. >>> >>> The only method of selecting a pivot for quicksort that obtain O(n lg >>> n) run time with 100% certainty is have a magical oracle inside the >>> computer that tells you in fixed time and with perfect accuracy which >>> pivot you should select. >> >> Doesn't a linear median algorithm (say median of medians) get you O(n lg n)? >> >> Granted, with a huge constant (I think 4)... but it should still be O(n lg n). > > No. Thinking about this a little more, I believe the way it works > out is that any algorithm for picking the median that guarantees that > a certain *percentage* of the tuples will be in the smaller partition > will have O(n lg n) complexity, but any algorithm that only guarantees > that a fixed *number* of tuples in the smaller partition is still > quadratic in complexity. In the case of a median algorithm, you're > only guaranteed to have 2 elements in the smaller partition, which is > a constant. If you take a median of medians, you're guaranteed to > have 8 elements in the smaller partition, which is bigger, but still a > constant. What? A median of medians algorithm will guarantee floor(N/2) elements on the smaller. That's the definition of median. Note that I'm referring to picking the actual median of all tuples, not just a sample. That's slow, but it guarantees O(n log n).
В списке pgsql-hackers по дате отправления: