Re: some aspects of our qsort might not be ideal
От | John Naylor |
---|---|
Тема | Re: some aspects of our qsort might not be ideal |
Дата | |
Msg-id | CAFBsxsHWituORSTKvvAL7bmDmQ-qy580Mnq=7BFrPY7-0qg_NQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: some aspects of our qsort might not be ideal (Peter Geoghegan <pg@bowt.ie>) |
Ответы |
Re: some aspects of our qsort might not be ideal
|
Список | pgsql-hackers |
On Fri, Jun 3, 2022 at 10:44 AM Peter Geoghegan <pg@bowt.ie> wrote: > What about dual-pivot quicksort, which is used in Java 7+? That is the > defacto successor to Bentley & McIlroy. In fact, Jon Bentley himself > collaborated with its author, and provided benchmarking input. The > underlying philosophy is essentially the same as the original -- it > is supposed to be an "industrial strength" quicksort, with various > adversarial cases considered directly. Here is a *rough* first pass at dual-pivot quicksort. I haven't changed the regression tests to adjust for qsort being an unstable sort, and there is some dead code. I looked to a couple JDKs for examples of design decisions as a first approximation. It includes a module with a few debugging printouts, run like so: select debug_qsort_int(array[7,6,5,4,3,2,1]); Pivot selection: A common way is to pick five candidates (here: mid, +/- 1/7, +/- 2/7), sort them in place, then pick the 2nd and 4th of them, or "tertile of five". If they are all evenly spaced, we can do insertion sort. Fall back to single-pivot: If any of the pivot candidates are equal, JDK assumes there could be a large number of duplicates and falls back to single-pivot, using the median of the five. I've done the same here. Despite having two code paths in all of our template instances, the VM binary is only about 5kb bigger, since MED3 is no longer built. Performance testing: Not started yet. I'm thinking an apples-to-apples comparison is not insightful enough, since the current insertion sort threshold 7 is already a bit constrained for single-pivot, and would be even more so for dual pivot, especially since 5 of the elements are pre-sorted to choose the pivots. My plan is to retest the threshold on HEAD using my latest tests, then use that as a starting point to test thresholds with dual-pivot. -- John Naylor EDB: http://www.enterprisedb.com
Вложения
В списке pgsql-hackers по дате отправления: