Re: Using quicksort for every external sort run
От | Mithun Cy |
---|---|
Тема | Re: Using quicksort for every external sort run |
Дата | |
Msg-id | CAD__OuhgUbZgszr-Ftf=bqgz-YwD60T5xMOikvXYiBQUCNpPqQ@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: Using quicksort for every external sort run (Peter Geoghegan <pg@heroku.com>) |
Ответы |
Re: Using quicksort for every external sort run
Re: Using quicksort for every external sort run |
Список | pgsql-hackers |
On Tue, Dec 29, 2015 at 4:33 AM, Peter Geoghegan <pg@heroku.com> wrote:
>Attached is a revision that significantly overhauls the memory patch,
>with several smaller changes.
>with several smaller changes.
I just ran some tests on above patch. Mainly to compare
how "longer sort keys" would behave with new(Qsort) and old Algo(RS) for sorting.
I have 8GB of ram and ssd storage.
Settings and Results.
----------------------------
Work_mem= DEFAULT (4mb).
key width = 520.
CASE 1. Data is pre-sorted as per sort key order.
CASE 2. Data is sorted in opposite order of sort key.
CASE 3. Data is randomly distributed.
Key length 520 | ||||
Number of records | 3200000 | 6400000 | 12800000 | 25600000 |
1.7 GB | 3.5GB | 7 GB | 14GB | |
CASE 1 | ||||
RS | 23654.677 | 35172.811 | 44965.442 | 106420.155 |
Qsort | 14100.362 | 40612.829 | 101068.107 | 334893.391 |
CASE 2 | ||||
RS | 13427.378 | 36882.898 | 98492.644 | 310670.15 |
Qsort | 12475.133 | 32559.074 | 100772.531 | 322080.602 |
CASE 3 | ||||
RS | 17202.966 | 45163.234 | 122323.299 | 337058.856 |
Qsort | 12530.726 | 23343.753 | 59431.315 | 152862.837 |
If data is sorted as same as sort key order then current code performs better than proposed patch
as sort size increases.
It appears new algo do not seem have any major impact if rows are presorted in opposite order.
For randomly distributed order quick sort performs well when compared to current sort method (RS).
======================================================
Now Increase the work_mem to 64MB and for 14 GB of data to sort.
CASE 1: We can see Qsort is able to catchup with current sort method(RS).
CASE 2: No impact.
CASE 3: RS is able to catchup with Qsort.
CASE 1 | RS | 128822.735 |
Qsort | 90857.496 |
CSAE 2 | RS | 105631.775 |
Qsort | 105938.334 |
CASE 3 | RS | 152301.054 |
Qsort | 149649.347 |
I think for long keys both old (RS) and new (Qsort) sort method has its own characteristics
based on data distribution. I think work_mem is the key If properly set new method(Qsort) will
be able to fit most of the cases. If work_mem is not tuned right it, there are cases it can regress.
Вложения
В списке pgsql-hackers по дате отправления: