Re: [HACKERS] The case for removing replacement selection sort
От | Tomas Vondra |
---|---|
Тема | Re: [HACKERS] The case for removing replacement selection sort |
Дата | |
Msg-id | e7b074f5-77cf-0d69-b3a7-9da1d2893f63@2ndquadrant.com обсуждение исходный текст |
Ответ на | Re: [HACKERS] The case for removing replacement selection sort (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Список | pgsql-hackers |
Attached is a spreadsheet with updated data (using the more accurate timing, and comparing master with the default replacement_sort_tuples value (150k) and increased per Peter's instructions (to 1B). There are 6 sheets - one for each combination of a dataset size (100k and 1M) and machine where I ran the tests (with different CPU models). There are 5 columns - first three are medians for each of the tested PostgreSQL configurations: - master (default) - master (1billion) - no-replacement-sort (with patch applied) The numbers are medians from 5 runs (perhaps minimums would be better in this case). The last two columns are comparisons / speed-ups - master(1B) / master(default) - no-replacement-sort / master(default) Green numbers (below 1.0) mean speed-up, red (above 1.0) slow-down. Firstly, the master with r_s_t=1B setting performs either the same or worse compared to a default values, in almost every test. On the 100k data set the results are a bit noisy (particularly on the oldest CPU), but on the 1M data set the difference is quite clear. So I've only compared results for master(default) and patched. Quick summary, for each CPU model (which clearly affects the behavior). e5-2620-v4 ---------- - probably the CPU we should be looking at, as it's the current model - in some cases this gives us 3-5x speedup with low work_mem settings - consider for example the very last line, which shows that SELECT DISTINCT a FROM text_test_padding ORDER BY a completed in ~5531ms on work_mem=8MB and 1067ms on 32MB, but with the patch it completes in 1784ms (1MB), 1211ms (4MB) and 1104 (8MB). - Of course, this is for already-sorted data, and for other data sets the improvement is more modest. It's difficult to summarize this into a single number, but there are plenty of improvements in 20-30% range. - Some cases are a bit slower, of course, but overall I'd say the chart is much more green than red. Also the slow-downs are much smaller, compared to speed-ups (generally within 1-5%). i5-2500k -------- - same story, but this CPU gives more stable results (less noise) e5-5450 ------- - rather noisy CPU, particularly on the small (100k) dataset - the larger data set mostly matches the other CPUs, although the regressions are somewhat more significant - I wouldn't really worry about this too much, it's clearly an obsolete CPU and not something performance-conscious person would use nowadays (the other CPUs are often 2-3x faster). If needed, full data is available here (each machine is pushing data to a separate git repository): * https://bitbucket.org/tvondra/sort-bench-i5-2500k * https://bitbucket.org/tvondra/sort-bench-e5-2620-v4 * https://bitbucket.org/tvondra/sort-bench-e5-5450 At this point the 10M row tests are running, but I don't expect anything particularly surprising from those results. That is, it's not something that should block getting this patch committed, if the agreement is to commit otherwise. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Вложения
В списке pgsql-hackers по дате отправления: