Re: Using quicksort for every external sort run

From	Robert Haas
Subject	Re: Using quicksort for every external sort run
Date	March 29, 2016, 16:11:32
Msg-id	CA+TgmoaXYtofbXdjzBneEjsx8a1Z9A+TVB1mSTUFAnrdNu=BTA@mail.gmail.com
In reply to	Re: Using quicksort for every external sort run (Peter Geoghegan <pg@heroku.com>)
Responses	Re: Using quicksort for every external sort run
List	pgsql-hackers

On Mon, Mar 28, 2016 at 11:18 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Note that amcheck V2, which I posted just now, features tests for
> external sorting. The way these work requires discussion. The tests
> are motivated in part by the recent strxfrm() debacle, as well as by
> the need to have at least some test coverage for this patch. It's bad
> that external sorting currently has no test coverage. We should try
> and do better there as part of this overhaul to tuplesort.c.

Test coverage is good!

However, I don't see that you've responded to Tomas Vondra's report of
regressions.  Maybe you're waiting for more data from him, but we're
running out of time here.  I think what we need to decide is whether
these results are bad enough that the patch needs more work on the
regressed cases, or whether we're comfortable with some regressions in
low-memory configurations for the benefit of higher-memory
configurations.  I'm kind of on the fence about that, myself.

One test that kind of bothers me in particular is the "SELECT DISTINCT
a FROM numeric_test ORDER BY a" test on the high_cardinality_random
data set.  That's a wash at most work_mem values, but at 32MB it's
more than 3x slower.  That's very strange, and there are a number of
other results like that, where one particular work_mem value triggers
a large regression.  That's worrying.

Also, it's pretty clear that the patch has more large wins than it
does large losses, but it seems pretty easy to imagine people who
haven't tuned any GUCs writing in to say that 9.6 is way slower on
their workload, because those people are going to be at work_mem=4MB,
maintenance_work_mem=64MB.  At those numbers, if Tomas's data is
representative, it's not hard to imagine that the number of people who
see a significant regression might be quite a bit larger than the
number who see a significant speedup.

On the whole, I'm tempted to say this needs more work before we commit
to it, but I'd like to hear other opinions on that point.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
