Re: [PATCH] Use optimized single-datum tuplesort in ExecSort

From	David Rowley
Subject	Re: [PATCH] Use optimized single-datum tuplesort in ExecSort
Date	13 July 2021, 03:15:24
Msg-id	CAApHDvrWV=v0qKsC9_BHqhCn9TusrNvCaZDz77StCO--fmgbKA@mail.gmail.com
In reply to	Re: [PATCH] Use optimized single-datum tuplesort in ExecSort (Ronan Dunklau <ronan.dunklau@aiven.io>)
Responses	Re: [PATCH] Use optimized single-datum tuplesort in ExecSort Re: [PATCH] Use optimized single-datum tuplesort in ExecSort
List	pgsql-hackers

On Tue, 13 Jul 2021 at 01:59, Ronan Dunklau <ronan.dunklau@aiven.io> wrote:
> > 3. This seems to be a bug fix where byval datum sorts do not properly
> > handle bounded sorts.  I think that maybe that should be fixed and
> > backpatched.  I don't see anything that says Datum sorts can't be
> > bounded and if there were some restriction on that I'd expect
> > tuplesort_set_bound() to fail when the Tuplesortstate had been set up
> > with tuplesort_begin_datum().
>
> I've kept this as-is for now, but I will remove it from my patch if it is
> deemed worthy of back-patching in your other thread.

I've now pushed that bug fix so it's fine to remove the change to
tuplesort.c now.

I also did a round of benchmarking on this patch using the attached
script. Anyone wanting to run it will need to run make installcheck
first to create the required tables.

On an AMD machine, I got the following results.

Results in transactions per second.
Test    master   v5 patch   compare
Test1   446.1    657.3      147.32%
Test2   315.8    314.0       99.44%
Test3   302.3    392.1      129.67%
Test4   232.7    230.7       99.12%
Test5   230.0    446.1      194.00%
Test6   199.5    217.9      109.23%
Test7   188.7    185.3       98.21%
Test8   385.4    544.0      141.17%

Tests 2, 4, 7 are designed to check if there is any regression from
doing the additional run-time checks to see if we're doing datumSort.
I measured a very small penalty from this. It's most visible in test7
with a drop of about 1.8%. Each test did OFFSET 1000000 as I didn't
want to measure the overhead of outputting tuples.

All the other tests show a pretty good gain. Test6 is testing a byref
type, so it appears the gains are not just from byval datums.

It would be good to see the benchmark script run on a few other
machines to get an idea if the gains and losses are consistent.

In theory, we likely could get rid of the small regression by having
two versions of ExecSort() and setting the correct one during
ExecInitSort() by setting the function pointer to the version we want
to use in sortstate->ss.ps.ExecProcNode. But maybe the small
regression is not worth going to that trouble over. I'm not aware of
any other executor nodes that have logic like that so maybe it would
be a bad idea to introduce something like that.
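
To make the function-pointer idea above concrete, here is a minimal,
self-contained sketch in plain C. It is not PostgreSQL code: SortNode,
init_sort(), exec_sort_datum() and exec_sort_tuple() are hypothetical
stand-ins for SortState, ExecInitSort() and the two ExecSort() variants,
and the "exactly one output column" condition is only an assumed
trigger for the datum sort path. The point is that the choice is made
once at init time, so the per-call path carries no extra branch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the two execution paths. */
typedef enum SortPath { PATH_TUPLE, PATH_DATUM } SortPath;

typedef struct SortNode SortNode;
typedef SortPath (*ExecProcNodeFn)(SortNode *node);

/* Simplified stand-in for SortState; not the real struct. */
struct SortNode
{
    /* models sortstate->ss.ps.ExecProcNode */
    ExecProcNodeFn exec_proc_node;
};

/* Variant specialised for single-datum sorts: no run-time check needed. */
static SortPath
exec_sort_datum(SortNode *node)
{
    (void) node;
    return PATH_DATUM;
}

/* Variant for ordinary tuple sorts: no run-time check needed either. */
static SortPath
exec_sort_tuple(SortNode *node)
{
    (void) node;
    return PATH_TUPLE;
}

/*
 * Stand-in for ExecInitSort(): decide once which variant to use and
 * store it in the function pointer.  The single-column condition is an
 * assumption made for illustration only.
 */
static void
init_sort(SortNode *node, int n_output_columns)
{
    assert(node != NULL);
    node->exec_proc_node = (n_output_columns == 1)
        ? exec_sort_datum
        : exec_sort_tuple;
}
```

A caller would run init_sort(&node, ncols) once and then invoke
node.exec_proc_node(&node) for each fetch, which dispatches straight to
the chosen variant.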

David
