Re: [PATCH] Use optimized single-datum tuplesort in ExecSort
От | David Rowley |
---|---|
Тема | Re: [PATCH] Use optimized single-datum tuplesort in ExecSort |
Дата | |
Msg-id | CAApHDvrWV=v0qKsC9_BHqhCn9TusrNvCaZDz77StCO--fmgbKA@mail.gmail.com обсуждение исходный текст |
Ответ на | Re: [PATCH] Use optimized single-datum tuplesort in ExecSort (Ronan Dunklau <ronan.dunklau@aiven.io>) |
Ответы |
Re: [PATCH] Use optimized single-datum tuplesort in ExecSort
Re: [PATCH] Use optimized single-datum tuplesort in ExecSort |
Список | pgsql-hackers |
On Tue, 13 Jul 2021 at 01:59, Ronan Dunklau <ronan.dunklau@aiven.io> wrote: > > 3. This seems to be a bug fix where byval datum sorts do not properly > > handle bounded sorts. I think that maybe that should be fixed and > > backpatched. I don't see anything that says Datum sorts can't be > > bounded and if there were some restriction on that I'd expect > > tuplesort_set_bound() to fail when the Tuplesortstate had been set up > > with tuplesort_begin_datum(). > > I've kept this as-is for now, but I will remove it from my patch if it is > deemed worthy of back-patching in your other thread. I've now pushed that bug fix so it's fine to remove the change to tuplesort.c now. I also did a round of benchmarking on this patch using the attached script. Anyone wanting to run it will need to run make installcheck first to create the required tables. On an AMD machine, I got the following results. Result in transactions per second. Test master v5 patch compare Test1 446.1 657.3 147.32% Test2 315.8 314.0 99.44% Test3 302.3 392.1 129.67% Test4 232.7 230.7 99.12% Test5 230.0 446.1 194.00% Test6 199.5 217.9 109.23% Test7 188.7 185.3 98.21% Test8 385.4 544.0 141.17% Tests 2, 4, 7 are designed to check if there is any regression from doing the additional run-time checks to see if we're doing datumSort. I measured a very small penalty from this. It's most visible in test7 with a drop of about 1.8%. Each test did OFFSET 1000000 as I didn't want to measure the overhead of outputting tuples. All the other tests show a pretty good gain. Test6 is testing a byref type, so it appears the gains are not just from byval datums. It would be good to see the benchmark script run on a few other machines to get an idea if the gains and losses are consistent. In theory, we likely could get rid of the small regression by having two versions of ExecSort() and setting the correct one during ExecInitSort() by setting the function pointer to the version we want to use in sortstate->ss.ps.ExecProcNode. But maybe the small regression is not worth going to that trouble over. I'm not aware of any other executor nodes that have logic like that so maybe it would be a bad idea to introduce something like that. David
Вложения
В списке pgsql-hackers по дате отправления: