Considering additional sort specialisation functions for PG16
От | David Rowley |
---|---|
Тема | Considering additional sort specialisation functions for PG16 |
Дата | |
Msg-id | CAApHDvoTTtoQYfp3d0kTPF6y1pjexgLwquzKmjzvjC9NCw4RGw@mail.gmail.com обсуждение исходный текст |
Ответы |
Re: Considering additional sort specialisation functions for PG16
|
Список | pgsql-hackers |
Hi, (Background: 697492434 added 3 new sort functions to remove the indirect function calls for the comparator function. This sped up sorting for various of our built-in data types.) There was a bit of unfinished discussion around exactly how far to take these specialisations for PG15. We could certainly add more. There are various other things we could do to further speed up sorting for these datatypes. One example is, we could add 3 more variations of these functions that can be called when there are no NULL datums to sort. That effectively multiplies the number of specialisations by 2, or adds another dimension. I have the following dimensions in mind for consideration: 1. Specialisations to handle sorting of non-null datums (eliminates checking for nulls in the comparison function) 2. Specialisations to handle single column sorts (eliminates tiebreaker function call or any checks for existence of tiebreaker) 3. ASC sort (No need for if (ssup->ssup_reverse) INVERT_COMPARE_RESULT(compare)) If we did all of the above then we'd end up with 3 * 2 * 2 * 2 = 24 specialization functions. That seems a bit excessive. So here I'd like to discuss which ones we should add, if any. I've attached a very basic implementation of #1 which adds 3 new functions for sorting non-null datums. This could be made a bit more advanced. For now, I just added a bool flag to track if we have any NULL datum1s in memtuples[]. For bounded sorts, we may remove NULLs from that array, and may end up with no nulls after having seen null. So maybe a count would be better than a flag. A quick performance test with 1 million random INTs shows ~6% performance improvement when there are no nulls. Master $ pgbench -n -f bench.sql -T 60 postgres latency average = 159.837 ms latency average = 161.193 ms latency average = 159.512 ms master + not_null_sort_specializations.patch $ pgbench -n -f bench.sql -T 60 postgres latency average = 150.791 ms latency average = 149.843 ms latency average = 150.319 ms I didn't test for any regression when there are NULLs and we're unable to use the new specializations. I'm hoping the null tracking will be almost free, but I will need to check. It's all quite subjective to know which specializations should be added. I think #1 is likely to have the biggest wins when it can be used as it removes the most branching in the comparator function, however the biggest gains are not the only thing to consider. We also need to consider how commonly these functions will be used. I don't have any information about that. David
Вложения
В списке pgsql-hackers по дате отправления: