Re: Add proper planner support for ORDER BY / DISTINCT aggregates
От | Ronan Dunklau |
---|---|
Тема | Re: Add proper planner support for ORDER BY / DISTINCT aggregates |
Дата | |
Msg-id | 4480689.ObhdGn8bVM@aivenronan обсуждение исходный текст |
Ответ на | Re: Add proper planner support for ORDER BY / DISTINCT aggregates (Ronan Dunklau <ronan.dunklau@aiven.io>) |
Ответы |
Re: Add proper planner support for ORDER BY / DISTINCT aggregates
|
Список | pgsql-hackers |
> Ok, I reproduced that case, just not using a group by: by adding the group > by a sort node is added in both cases (master and your patch), except that > with your patch we sort on both keys and that doesn't really incur a > performance penalty. > > I think the overhead occurs because in the ExecAgg case, we use the > tuplesort_*_datum API as an optimization when we have a single column as an > input, which the ExecSort code doesn't. Maybe it would be worth it to try to > use that API in sort nodes too, when it can be done. Please find attached a POC patch to do just that. The switch to the single-datum tuplesort is done when there is only one attribute, it is byval (to avoid having to deal with copy of the references everywhere) and we are not in bound mode (to also avoid having to move things around). A naive run on make check pass on this, but I may have overlooked things. Should I add this separately to the commitfest ? For the record, the times I got on my laptop, on master VS david's patch VS both. Values are an average of 100 runs, as reported by pgbench --no-vacuum -f <file.sql> -t 100. There is a good amount of noise, but the simple "select one ordered column case" seems worth the optimization. Only shared_buffers and work_mem have been set to 2GB each. Setup 1: single table, 1 000 000 tuples, no index CREATE TABLE tbench ( a int, b int ); INSERT INTO tbench (a, b) SELECT a, b FROM generate_series(1, 100) a, generate_series(1, 10000) b; Test 1: Single-column ordered select (order by b since the table is already sorted by a) select b from tbench order by b; master: 303.661ms with mine: 148.571ms Test 2: Ordered sum (using b so that the input is not presorted) select sum(b order by b) from tbench; master: 112.379ms with david's patch: 144.469ms with david's patch + mine: 97ms Test 3: Ordered sum + group by select b, sum(a order by a) from tbench GROUP BY b; master: 316.117ms with david's patch: 297.079 with david's patch + mine: 294.601 Setup 2: same as before, but adding an index on (b, a) CREATE INDEX ON tbench (b, a); Test 2: Ordered sum: select sum(a order by a) from tbench; master: 111.847 ms with david's patch: 48.088 with david's patch + mine: 47.678 ms Test 3: Ordered sum + group by: select a, sum(b order by b) from tbench GROUP BY a; master: 76.873 ms with david's patch: 61.105 with david's patch + mine: 62.672 ms -- Ronan Dunklau
Вложения
В списке pgsql-hackers по дате отправления: