Re: Todo: Teach planner to evaluate multiple windows in the optimal order
От | Ankit Kumar Pandey |
---|---|
Тема | Re: Todo: Teach planner to evaluate multiple windows in the optimal order |
Дата | |
Msg-id | 8d0b24ce-f8ba-cd63-82ae-be853b4067e8@gmail.com обсуждение исходный текст |
Ответ на | Re: Todo: Teach planner to evaluate multiple windows in the optimal order (David Rowley <dgrowleyml@gmail.com>) |
Ответы |
Re: Todo: Teach planner to evaluate multiple windows in the optimal order
(David Rowley <dgrowleyml@gmail.com>)
|
Список | pgsql-hackers |
> On 08/01/23 16:33, David Rowley wrote: > On Sun, 8 Jan 2023 at 23:21, Ankit Kumar Pandey <itsankitkp@gmail.com> wrote: >> Please find attached patch with addressed issues mentioned before. > > Here's a quick review: > > 1. You can use foreach_current_index(l) to obtain the index of the list element. > > 2. I'd rather see you looping backwards over the list in the first > loop. I think it'll be more efficient to loop backwards as you can > just break out the loop when you see the pathkeys are not contained in > the order by pathkreys. When the optimisation does not apply that > means you only need to look at the last item in the list. You could > maybe just invent foreach_reverse() for this purpose and put it in > pg_list.h. That'll save having to manually code up the loop. > > 3. I don't think you should call the variable > enable_order_by_pushdown. We've a bunch of planner related GUCs that > start with enable_. That might cause a bit of confusion. Maybe just > try_sort_pushdown. > > 4. You should widen the scope of orderby_pathkeys and set it within > the if (enable_order_by_pushdown) scope. You can reuse this again in > the 2nd loop too. Just initialise it to NIL > > 5. You don't seem to be checking all WindowClauses for a runCondtion. > If you do #2 above then you can start that process in the initial > reverse loop so that you've checked them all by the time you get > around to that WindowClause again in the 2nd loop. > > 6. The test with "+-- Do not perform additional sort if column is > presorted". I don't think "additional" is the correct word here. I > think you want to ensure that we don't push down the ORDER BY below > the WindowAgg for this case. There is no ability to reduce the sorts > here, only move them around, which we agreed we don't want to do for > this case. I have addressed all points 1-6 in the attached patch. I have one doubt regarding runCondition, do we only need to ensure that window function which has subset sort clause of main query should not have runCondition or none of the window functions should not contain runCondition? I have gone with later case but wanted to clarify. > Also, do you want to have a go at coding up the sort bound pushdown > too? It'll require removing the limitCount restriction and adjusting > ExecSetTupleBound() to recurse through a WindowAggState. I think it's > pretty easy. You could try it then play around with it to make sure it > works and we get the expected performance. I tried this in the patch but kept getting `retrieved too many tuples in a bounded sort`. Added following code in ExecSetTupleBound which correctly found sortstate and set bound value. else if(IsA(child_node, WindowAggState)) { WindowAggState *winstate = (WindowAggState *) child_node; if (outerPlanState(winstate)) ExecSetTupleBound(tuples_needed, outerPlanState(winstate)); } I think problem is that are not using limit clause inside window function (which may need to scan all tuples) so passing bound value to WindowAggState->sortstate is not working as we might expect. Or maybe I am getting it wrong? I was trying to have top-N sort for limit clause if orderby pushdown happens. > You'll likely want to add a few more rows than the last performance test you did or run the query > with pgbench. Running a query once that only takes 1-2ms is likely not > a reliable way to test the performance of something. I did some profiling. CREATE TABLE empsalary1 ( depname varchar, empno bigint, salary int, enroll_date date ); INSERT INTO empsalary1(depname, empno, salary, enroll_date) SELECT string_agg (substr('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789', ceil (random() * 62)::integer,1000), '') AS depname, generate_series(1, 10000000) AS empno, ceil (random()*10000)::integer AS salary , NOW() + (random() * (interval '90 days')) + '30 days' AS enroll_date; 1) Optimization case EXPLAIN (ANALYZE) SELECT empno, depname, min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary, sum(salary) OVER (PARTITION BY depname) depsalary, count(*) OVER (ORDER BY enroll_date DESC) c FROM empsalary1 ORDER BY depname, empno, enroll_date; EXPLAIN (ANALYZE) SELECT empno, depname, min(salary) OVER (PARTITION BY depname) depminsalary FROM empsalary1 ORDER BY depname, empno; In patched version: QUERY PLAN -------------------------------------------------------------------------------------------------------------------------- ----------------------- WindowAgg (cost=93458.04..93458.08 rows=1 width=61) (actual time=149996.006..156756.995 rows=10000000 loops=1) -> WindowAgg (cost=93458.04..93458.06 rows=1 width=57) (actual time=108559.126..135892.188 rows=10000000 loops=1) -> Sort (cost=93458.04..93458.04 rows=1 width=53) (actual time=108554.213..112564.168 rows=10000000 loops=1) Sort Key: depname, empno, enroll_date Sort Method: external merge Disk: 645856kB -> WindowAgg (cost=93458.01..93458.03 rows=1 width=53) (actual time=30386.551..62357.669 rows=10000000lo ops=1) -> Sort (cost=93458.01..93458.01 rows=1 width=45) (actual time=23260.104..26313.395 rows=10000000l oops=1) Sort Key: enroll_date DESC Sort Method: external merge Disk: 528448kB -> Seq Scan on empsalary1 (cost=0.00..93458.00 rows=1 width=45) (actual time=0.032..4833.603 rows=10000000 loops=1) Planning Time: 4.693 ms Execution Time: 158161.281 ms QUERY PLAN -------------------------------------------------------------------------------------------------------------------------- ------------- WindowAgg (cost=1903015.63..2078015.74 rows=10000006 width=39) (actual time=40565.305..46598.984 rows=10000000 loops=1) -> Sort (cost=1903015.63..1928015.65 rows=10000006 width=39) (actual time=23411.837..27467.962 rows=10000000 loops=1) Sort Key: depname, empno Sort Method: external merge Disk: 528448kB -> Seq Scan on empsalary1 (cost=0.00..193458.06 rows=10000006 width=39) (actual time=5.095..5751.675 rows=10000 000 loops=1) Planning Time: 0.099 ms Execution Time: 47415.926 ms In master: QUERY PLAN -------------------------------------------------------------------------------------------------------------------------- ------------------------------------- Incremental Sort (cost=3992645.36..4792645.79 rows=10000006 width=59) (actual time=147130.132..160985.373 rows=10000000 loops=1) Sort Key: depname, empno, enroll_date Presorted Key: depname, empno Full-sort Groups: 312500 Sort Method: quicksort Average Memory: 28kB Peak Memory: 28kB -> WindowAgg (cost=3992645.31..4342645.52 rows=10000006 width=59) (actual time=147129.936..154023.147 rows=10000000l oops=1) -> WindowAgg (cost=3992645.31..4192645.43 rows=10000006 width=55) (actual time=104665.289..133089.188 rows=1000 0000 loops=1) -> Sort (cost=3992645.31..4017645.33 rows=10000006 width=51) (actual time=104665.257..108710.282 rows=100 00000 loops=1) Sort Key: depname, empno Sort Method: external merge Disk: 645856kB -> WindowAgg (cost=1971370.63..2146370.74 rows=10000006 width=51) (actual time=28314.300..59737.949 rows=10000000 loops=1) -> Sort (cost=1971370.63..1996370.65 rows=10000006 width=43) (actual time=21190.188..24098.59 6 rows=10000000 loops=1) Sort Key: enroll_date DESC Sort Method: external merge Disk: 528448kB -> Seq Scan on empsalary1 (cost=0.00..193458.06 rows=10000006 width=43) (actual time=0. 630..5317.862 rows=10000000 loops=1) Planning Time: 0.982 ms Execution Time: 163369.242 ms (16 rows) QUERY PLAN -------------------------------------------------------------------------------------------------------------------------- ------------------- Incremental Sort (cost=3787573.31..3912573.41 rows=10000006 width=39) (actual time=51547.195..53950.034 rows=10000000lo ops=1) Sort Key: depname, empno Presorted Key: depname Full-sort Groups: 1 Sort Method: quicksort Average Memory: 30kB Peak Memory: 30kB Pre-sorted Groups: 1 Sort Method: external merge Average Disk: 489328kB Peak Disk: 489328kB -> WindowAgg (cost=1903015.63..2078015.74 rows=10000006 width=39) (actual time=33413.954..39771.262 rows=10000000 loo ps=1) -> Sort (cost=1903015.63..1928015.65 rows=10000006 width=39) (actual time=18991.129..21992.353 rows=10000000lo ops=1) Sort Key: depname Sort Method: external merge Disk: 528456kB -> Seq Scan on empsalary1 (cost=0.00..193458.06 rows=10000006 width=39) (actual time=1.300..5269.729 rows =10000000 loops=1) Planning Time: 4.506 ms Execution Time: 54768.697 ms 2) No optimization case: EXPLAIN (ANALYZE) SELECT empno, depname, min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary FROM empsalary1 ORDER BY enroll_date; Patch: QUERY PLAN -------------------------------------------------------------------------------------------------------------------------- -------------------------------- Sort (cost=3968191.79..3993191.80 rows=10000006 width=43) (actual time=57863.173..60976.324 rows=10000000 loops=1) Sort Key: enroll_date Sort Method: external merge Disk: 528448kB -> WindowAgg (cost=850613.62..2190279.21 rows=10000006 width=43) (actual time=7478.966..42502.541 rows=10000000 loops =1) -> Gather Merge (cost=850613.62..2015279.11 rows=10000006 width=43) (actual time=7478.935..18037.001 rows=10000 000 loops=1) Workers Planned: 2 Workers Launched: 2 -> Sort (cost=849613.60..860030.27 rows=4166669 width=43) (actual time=7349.101..9397.713 rows=3333333 lo ops=3) Sort Key: depname, empno Sort Method: external merge Disk: 181544kB Worker 0: Sort Method: external merge Disk: 169328kB Worker 1: Sort Method: external merge Disk: 177752kB -> Parallel Seq Scan on empsalary1 (cost=0.00..135124.69 rows=4166669 width=43) (actual time=0.213. .2450.635 rows=3333333 loops=3) Planning Time: 0.100 ms Execution Time: 63341.783 ms master: QUERY PLAN -------------------------------------------------------------------------------------------------------------------------- -------------------------------- Sort (cost=3968191.79..3993191.80 rows=10000006 width=43) (actual time=54097.880..57000.806 rows=10000000 loops=1) Sort Key: enroll_date Sort Method: external merge Disk: 528448kB -> WindowAgg (cost=850613.62..2190279.21 rows=10000006 width=43) (actual time=7075.245..39200.756 rows=10000000 loops =1) -> Gather Merge (cost=850613.62..2015279.11 rows=10000006 width=43) (actual time=7075.217..15988.922 rows=10000 000 loops=1) Workers Planned: 2 Workers Launched: 2 -> Sort (cost=849613.60..860030.27 rows=4166669 width=43) (actual time=6993.974..8799.701 rows=3333333 lo ops=3) Sort Key: depname, empno Sort Method: external merge Disk: 171904kB Worker 0: Sort Method: external merge Disk: 178496kB Worker 1: Sort Method: external merge Disk: 178224kB -> Parallel Seq Scan on empsalary1 (cost=0.00..135124.69 rows=4166669 width=43) (actual time=0.044. .2683.598 rows=3333333 loops=3) Planning Time: 5.718 ms Execution Time: 59188.469 ms (15 rows) Master and patch have same performance as plan is same. pgbench (this is to find average performance): create table empsalary2 as select * from empsalary1 limit 1000; ------------------------------------------------------------------- test.sql SELECT empno, depname, min(salary) OVER (PARTITION BY depname ORDER BY empno) depminsalary, sum(salary) OVER (PARTITION BY depname) depsalary, count(*) OVER (ORDER BY enroll_date DESC) c FROM empsalary2 ORDER BY depname, empno, enroll_date; SELECT empno, depname, min(salary) OVER (PARTITION BY depname) depminsalary FROM empsalary2 ORDER BY depname, empno; ---------------------------------------------------------------------- /usr/local/pgsql/bin/pgbench -d test -c 10 -j 4 -t 1000 -f test.sql Patch: transaction type: test.sql scaling factor: 1 query mode: simple number of clients: 10 number of threads: 4 maximum number of tries: 1 number of transactions per client: 1000 number of transactions actually processed: 10000/10000 number of failed transactions: 0 (0.000%) latency average = 55.262 ms initial connection time = 8.480 ms tps = 180.957685 (without initial connection time) Master: transaction type: test.sql scaling factor: 1 query mode: simple number of clients: 10 number of threads: 4 maximum number of tries: 1 number of transactions per client: 1000 number of transactions actually processed: 10000/10000 number of failed transactions: 0 (0.000%) latency average = 60.489 ms initial connection time = 7.069 ms tps = 165.320205 (without initial connection time) TPS of patched version is higher than that of master's for same set of queries. where optimization is performed. -- Regards, Ankit Kumar Pandey
Вложения
В списке pgsql-hackers по дате отправления:
Предыдущее
От: Ankit Kumar PandeyДата:
Сообщение: Re: Todo: Teach planner to evaluate multiple windows in the optimal order
Следующее
От: Mason SharpДата:
Сообщение: Re: POC: Lock updated tuples in tuple_update() and tuple_delete()