Re: Todo: Teach planner to evaluate multiple windows in the optimal order
От | Ankit Kumar Pandey |
---|---|
Тема | Re: Todo: Teach planner to evaluate multiple windows in the optimal order |
Дата | |
Msg-id | 4303947c-c313-ec43-d442-80c4c8e92490@gmail.com обсуждение исходный текст |
Ответ на | Re: Todo: Teach planner to evaluate multiple windows in the optimal order (David Rowley <dgrowleyml@gmail.com>) |
Ответы |
Re: Todo: Teach planner to evaluate multiple windows in the optimal order
(David Rowley <dgrowleyml@gmail.com>)
|
Список | pgsql-hackers |
> On 13/01/23 07:48, David Rowley wrote: > It would just care if the > pathkeys were present and return a list of pathkeys not contained so > that an incremental sort could be done only on the returned list and a > Unique on an empty returned list. In create_final_distinct_paths, presorted keys are determined from input_rel->pathlist & needed_pathkeys. Problem with input_rel->pathlist is that, for index node, useful_pathkeys is stored in input_rel->pathlist but this useful_pathkeys is determined from truncate_useless_pathkeys(index_pathkeys) which removes index_keys if ordering is different. Hence, input_rel->pathlist returns null for select distinct b,a from ab where a < 10; even if index is created on a. In order to tackle this, I have added index_pathkeys in indexpath node itself. Although I started this patch from master, I merged changes to window sort optimizations. In patched version: set enable_hashagg=0; set enable_seqscan=0; explain select distinct relname,relkind,count(*) over (partition by relkind) from pg_Class; QUERY PLAN ------------------------------------------------------------------------------------------------------- Unique (cost=10000000039.49..10000000063.73 rows=415 width=73) -> Incremental Sort (cost=10000000039.49..10000000060.62 rows=415 width=73) Sort Key: relkind, relname, (count(*) OVER (?)) Presorted Key: relkind -> WindowAgg (cost=10000000034.20..10000000041.46 rows=415 width=73) -> Sort (cost=10000000034.20..10000000035.23 rows=415 width=65) Sort Key: relkind -> Seq Scan on pg_class (cost=10000000000.00..10000000016.15 rows=415 width=65) (8 rows) explain select distinct b,a from ab where a < 10; QUERY PLAN ---------------------------------------------------------------------------------- Unique (cost=0.71..60.05 rows=611 width=8) -> Incremental Sort (cost=0.71..55.55 rows=900 width=8) Sort Key: a, b Presorted Key: a -> Index Scan using ab_a_idx on ab (cost=0.29..28.04 rows=900 width=8) Index Cond: (a < 10) (6 rows) explain select distinct b,a, count(*) over (partition by a) from abcd order by a,b; QUERY PLAN ----------------------------------------------------------------------------------------------------- Unique (cost=10000021174.63..10000038095.75 rows=60 width=16) -> Incremental Sort (cost=10000021174.63..10000036745.75 rows=180000 width=16) Sort Key: a, b, (count(*) OVER (?)) Presorted Key: a, b -> WindowAgg (cost=10000020948.87..10000024098.87 rows=180000 width=16) -> Sort (cost=10000020948.87..10000021398.87 rows=180000 width=8) Sort Key: a, b -> Seq Scan on abcd (cost=10000000000.00..10000002773.00 rows=180000 width=8) (8 rows) explain select distinct a, b, count(*) over (partition by a,b,c) from abcd; QUERY PLAN ------------------------------------------------------------------------------------------------------ Unique (cost=10000021580.47..10000036629.31 rows=60 width=20) -> Incremental Sort (cost=10000021580.47..10000035279.31 rows=180000 width=20) Sort Key: a, b, c, (count(*) OVER (?)) Presorted Key: a, b, c -> WindowAgg (cost=10000021561.37..10000025611.37 rows=180000 width=20) -> Sort (cost=10000021561.37..10000022011.37 rows=180000 width=12) Sort Key: a, b, c -> Seq Scan on abcd (cost=10000000000.00..10000002773.00 rows=180000 width=12) (8 rows) explain select distinct a, b, count(*) over (partition by b,a, c) from abcd; QUERY PLAN --------------------------------------------------------------------------------------------------- Unique (cost=2041.88..36764.90 rows=60 width=20) -> Incremental Sort (cost=2041.88..35414.90 rows=180000 width=20) Sort Key: b, a, c, (count(*) OVER (?)) Presorted Key: b, a, c -> WindowAgg (cost=1989.94..25746.96 rows=180000 width=20) -> Incremental Sort (cost=1989.94..22146.96 rows=180000 width=12) Sort Key: b, a, c Presorted Key: b -> Index Scan using b_idx on abcd (cost=0.29..7174.62 rows=180000 width=12) (9 rows) In master: explain select distinct relname,relkind,count(*) over (partition by relkind) from pg_Class; QUERY PLAN ------------------------------------------------------------------------------------------------------- Unique (cost=10000000059.50..10000000063.65 rows=415 width=73) -> Sort (cost=10000000059.50..10000000060.54 rows=415 width=73) Sort Key: relname, relkind, (count(*) OVER (?)) -> WindowAgg (cost=10000000034.20..10000000041.46 rows=415 width=73) -> Sort (cost=10000000034.20..10000000035.23 rows=415 width=65) Sort Key: relkind -> Seq Scan on pg_class (cost=10000000000.00..10000000016.15 rows=415 width=65) (7 rows) explain select distinct b,a from ab where a < 10; QUERY PLAN ---------------------------------------------------------------------------------- Unique (cost=72.20..78.95 rows=611 width=8) -> Sort (cost=72.20..74.45 rows=900 width=8) Sort Key: b, a -> Index Scan using ab_a_idx on ab (cost=0.29..28.04 rows=900 width=8) Index Cond: (a < 10) (5 rows) explain select distinct b,a, count(*) over (partition by a) from abcd order by a,b; QUERY PLAN ----------------------------------------------------------------------------------------------------- Unique (cost=10000023704.77..10000041084.40 rows=60 width=16) -> Incremental Sort (cost=10000023704.77..10000039734.40 rows=180000 width=16) Sort Key: a, b, (count(*) OVER (?)) Presorted Key: a -> WindowAgg (cost=10000020948.87..10000024098.87 rows=180000 width=16) -> Sort (cost=10000020948.87..10000021398.87 rows=180000 width=8) Sort Key: a -> Seq Scan on abcd (cost=10000000000.00..10000002773.00 rows=180000 width=8) (8 rows) explain select distinct a, b, count(*) over (partition by b,a, c) from abcd; QUERY PLAN --------------------------------------------------------------------------------------------------- Unique (cost=45151.33..46951.33 rows=60 width=20) -> Sort (cost=45151.33..45601.33 rows=180000 width=20) Sort Key: a, b, (count(*) OVER (?)) -> WindowAgg (cost=1989.94..25746.96 rows=180000 width=20) -> Incremental Sort (cost=1989.94..22146.96 rows=180000 width=12) Sort Key: b, a, c Presorted Key: b -> Index Scan using b_idx on abcd (cost=0.29..7174.62 rows=180000 width=12) (8 rows) Note: Composite keys are also handled. create index xy_idx on xyz(x,y); explain select distinct x,z,y from xyz; QUERY PLAN ---------------------------------------------------------------------------------- Unique (cost=0.86..55.97 rows=60 width=12) -> Incremental Sort (cost=0.86..51.47 rows=600 width=12) Sort Key: x, y, z Presorted Key: x, y -> Index Scan using xy_idx on xyz (cost=0.15..32.80 rows=600 width=12) (5 rows) There are some cases where different kind of scan happens explain select distinct x,y from xyz where y < 10; QUERY PLAN ----------------------------------------------------------------------------------- Unique (cost=47.59..51.64 rows=60 width=8) -> Sort (cost=47.59..48.94 rows=540 width=8) Sort Key: x, y -> Bitmap Heap Scan on xyz (cost=12.34..23.09 rows=540 width=8) Recheck Cond: (y < 10) -> Bitmap Index Scan on y_idx (cost=0.00..12.20 rows=540 width=0) Index Cond: (y < 10) (7 rows) As code only checks from IndexPath (at the moment), other scan paths are not covered. Is it okay to cover these in same way as I did for IndexPath? (with no limitation on this behaviour on certain path types?) Also, I am assuming distinct pathkeys can be changed without any issues. As changes are limited to modification in distinct path only, I don't see this affecting other nodes. Test cases are green, with a couple of failures in window functions (one which I had added) and one very weird: EXPLAIN (COSTS OFF) SELECT DISTINCT empno, enroll_date, depname, sum(salary) OVER (PARTITION BY depname order by empno) depsalary, min(salary) OVER (PARTITION BY depname order by enroll_date) depminsalary FROM empsalary ORDER BY depname, empno; QUERY PLAN ----------------------------------------------------------------------------------------------------- Incremental Sort Sort Key: depname, empno Presorted Key: depname -> Unique -> Incremental Sort Sort Key: depname, enroll_date, empno, (sum(salary) OVER (?)), (min(salary) OVER (?)) Presorted Key: depname -> WindowAgg -> Incremental Sort Sort Key: depname, empno Presorted Key: depname -> WindowAgg -> Sort Sort Key: depname, enroll_date -> Seq Scan on empsalary (15 rows) In above query plan, unique used to come after Incremental sort in the master. Pending: 1. Consider if Pathkey.pk_strategy and pk_nulls_first need to be compared too, this is pending as I have to look these up and understand them. 2. Test cases (failures and new cases) 3. Improve comments Patch v8 attached. Please let me know any review comments, will address these in followup patch with pending items. Thanks, Ankit
Вложения
В списке pgsql-hackers по дате отправления:
Следующее
От: Tom LaneДата:
Сообщение: Re: UPDATE operation terminates logical replication receiver process due to an assertion