Обсуждение: pg16: XX000: could not find pathkey item to sort

Поиск
Список
Период
Сортировка

pg16: XX000: could not find pathkey item to sort

От
Justin Pryzby
Дата:
This fails since 1349d2790b

commit 1349d2790bf48a4de072931c722f39337e72055e
Author: David Rowley <drowley@postgresql.org>
Date:   Tue Aug 2 23:11:45 2022 +1200

    Improve performance of ORDER BY / DISTINCT aggregates

ts=# CREATE TABLE t (a int, b text) PARTITION BY RANGE (a);
ts=# CREATE TABLE td PARTITION OF t DEFAULT;
ts=# INSERT INTO t SELECT 1 AS a, '' AS b;
ts=# SET enable_partitionwise_aggregate=on;
ts=# explain SELECT a, COUNT(DISTINCT b) FROM t GROUP BY a;
ERROR:  XX000: could not find pathkey item to sort
LOCATION:  prepare_sort_from_pathkeys, createplan.c:6235

-- 
Justin



Re: pg16: XX000: could not find pathkey item to sort

От
Richard Guo
Дата:

On Mon, Sep 18, 2023 at 10:02 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
This fails since 1349d2790b

commit 1349d2790bf48a4de072931c722f39337e72055e
Author: David Rowley <drowley@postgresql.org>
Date:   Tue Aug 2 23:11:45 2022 +1200

    Improve performance of ORDER BY / DISTINCT aggregates

ts=# CREATE TABLE t (a int, b text) PARTITION BY RANGE (a);
ts=# CREATE TABLE td PARTITION OF t DEFAULT;
ts=# INSERT INTO t SELECT 1 AS a, '' AS b;
ts=# SET enable_partitionwise_aggregate=on;
ts=# explain SELECT a, COUNT(DISTINCT b) FROM t GROUP BY a;
ERROR:  XX000: could not find pathkey item to sort
LOCATION:  prepare_sort_from_pathkeys, createplan.c:6235

Thanks for the report!  I've looked at it a little bit.  In function
adjust_group_pathkeys_for_groupagg we add the pathkeys in ordered
aggregates to root->group_pathkeys.  But if the new added pathkeys do
not have EC members that match the targetlist or can be computed from
the targetlist, prepare_sort_from_pathkeys would have problem computing
sort column info for the new added pathkeys.  In the given example, the
pathkey representing 'b' can not match or be computed from the current
targetlist, so prepare_sort_from_pathkeys emits the error.

My first thought about the fix is that we artificially add resjunk
target entries to parse->targetList for the ordered aggregates'
arguments that are ORDER BY expressions, as attached.  While this can
fix the given query, it would cause Assert failure for the query in
sql/triggers.sql.

-- inserts only
insert into my_table values (1, 'AAA'), (2, 'BBB')
  on conflict (a) do
  update set b = my_table.b || ':' || excluded.b;

I haven't looked into how that happens.

Any thoughts?

Thanks
Richard
Вложения

Re: pg16: XX000: could not find pathkey item to sort

От
David Rowley
Дата:
On Tue, 19 Sept 2023 at 23:45, Richard Guo <guofenglinux@gmail.com> wrote:
> My first thought about the fix is that we artificially add resjunk
> target entries to parse->targetList for the ordered aggregates'
> arguments that are ORDER BY expressions, as attached.  While this can
> fix the given query, it would cause Assert failure for the query in
> sql/triggers.sql.

> Any thoughts?

Unfortunately, we can't do that as it'll lead to target entries
existing in the GroupAggregate's target list that have been
aggregated.

postgres=# explain verbose SELECT a, COUNT(DISTINCT b) FROM rp GROUP BY a;
                                      QUERY PLAN
--------------------------------------------------------------------------------------
 Append  (cost=88.17..201.39 rows=400 width=44)
   ->  GroupAggregate  (cost=88.17..99.70 rows=200 width=44)
         Output: rp.a, count(DISTINCT rp.b), rp.b

Your patch adds rp.b as an output column of the GroupAggregate.
Logically, that column cannot exist there as there is no correct
single value of rp.b after aggregation.

I think the fix needs to go into create_agg_path().  The problem is
that for AGG_SORTED we do:

if (aggstrategy == AGG_SORTED)
    pathnode->path.pathkeys = subpath->pathkeys; /* preserves order */

which assumes that all of the columns before the aggregate will be
available after the aggregate.  That likely used to work ok before
1349d2790 as the planner wouldn't have requested any Pathkeys for
columns that were not available below the Agg node.

We can no longer take the subpath pathkey's verbatim. We need to strip
off pathkeys for columns that are not in pathnode's targetlist.

I've attached a patch which adds a new function to pathkeys.c to strip
off any PathKeys in a list that don't have a corresponding item in the
given PathTarget and just return a prefix of the input pathkey list up
until the first expr that can't be found.

I'm concerned that this patch will be too much overhead when creating
paths when a PathKey's EquivalenceClass has a large number of members
from partitioned tables.  I wondered if we should instead just check
if the subpath's pathkeys match root->group_pathkeys and if they do
set the AggPath's pathkeys to list_copy_head(subpath->pathkeys,
root->num_groupby_pathkeys),  that'll be much cheaper, but it just
feels a bit too much like a special case.

David

Вложения

Re: pg16: XX000: could not find pathkey item to sort

От
David Rowley
Дата:
On Tue, 3 Oct 2023 at 09:11, David Rowley <dgrowleyml@gmail.com> wrote:
> I'm concerned that this patch will be too much overhead when creating
> paths when a PathKey's EquivalenceClass has a large number of members
> from partitioned tables.

I just tried out the patch to see how much it affects the performance
of the planner.  I think we need to find a better way to strip off the
pathkeys for the columns that have been aggregated.

Setup:
create table lp (a int, b int) partition by list (a);
select 'create table lp'||x||' partition of lp for values in('||x||')'
from generate_series(0,999)x;
\gexec

\pset pager off
set enable_partitionwise_aggregate=1;

Benchmark query:
explain (summary on) select a,count(*) from lp group by a;

Master:
Planning Time: 23.945 ms
Planning Time: 23.887 ms
Planning Time: 23.927 ms

perf top:
   7.39%  libc.so.6         [.] __memmove_avx_unaligned_erms
   6.98%  [kernel]          [k] clear_page_rep
   5.69%  postgres          [.] bms_is_subset
   5.07%  postgres          [.] fetch_upper_rel
   4.41%  postgres          [.] bms_equal

Patched:
Planning Time: 41.410 ms
Planning Time: 41.474 ms
Planning Time: 41.488 ms

perf top:
 19.02%  postgres          [.] bms_is_subset
   6.91%  postgres          [.] find_ec_member_matching_expr
   5.93%  libc.so.6         [.] __memmove_avx_unaligned_erms
   5.55%  [kernel]          [k] clear_page_rep
   4.07%  postgres          [.] fetch_upper_rel
   3.46%  postgres          [.] bms_equal

> I wondered if we should instead just check
> if the subpath's pathkeys match root->group_pathkeys and if they do
> set the AggPath's pathkeys to list_copy_head(subpath->pathkeys,
> root->num_groupby_pathkeys),  that'll be much cheaper, but it just
> feels a bit too much like a special case.

I tried this approach (patch attached) and it does perform better than
the other patch:

create_agg_path_fix2.patch:
Planning Time: 24.357 ms
Planning Time: 24.293 ms
Planning Time: 24.259 ms

   7.45%  libc.so.6         [.] __memmove_avx_unaligned_erms
   6.90%  [kernel]          [k] clear_page_rep
   5.56%  postgres          [.] bms_is_subset
   5.38%  postgres          [.] bms_equal

I wonder if the attached patch is too much of a special case fix.  I
guess from the lack of complaints previously that there are no other
cases where we could possibly have pathkeys that belong to columns
that are aggregated.  I've not gone to much effort to see if I can
craft a case that hits this without the ORDER BY/DISTINCT aggregate
optimisation, however.

David

Вложения

Re: pg16: XX000: could not find pathkey item to sort

От
David Rowley
Дата:
On Tue, 3 Oct 2023 at 20:16, David Rowley <dgrowleyml@gmail.com> wrote:
> I wonder if the attached patch is too much of a special case fix.  I
> guess from the lack of complaints previously that there are no other
> cases where we could possibly have pathkeys that belong to columns
> that are aggregated.  I've not gone to much effort to see if I can
> craft a case that hits this without the ORDER BY/DISTINCT aggregate
> optimisation, however.

I spent more time on this today.  I'd been wondering if there was any
reason why create_agg_path() would receive a subpath with pathkeys
that were anything but the PlannerInfo's group_pathkeys.  I mean, how
could we do Group Aggregate if it wasn't?  I wondered if grouping sets
might change that, but it seems the group_pathkeys will be set to the
initial grouping set.

Given that, it would seem it's safe to just trim off any pathkey that
was added to the group_pathkeys by
adjust_group_pathkeys_for_groupagg().
PlannerInfo.num_groupby_pathkeys marks the number of pathkeys that
existed in group_pathkeys before adjust_group_pathkeys_for_groupagg()
made any additions, so we can just trim the list length back to that.

I've done this in the attached patch.   I also considered if it was
worth adding a regression test for this and I concluded that there are
better ways to test for this and considered if we should add some code
to createplan.c to check that all Path pathkeys have corresponding
items in the PathTarget.  I've included an additional patch which adds
some code in USE_ASSERT_CHECKING builds to verify this.  Without the
fix it's simple enough to trigger this with a query such as:

select two,count(distinct four) from tenk1 group by two order by two;

Without the fix the additional asserts cause the regression tests to
fail, but with the fix everything passes.

Justin's case is quite an obscure way to hit this as it requires
partitionwise aggregation plus a single partition so that the Append
is removed due to only having a single subplan in setrefs.c.  If there
had been 2 partitions, then the AppendPath wouldn't have inherited the
subpath's pathkeys per code at the end of create_append_path().

So in short, I propose the attached fix without any regression tests
because I feel that any regression test would just mark that there was
a big in create_agg_path() and not really help with ensuring we don't
end up with some similar problem in the future.

I have some concerns that the assert_pathkeys_in_target() function
might be a little heavyweight for USE_ASSERT_CHECKING builds. So I'm
not proposing to commit that without further discussion.

Does anyone feel differently?

If not, I plan to push the attached
strip_aggregate_pathkeys_from_aggpaths_v2.patch early next week.

David

Вложения

Re: pg16: XX000: could not find pathkey item to sort

От
Richard Guo
Дата:

On Thu, Oct 5, 2023 at 2:26 PM David Rowley <dgrowleyml@gmail.com> wrote:
So in short, I propose the attached fix without any regression tests
because I feel that any regression test would just mark that there was
a big in create_agg_path() and not really help with ensuring we don't
end up with some similar problem in the future.

If the pathkeys that were added by adjust_group_pathkeys_for_groupagg()
are computable from the targetlist, it seems that we do not need to trim
them off, because prepare_sort_from_pathkeys() will add resjunk target
entries for them.  But it's also no harm if we trim them off.  So I
think the patch is a pretty safe fix.  +1 to it.
 
I have some concerns that the assert_pathkeys_in_target() function
might be a little heavyweight for USE_ASSERT_CHECKING builds. So I'm
not proposing to commit that without further discussion.

Yeah, it looks like some heavy to call assert_pathkeys_in_target() for
each path node.  Can we run some benchmarks to see how much overhead it
would bring to USE_ASSERT_CHECKING build?

Thanks
Richard

Re: pg16: XX000: could not find pathkey item to sort

От
David Rowley
Дата:
On Sun, 8 Oct 2023 at 23:52, Richard Guo <guofenglinux@gmail.com> wrote:
> On Thu, Oct 5, 2023 at 2:26 PM David Rowley <dgrowleyml@gmail.com> wrote:
>>
>> So in short, I propose the attached fix without any regression tests
>> because I feel that any regression test would just mark that there was
>> a big in create_agg_path() and not really help with ensuring we don't
>> end up with some similar problem in the future.
>
>
> If the pathkeys that were added by adjust_group_pathkeys_for_groupagg()
> are computable from the targetlist, it seems that we do not need to trim
> them off, because prepare_sort_from_pathkeys() will add resjunk target
> entries for them.  But it's also no harm if we trim them off.  So I
> think the patch is a pretty safe fix.  +1 to it.

hmm, I think one of us does not understand what is going on here. I
tried to explain in [1] why we *need* to strip off the pathkeys added
by adjust_group_pathkeys_for_groupagg().

Given the following example:

create table ab (a int,b int);
explain (costs off) select a,count(distinct b) from ab group by a;
         QUERY PLAN
----------------------------
 GroupAggregate
   Group Key: a
   ->  Sort
         Sort Key: a, b
         ->  Seq Scan on ab
(5 rows)

adjust_group_pathkeys_for_groupagg() will add the pathkey for the "b"
column and that results in the Sort node sorting on {a,b}.  It's
simply not at all valid to have the GroupAggregate path claim that its
pathkeys are also (effectively) {a,b}" as "b" does not and cannot
legally exist after the aggregation takes place.  We cannot put a
resjunk "b" in the targetlist of the GroupAggregate either as there
could be any number "b" values aggregated.

Can you explain why you think we can put a resjunk "b" in the target
list of the GroupAggregate in the above case?

>>
>> I have some concerns that the assert_pathkeys_in_target() function
>> might be a little heavyweight for USE_ASSERT_CHECKING builds. So I'm
>> not proposing to commit that without further discussion.
>
>
> Yeah, it looks like some heavy to call assert_pathkeys_in_target() for
> each path node.  Can we run some benchmarks to see how much overhead it
> would bring to USE_ASSERT_CHECKING build?

I think it'll be easy to show that there is an overhead to it.  It'll
be in the realm of the ~41ms patched vs ~24ms unpatched that I showed
in [2].  That's quite an extreme case.

Maybe it's worth checking the total planning time spent in a run of
the regression tests with and without the patch to see how much
overhead it adds to the "average case".

David

[1] https://postgr.es/m/CAApHDvpJJigQRW29TppTOPYp+Aui0mtd3MpfRxyKv=N-tB62jQ@mail.gmail.com
[2] https://postgr.es/m/CAApHDvo7RzcQYw-gnkZr6QCijCqf8vJLkJ4XFk-KawvyAw109Q@mail.gmail.com



Re: pg16: XX000: could not find pathkey item to sort

От
Richard Guo
Дата:

On Mon, Oct 9, 2023 at 7:42 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Sun, 8 Oct 2023 at 23:52, Richard Guo <guofenglinux@gmail.com> wrote:
> If the pathkeys that were added by adjust_group_pathkeys_for_groupagg()
> are computable from the targetlist, it seems that we do not need to trim
> them off, because prepare_sort_from_pathkeys() will add resjunk target
> entries for them.  But it's also no harm if we trim them off.  So I
> think the patch is a pretty safe fix.  +1 to it.

hmm, I think one of us does not understand what is going on here. I
tried to explain in [1] why we *need* to strip off the pathkeys added
by adjust_group_pathkeys_for_groupagg().

Sorry I didn't make myself clear.  I understand why we need to trim off
the pathkeys added by adjust_group_pathkeys_for_groupagg().  What I
meant was that if the new added pathkeys are *computable* from the
existing target entries, then prepare_sort_from_pathkeys() will add
resjunk target entries for them, so there seems to be no problem even if
we do not trim them off.  For example

explain (verbose, costs off)
select a, count(distinct a+1) from prt1 group by a order by a;
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Result
   Output: prt1.a, (count(DISTINCT ((prt1.a + 1))))
   ->  Merge Append
         Sort Key: prt1.a, ((prt1.a + 1))
         ->  GroupAggregate
               Output: prt1.a, count(DISTINCT ((prt1.a + 1))), ((prt1.a + 1))
               Group Key: prt1.a
               ->  Sort
                     Output: prt1.a, ((prt1.a + 1))
                     Sort Key: prt1.a, ((prt1.a + 1))
                     ->  Seq Scan on public.prt1_p1 prt1
                           Output: prt1.a, (prt1.a + 1)
    ...

Expression 'a+1' is *computable* from the existing entry 'a', so we just
add a new resjunk target entry for 'a+1', and there is no error planning
this query.  But if we change 'a+1' to something that is not computable,
then we would have problems (without your fix), and the reason has been
well explained by your messages.

explain (verbose, costs off)
select a, count(distinct b) from prt1 group by a order by a;
ERROR:  could not find pathkey item to sort

Having said that, I think it's the right thing to do to trim off the new
added pathkeys, even if they are *computable*.  In the plan above, the
'(prt1.a + 1)' in GroupAggregate's targetlist and MergeAppend's
pathkeys are actually redundant.  It's good to remove it.
 
Can you explain why you think we can put a resjunk "b" in the target
list of the GroupAggregate in the above case?

Hmm, I don't think we can do that, because 'b' is not *computable* from
the existing target entries, as I explained above.

Thanks
Richard

Re: pg16: XX000: could not find pathkey item to sort

От
David Rowley
Дата:
On Mon, 9 Oct 2023 at 12:42, David Rowley <dgrowleyml@gmail.com> wrote:
> Maybe it's worth checking the total planning time spent in a run of
> the regression tests with and without the patch to see how much
> overhead it adds to the "average case".

I've now pushed the patch that trims off the Pathkeys for the ORDER BY
/ DISTINCT aggregates.

As for the patch to verify the pathkeys during create plan, I patched
master with the attached plan_times.patch.txt and used the following
to check the time spent in the planner for 3 runs of make
installcheck.

$ for i in {1..3}; do pg_ctl start -D pgdata -l plantime.log >
/dev/null && cd pg_src && make installcheck > /dev/null && cd .. &&
grep "planning time in" plantime.log|sed -E -e 's/.*planning time in
(.*) nanoseconds/\1/'|awk '{nanoseconds += $1} END{print nanoseconds}'
&& pg_ctl stop -D pgdata > /dev/null && rm plantime.log; done

Master:
1855788104
1839655412
1740769066

Patched:
1917797221
1766606115
1881322655

Those results are a bit noisy.  Perhaps a few more runs might yield
more consistency, but it seems that there's not too much overhead to
it. If I take the minimum value out of the 3 runs from each, it comes
to about 1.5% extra time spent in planning.  Perhaps that's OK.

David

Вложения

Re: pg16: XX000: could not find pathkey item to sort

От
Richard Guo
Дата:

On Mon, Oct 9, 2023 at 12:13 PM David Rowley <dgrowleyml@gmail.com> wrote:
I've now pushed the patch that trims off the Pathkeys for the ORDER BY
/ DISTINCT aggregates.

Thanks for pushing!
 
Those results are a bit noisy.  Perhaps a few more runs might yield
more consistency, but it seems that there's not too much overhead to
it. If I take the minimum value out of the 3 runs from each, it comes
to about 1.5% extra time spent in planning.  Perhaps that's OK.

I agree that the overhead is acceptable, especially it only happens in
USE_ASSERT_CHECKING builds.

Thanks
Richard

Re: pg16: XX000: could not find pathkey item to sort

От
Alexander Lakhin
Дата:
Hello David,

09.10.2023 07:13, David Rowley wrote:
> On Mon, 9 Oct 2023 at 12:42, David Rowley <dgrowleyml@gmail.com> wrote:
>> Maybe it's worth checking the total planning time spent in a run of
>> the regression tests with and without the patch to see how much
>> overhead it adds to the "average case".
> I've now pushed the patch that trims off the Pathkeys for the ORDER BY
> / DISTINCT aggregates.
>

I've stumbled upon the same error, but this time it apparently has another
cause. It can be produced (on REL_16_STABLE and master) as follows:
CREATE TABLE t (a int, b int) PARTITION BY RANGE (a);
CREATE TABLE td PARTITION OF t DEFAULT;
CREATE TABLE tp1 PARTITION OF t FOR VALUES FROM (1) TO (2);
SET enable_partitionwise_aggregate = on;
SET parallel_setup_cost = 0;
SELECT a, sum(b order by b) FROM t GROUP BY a ORDER BY a;

ERROR:  could not find pathkey item to sort

`git bisect` for this anomaly blames the same commit 1349d2790.

Best regards,
Alexander



Re: pg16: XX000: could not find pathkey item to sort

От
David Rowley
Дата:
On Thu, 14 Mar 2024 at 06:00, Alexander Lakhin <exclusion@gmail.com> wrote:
> I've stumbled upon the same error, but this time it apparently has another
> cause. It can be produced (on REL_16_STABLE and master) as follows:
> CREATE TABLE t (a int, b int) PARTITION BY RANGE (a);
> CREATE TABLE td PARTITION OF t DEFAULT;
> CREATE TABLE tp1 PARTITION OF t FOR VALUES FROM (1) TO (2);
> SET enable_partitionwise_aggregate = on;
> SET parallel_setup_cost = 0;
> SELECT a, sum(b order by b) FROM t GROUP BY a ORDER BY a;
>
> ERROR:  could not find pathkey item to sort
>
> `git bisect` for this anomaly blames the same commit 1349d2790.

Thanks for finding and for the recreator script.

I've attached a patch which fixes the problem for me.

On debugging this I uncovered some other stuff that looks broken which
seems to caused by partition-wise aggregates.  With your example
query, in get_useful_pathkeys_for_relation(), we call
relation_can_be_sorted_early() to check if the pathkey can be used as
a set of pathkeys in useful_pathkeys_list.  The problem is that in
your query the 'rel' is the base relation belonging to the partitioned
table and relation_can_be_sorted_early() looks through the targetlist
for that relation and finds columns "a" and "b" in there.  The problem
is "b" has been aggregated away as partial aggregation has taken place
due to the partition-wise aggregation. I believe whichever rel we
should be using there should have an Aggref in the target exprs rather
than the plain unaggregated column.  I've added Robert and Ashutosh to
see what their thoughts are on this.

David

Вложения

Re: pg16: XX000: could not find pathkey item to sort

От
Ashutosh Bapat
Дата:


On Thu, Mar 14, 2024 at 4:30 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Thu, 14 Mar 2024 at 06:00, Alexander Lakhin <exclusion@gmail.com> wrote:
> I've stumbled upon the same error, but this time it apparently has another
> cause. It can be produced (on REL_16_STABLE and master) as follows:
> CREATE TABLE t (a int, b int) PARTITION BY RANGE (a);
> CREATE TABLE td PARTITION OF t DEFAULT;
> CREATE TABLE tp1 PARTITION OF t FOR VALUES FROM (1) TO (2);
> SET enable_partitionwise_aggregate = on;
> SET parallel_setup_cost = 0;
> SELECT a, sum(b order by b) FROM t GROUP BY a ORDER BY a;
>
> ERROR:  could not find pathkey item to sort
>
> `git bisect` for this anomaly blames the same commit 1349d2790.

Thanks for finding and for the recreator script.

I've attached a patch which fixes the problem for me.

On debugging this I uncovered some other stuff that looks broken which
seems to caused by partition-wise aggregates.  With your example
query, in get_useful_pathkeys_for_relation(), we call
relation_can_be_sorted_early() to check if the pathkey can be used as
a set of pathkeys in useful_pathkeys_list.  The problem is that in
your query the 'rel' is the base relation belonging to the partitioned
table and relation_can_be_sorted_early() looks through the targetlist
for that relation and finds columns "a" and "b" in there. The problem
is "b" has been aggregated away as partial aggregation has taken place
due to the partition-wise aggregation. I believe whichever rel we
should be using there should have an Aggref in the target exprs rather
than the plain unaggregated column.  I've added Robert and Ashutosh to
see what their thoughts are on this.

I don't understand why root->query_pathkeys has both a and b. "a" is there because of GROUP BY and ORDER BY clause. But why "b"?

Under the debugger this is what I observed: generate_useful_gather_paths() gets called twice, once for the base relation and second time for the upper relation.

When it's called for base relation, it includes "a" and "b" both in the useful pathkeys. The plan doesn't use sortedness on b. But I don't think that's the problem of the relation used. It looks like root->query_pathkeys containing "b" may be a problem.

When it's called for upper relation, the reltarget has "a" and Aggref() and it includes only "a" in the useful pathkeys which is as per your expectation.

--
Best Wishes,
Ashutosh Bapat

Re: pg16: XX000: could not find pathkey item to sort

От
David Rowley
Дата:
On Thu, 14 Mar 2024 at 18:23, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
> I don't understand why root->query_pathkeys has both a and b. "a" is there because of GROUP BY and ORDER BY clause.
Butwhy "b"?
 

So that the ORDER BY aggregate function can be evaluated without
nodeAgg.c having to perform the sort. See
adjust_group_pathkeys_for_groupagg().

David



Re: pg16: XX000: could not find pathkey item to sort

От
David Rowley
Дата:
On Thu, 14 Mar 2024 at 12:00, David Rowley <dgrowleyml@gmail.com> wrote:
> I've attached a patch which fixes the problem for me.

I've pushed the patch to fix gather_grouping_paths().  The issue with
the RelOptInfo having the incorrect PathTarget->exprs after the
partial phase of partition-wise aggregate remains.

David



Re: pg16: XX000: could not find pathkey item to sort

От
Ashutosh Bapat
Дата:


On Thu, Mar 14, 2024 at 3:45 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Thu, 14 Mar 2024 at 18:23, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
> I don't understand why root->query_pathkeys has both a and b. "a" is there because of GROUP BY and ORDER BY clause. But why "b"?

So that the ORDER BY aggregate function can be evaluated without
nodeAgg.c having to perform the sort. See
adjust_group_pathkeys_for_groupagg().

Thanks. To me, it looks like we are gathering pathkeys, which if used to sort the result of overall join, would avoid sorting in as many as aggregates as possible.

relation_can_be_sorted_early() finds, pathkeys which if used to sort the given relation, would help sorting the overall join. Contrary to what I said earlier, it might help if the base relation is sorted on "a" and "b". What I find weird is that the sorting is not pushed down to the partitions, where it would help most.

#explain verbose SELECT a, sum(b order by b) FROM t GROUP BY a ORDER BY a;
                                     QUERY PLAN                                    
------------------------------------------------------------------------------------
 GroupAggregate  (cost=362.21..398.11 rows=200 width=12)
   Output: t.a, sum(t.b ORDER BY t.b)
   Group Key: t.a
   ->  Sort  (cost=362.21..373.51 rows=4520 width=8)
         Output: t.a, t.b
         Sort Key: t.a, t.b
         ->  Append  (cost=0.00..87.80 rows=4520 width=8)
               ->  Seq Scan on public.tp1 t_1  (cost=0.00..32.60 rows=2260 width=8)
                     Output: t_1.a, t_1.b
               ->  Seq Scan on public.td t_2  (cost=0.00..32.60 rows=2260 width=8)
                     Output: t_2.a, t_2.b
(11 rows)

and that's the case even without parallel plans

#explain verbose SELECT a, sum(b order by b) FROM t GROUP BY a ORDER BY a;
                                     QUERY PLAN                                    
------------------------------------------------------------------------------------
 GroupAggregate  (cost=362.21..398.11 rows=200 width=12)
   Output: t.a, sum(t.b ORDER BY t.b)
   Group Key: t.a
   ->  Sort  (cost=362.21..373.51 rows=4520 width=8)
         Output: t.a, t.b
         Sort Key: t.a, t.b
         ->  Append  (cost=0.00..87.80 rows=4520 width=8)
               ->  Seq Scan on public.tp1 t_1  (cost=0.00..32.60 rows=2260 width=8)
                     Output: t_1.a, t_1.b
               ->  Seq Scan on public.td t_2  (cost=0.00..32.60 rows=2260 width=8)
                     Output: t_2.a, t_2.b
(11 rows)

But it could be just because the corresponding plan was not found to be optimal. May be because there isn't enough data in those tables.

If the problem you speculate is different from this one, I am not able to see it. It might help give an example query or explain more.

--
Best Wishes,
Ashutosh Bapat

Re: pg16: XX000: could not find pathkey item to sort

От
David Rowley
Дата:
On Mon, 18 Mar 2024 at 18:50, Ashutosh Bapat
<ashutosh.bapat.oss@gmail.com> wrote:
> If the problem you speculate is different from this one, I am not able to see it. It might help give an example query
orexplain more.
 

I looked at this again and I might have been wrong about there being a
problem.  I set a breakpoint in create_gather_merge_path() and
adjusted the startup and total cost to 1 when I saw the pathkeys
containing {a,b}.  It turns out this is the non-partitionwise
aggregate path, and of course, the targetlist there does contain the
"b" column, so it's fine in that case that the pathkeys are {a,b}.   I
had previously thought that this was for the partition-wise aggregate
plan, in which case the targetlist would contain a, sum(b order by b),
of which there's no single value of "b" that we can legally sort by.

Here's the full plan.

postgres=# explain verbose SELECT a, sum(b order by b) FROM t GROUP BY
a ORDER BY a;
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 GroupAggregate  (cost=1.00..25.60 rows=200 width=12)
   Output: t.a, sum(t.b ORDER BY t.b)
   Group Key: t.a
   ->  Gather Merge  (cost=1.00..1.00 rows=4520 width=8)
         Output: t.a, t.b
         Workers Planned: 2
         ->  Sort  (cost=158.36..163.07 rows=1882 width=8)
               Output: t.a, t.b
               Sort Key: t.a, t.b
               ->  Parallel Append  (cost=0.00..56.00 rows=1882 width=8)
                     ->  Parallel Seq Scan on public.tp1 t_1
(cost=0.00..23.29 rows=1329 width=8)
                           Output: t_1.a, t_1.b
                     ->  Parallel Seq Scan on public.td t_2
(cost=0.00..23.29 rows=1329 width=8)
                           Output: t_2.a, t_2.b
(14 rows)

David