Обсуждение: Introduce list_reverse() to make lcons() usage less inefficient

Поиск
Список
Период
Сортировка

Introduce list_reverse() to make lcons() usage less inefficient

От
David Rowley
Дата:
While working on [1] to make improvements in the query planner around
the speed to find EquivalenceMembers in an EquivalenceClass, because
that patch does have a large impact in terms of performance
improvements, some performance tests with that patch started to
highlight some other places that bottleneck the planner's performance.

One of those places is in generate_orderedappend_paths() when we find
that the required sort order is the same as the reverse of the
partition order.  In this case, we build a list of paths for each
partition using the lcons() function. Since Lists are now arrays in
PostgreSQL, lcons() isn't as efficient as it once was and it must
memmove the entire existing contents of the list up one element to
make way to prepend the new element. This is effectively quadratic and
becomes noticeable with a large number of partitions.

One way we could solve that is to just lappend() the new item and then
just reverse the order of the list only when we need to.  This has the
added advantage of removing a bunch of semi-duplicated code from
generate_orderedappend_paths(). It also has a noticeable impact on the
planner's performance.

I did a quick test with:

create table lp (a int, b int) partition by list(a);
select 'create table lp'||x::text||' partition of lp for values
in('||x::text||');' from generate_Series(1,10000)x;
\gexec
create index on lp(a);

Using: psql -c "explain (analyze, timing off) select * from lp order
by a desc" postgres | grep "Planning Time"

master:
Planning Time: 6034.765 ms
Planning Time: 5919.914 ms
Planning Time: 5720.529 ms

master + eclass idx (from [1]) (yes, it really is this much faster)
Planning Time: 549.262 ms
Planning Time: 489.023 ms
Planning Time: 497.803 ms

master + eclass idx + list_reverse (attached)
Planning Time: 517.067 ms
Planning Time: 463.613 ms
Planning Time: 463.036 ms

I suspect there won't be much controversy here and there's certainly
not much complexity, so in absence of anyone voicing an opinion here,
I'm inclined to not waste too much time on this one and just get it
done. I'll leave it for a few days.

David

[1] https://postgr.es/m/flat/CAJ2pMkZNCgoUKSE+_5LthD+KbXKvq6h2hQN8Esxpxd+cxmgomg@mail.gmail.com

Вложения

Re: Introduce list_reverse() to make lcons() usage less inefficient

От
Andres Freund
Дата:
Hi,

On 2023-02-17 11:36:40 +1300, David Rowley wrote:
> While working on [1] to make improvements in the query planner around
> the speed to find EquivalenceMembers in an EquivalenceClass, because
> that patch does have a large impact in terms of performance
> improvements, some performance tests with that patch started to
> highlight some other places that bottleneck the planner's performance.
> 
> One of those places is in generate_orderedappend_paths() when we find
> that the required sort order is the same as the reverse of the
> partition order.  In this case, we build a list of paths for each
> partition using the lcons() function. Since Lists are now arrays in
> PostgreSQL, lcons() isn't as efficient as it once was and it must
> memmove the entire existing contents of the list up one element to
> make way to prepend the new element. This is effectively quadratic and
> becomes noticeable with a large number of partitions.

I have wondered before if we eventually ought to switch to embedded lists for
some planner structures, including paths. add_path() inserts/deletes at points
in the middle of the list, which isn't great.


> One way we could solve that is to just lappend() the new item and then
> just reverse the order of the list only when we need to.

That's not generally the same as lcons() ing, but I guess it's fine here,
because we build the lists from scratch, so the reversing actually yields the
correct result.

But wouldn't an even cheaper way here be to iterate over the children in
reverse order when match_partition_order_desc? We can do that efficiently
now. Looks like we don't have a readymade helper for it, but it'd be easy
enough to add or open code.

Greetings,

Andres Freund



Re: Introduce list_reverse() to make lcons() usage less inefficient

От
Tom Lane
Дата:
Andres Freund <andres@anarazel.de> writes:
> On 2023-02-17 11:36:40 +1300, David Rowley wrote:
>> One of those places is in generate_orderedappend_paths() when we find
>> that the required sort order is the same as the reverse of the
>> partition order.  In this case, we build a list of paths for each
>> partition using the lcons() function. Since Lists are now arrays in
>> PostgreSQL, lcons() isn't as efficient as it once was and it must
>> memmove the entire existing contents of the list up one element to
>> make way to prepend the new element. This is effectively quadratic and
>> becomes noticeable with a large number of partitions.

> I have wondered before if we eventually ought to switch to embedded lists for
> some planner structures, including paths. add_path() inserts/deletes at points
> in the middle of the list, which isn't great.

I'm not hugely excited about that, because it presumes that paths appear
in only one list, which isn't true.  We could perhaps privilege
RelOptInfo.pathlist over other cases, but that'd be asymmetrical and
probably bug-inducing.

            regards, tom lane



Re: Introduce list_reverse() to make lcons() usage less inefficient

От
David Rowley
Дата:
On Fri, 17 Feb 2023 at 13:23, Andres Freund <andres@anarazel.de> wrote:
> But wouldn't an even cheaper way here be to iterate over the children in
> reverse order when match_partition_order_desc? We can do that efficiently
> now. Looks like we don't have a readymade helper for it, but it'd be easy
> enough to add or open code.

That seems fair.  I think open coding is a better option.  I had a go
at foreach_reverse recently and decided to keep clear of it due to
behavioural differences with foreach_delete_current().

I've attached a patch for this.  It seems to have similar performance
to the list_reverse()

$ psql -c "explain (analyze, timing off) select * from lp order by a
desc" postgres | grep "Planning Time"
 Planning Time: 522.554 ms <- cold relcache
 Planning Time: 467.776 ms
 Planning Time: 466.424 ms

David

Вложения

Re: Introduce list_reverse() to make lcons() usage less inefficient

От
David Rowley
Дата:
On Fri, 17 Feb 2023 at 16:35, David Rowley <dgrowleyml@gmail.com> wrote:
>
> On Fri, 17 Feb 2023 at 13:23, Andres Freund <andres@anarazel.de> wrote:
> > But wouldn't an even cheaper way here be to iterate over the children in
> > reverse order when match_partition_order_desc? We can do that efficiently
> > now. Looks like we don't have a readymade helper for it, but it'd be easy
> > enough to add or open code.
>
> That seems fair.  I think open coding is a better option.  I had a go
> at foreach_reverse recently and decided to keep clear of it due to
> behavioural differences with foreach_delete_current().

I've pushed a patch for this now. Thank you for the idea.

David