Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)

Поиск
Список
Период
Сортировка
От Andy Fan
Тема Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)
Дата
Msg-id CAKU4AWo0K+85q-R8Hs_r9n47CV6_m6-GG_hCFCJcqhmBCbRDhw@mail.gmail.com
обсуждение исходный текст
Ответ на Re: Fix incorrect start up costs for WindowAgg paths (bug #17862)  (David Rowley <dgrowleyml@gmail.com>)
Список pgsql-hackers


On Thu, Apr 13, 2023 at 6:09 AM David Rowley <dgrowleyml@gmail.com> wrote:
.On Thu, 13 Apr 2023 at 02:28, Andy Fan <zhihui.fan1213@gmail.com> wrote:
> The concept of startup_tuples for a WindowAgg looks good to me, but I
> can't follow up with the below line:
>
> + return clamp_row_est(partition_tuples * DEFAULT_INEQ_SEL);
>
> # select count(*) over() from tenk1 limit 1;
>  count
> -------
>  10000  -->  We need to scan all the tuples.
>
> Should we just return clamp_row_est(partition_tuples)?

For the case you've shown, it will.  It's handled by this code:

if (wc->orderClause == NIL)
    return clamp_row_est(partition_tuples);

My fault.  I should have real debugging to double check my
understanding, surely I will next time. 

It would take something like the following to hit the code you're
concerned about:

explain select count(*) over(order by unique1 rows between unbounded
preceding and 10*random() following) from tenk1;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 WindowAgg  (cost=140.23..420.29 rows=10000 width=12)
   ->  Index Only Scan using tenk1_unique1 on tenk1
(cost=0.29..270.29 rows=10000 width=4)
(2 rows)

You can see the startup cost is about 33% of the total cost for that,
which is from the DEFAULT_INEQ_SEL.  I'm not exactly set on that
having to be DEFAULT_INEQ_SEL, but I'm not really sure what we could
put that's better. I don't really follow why assuming all rows are
required is better.  That'll just mean we favour cheap startup plans
less, but there might be a case where a cheap startup plan is
favourable. I was opting for a happy medium when I thought to use
DEFAULT_INEQ_SEL.

That looks reasonable to me.  My suggestion came from my misreading
before,  It was a bit late in my time zone when writing. Thanks for the
detailed explanation! 
 

I also see I might need to do a bit more work on this as the following
is not handled correctly:

select count(*) over(rows between unbounded preceding and 10
following) from tenk1;

it's assuming all rows due to lack of ORDER BY, but it seems like it
should be 10 rows due to the 10 FOLLOWING end bound.


True to me.   


--
Best Regards
Andy Fan

В списке pgsql-hackers по дате отправления:

Предыдущее
От: Daniel Gustafsson
Дата:
Сообщение: Re: [PATCH] Add `verify-system` sslmode to use system CA pool for server cert
Следующее
От: Michael Paquier
Дата:
Сообщение: Clean up hba.c of code freeing regexps