Обсуждение: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

Поиск

Список

Период

Сортировка

Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

От

David Rowley

Дата:

25 октября, 04:39:39

Since e2debb643, we've had the ability to determine if a column is
NULLable as early as during constant folding.  This seems like a good
time to consider converting COUNT(not_null_col) into COUNT(*), which
is faster and may result in far fewer columns being deformed from the
tuple.

To make this work, I invented "SupportRequestSimplifyAggref", which is
similar to the existing SupportRequestSimplify, which is for
FuncExprs.  Aggregates use Aggrefs, so we need something else.

It's easy to see that count(*) is faster. Here's a quick test in an
unpatched master:

create table t (a int, b int, c int, d int, e int, f int, g int, h int
not null);
insert into t (h) select 1 from generate_Series(1,1000000);
vacuum freeze t;

master:
select count(h) from t;
Time: 16.442 ms
Time: 16.255 ms
Time: 16.322 ms

master:
select count(*) from t;
Time: 12.203 ms
Time: 11.402 ms
Time: 12.054 ms (+37%)

With the patch applied, both queries will perform the same.

It may be possible to apply transformations to other aggregate
functions too, but I don't want to discuss that here. I mostly want to
determine if the infrastructure is ok and do the count(*) one because
it seems like the most likely one to be useful.

One thing I wasn't too sure about was if we should make it possible
for the support function to return something that's not an Aggref.  In
theory, something like COUNT(NULL) could just return '0'::bigint.
While that does seem an optimisation that wouldn't be applied very
often, I have opted to leave it so that such an optimisation *could*
be done by the support function. I also happen to test that that
doesn't entirely break the query, as ordinarily it would if we didn't
have Query.hasAggs (It's not too dissimilar to removing unused columns
from a subquery)

Should we do this?

David

Вложения

v1-0001-Have-the-planner-replace-COUNT-ANY-with-COUNT-whe.patch

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

От

Corey Huinker

Дата:

30 октября, 06:40:45

It may be possible to apply transformations to other aggregate
functions too, but I don't want to discuss that here. I mostly want to
determine if the infrastructure is ok and do the count(*) one because
it seems like the most likely one to be useful.

One thing I wasn't too sure about was if we should make it possible
for the support function to return something that's not an Aggref. In
theory, something like COUNT(NULL) could just return '0'::bigint.
While that does seem an optimisation that wouldn't be applied very
often, I have opted to leave it so that such an optimisation *could*
be done by the support function. I also happen to test that that
doesn't entirely break the query, as ordinarily it would if we didn't
have Query.hasAggs (It's not too dissimilar to removing unused columns
from a subquery)

Should we do this?

David

Automatic query improvements are a Good Thing. We're never going to educate everyone that COUNT(1) is an anti-pattern, so it's easier to make it not an anti-pattern.

I'm in favor of the COUNT(NULL) optimization as well, as one of my favorite programming tropes is "There is nothing faster than nothing".

The check seems lightweight enough to me. Applies clean and tests pass. Test coverage seems to cover all the cases.

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

От

David Rowley

Дата:

30 октября, 07:20:55

On Thu, 30 Oct 2025 at 16:40, Corey Huinker <corey.huinker@gmail.com> wrote:
> I'm in favor of the COUNT(NULL) optimization as well, as one of my favorite programming tropes is "There is nothing
fasterthan nothing".

I think it would be much more interesting to do that if it could be
detected in cases like:

SELECT COUNT(col) FROM table WHERE col IS NULL;

which might be a more realistic thing if the query without the WHERE
clause was part of a VIEW. However, we don't currently have any
infrastructure to detect when a column *is* NULL. There's only the
opposite with expr_is_nonnullable() or var_is_nonnullable().

This does make me wonder if constant-folding is too early to do this
transformation, as it currently happens before
add_base_clause_to_rel() therefore we can't really transform cases
such as:

SELECT count(nullable_col) FROM t WHERE nullable_col IS NOT NULL;

There might be a better spot in planning to do this at a point after
add_base_clause_to_rel() is called. It just has to happen before the
search for Aggrefs with the same aggtransfn in preprocess_aggref() as
it's too late to swap aggregate functions around when they've already
been grouped together with other aggs that can share the same
transition state. I'm just subtly aware about Tom's complaints with
the restriction_is_always_true() code as he thought it should go into
the constant folding code, where it mostly now is, per Richard's work
to put it there.

> The check seems lightweight enough to me. Applies clean and tests pass. Test coverage seems to cover all the cases.

Thanks for having a look and testing.

I've attached a very slightly revised version of the patch. I became
aware of a function named get_func_support(), which can be used rather
than fetching the pg_proc tuple from SysCache, which I was doing in
v1.  No other changes.

David

Вложения

v2-0001-Have-the-planner-replace-COUNT-ANY-with-COUNT-whe.patch

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

От

Corey Huinker

Дата:

01 ноября, 04:18:53

which might be a more realistic thing if the query without the WHERE
clause was part of a VIEW. However, we don't currently have any
infrastructure to detect when a column *is* NULL. There's only the
opposite with expr_is_nonnullable() or var_is_nonnullable().

But we'd still catch NULL constants, yes?

I've attached a very slightly revised version of the patch. I became
aware of a function named get_func_support(), which can be used rather
than fetching the pg_proc tuple from SysCache, which I was doing in
v1. No other changes.

David

The change is a big win for clarity. Applies clean. Passes.

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

От

"Matheus Alcantara"

Дата:

03 ноября, 23:38:35

Hi, thanks for working on this!

On Thu Oct 30, 2025 at 1:20 AM -03, David Rowley wrote:
> I've attached a very slightly revised version of the patch. I became
> aware of a function named get_func_support(), which can be used rather
> than fetching the pg_proc tuple from SysCache, which I was doing in
> v1.  No other changes.
>
I looked the code and it seems to be in a good shape, but I tried to
apply the v2 on top of e7ccb247b38 in master to run some tests and a
rebase is necessary.

git am v2-0001-Have-the-planner-replace-COUNT-ANY-with-COUNT-whe.patch
Applying: Have the planner replace COUNT(ANY) with COUNT(*), when possible
error: patch failed: contrib/postgres_fdw/expected/postgres_fdw.out:2975
error: contrib/postgres_fdw/expected/postgres_fdw.out: patch does not apply
error: patch failed: src/test/regress/expected/aggregates.out:1219
error: src/test/regress/expected/aggregates.out: patch does not apply
Patch failed at 0001 Have the planner replace COUNT(ANY) with COUNT(*), when possible

--
Matheus Alcantara

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

От

David Rowley

Дата:

04 ноября, 01:47:08

On Tue, 4 Nov 2025 at 09:38, Matheus Alcantara <matheusssilv97@gmail.com> wrote:
> I looked the code and it seems to be in a good shape, but I tried to
> apply the v2 on top of e7ccb247b38 in master to run some tests and a
> rebase is necessary.

Are you sure you've not got something else in your branch? It applies
ok here, and the CFbot isn't complaining either. CFBot's is based on
cf8be0225, which is 2 commits before the one you're trying, but
src/test/regress/expected/aggregates.out hasn't been changed since
2025-10-07.

David

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

От

David Rowley

Дата:

04 ноября, 01:49:15

On Sat, 1 Nov 2025 at 14:19, Corey Huinker <corey.huinker@gmail.com> wrote:
>>
>> which might be a more realistic thing if the query without the WHERE
>> clause was part of a VIEW. However, we don't currently have any
>> infrastructure to detect when a column *is* NULL. There's only the
>> opposite with expr_is_nonnullable() or var_is_nonnullable().
>
> But we'd still catch NULL constants, yes?

Yes. It could. I've left that part of the patch #ifdef'd out. I wasn't
planning on using it. I just left it there as an example for if
someone wanted to test it.

>  The change is a big win for clarity. Applies clean. Passes.

Thanks for checking.

David

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Обсуждение: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

Вложения

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

Вложения

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)

Re: Have the planner convert COUNT(1) / COUNT(not_null_col) to COUNT(*)