Обсуждение: Plan not skipping unnecessary inner join

Поиск
Список
Период
Сортировка

Plan not skipping unnecessary inner join

От
"Matthew Nelson"
Дата:
I noticed something peculiar while optimizing complex views today. The query planner does not skip inner joins that, to
myunderstanding, can have no impact on the result. Am I missing a situation where these joins could impact the result?
 

The following demonstrates the problem without the complex views. It also demonstrates how the planner simplifies a
LEFTJOIN in the same situation. The left and right sides of an inner join could be swapped, obviously, but here I kept
theunique constraint on the right.
 



CREATE TABLE foo (
        id INTEGER PRIMARY KEY
);

CREATE TABLE bar (
        foo_id INTEGER NOT NULL REFERENCES foo
);

-- This simplifies to SELECT COUNT(*) FROM bar;
EXPLAIN SELECT COUNT(*)
FROM bar
LEFT JOIN foo ON bar.foo_id = foo.id;

-- This should simplify to SELECT COUNT(*) FROM bar WHERE foo_id IS NOT NULL;
-- The presence of a NOT NULL constraint on foo_id has no effect.
EXPLAIN SELECT COUNT(*)
FROM bar
INNER JOIN foo ON bar.foo_id = foo.id;



                         QUERY PLAN                          
-------------------------------------------------------------
 Aggregate  (cost=38.25..38.26 rows=1 width=8)
   ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=0)
(2 rows)

                               QUERY PLAN                                
-------------------------------------------------------------------------
 Aggregate  (cost=111.57..111.58 rows=1 width=8)
   ->  Hash Join  (cost=67.38..105.92 rows=2260 width=0)
         Hash Cond: (bar.foo_id_not_null = foo.id)
         ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=4)
         ->  Hash  (cost=35.50..35.50 rows=2550 width=4)
               ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
(6 rows)

                                                      version                                                      
-------------------------------------------------------------------------------------------------------------------
 PostgreSQL 12.2 on x86_64-apple-darwin19.4.0, compiled by Apple clang version 11.0.3 (clang-1103.0.32.59), 64-bit
(1 row)



Re: Plan not skipping unnecessary inner join

От
David Wheeler
Дата:
> Am I missing a situation where these joins could impact the result?

Yes it will impact the number of rows in the result. for example if foo is empty then postgres is required to return no
results,regardless of how many rows are in bar. This is why it can ignore the table in the left join 

— David

> On 14 May 2020, at 1:44 pm, Matthew Nelson <postgres@binarykeep.com> wrote:
>
> I noticed something peculiar while optimizing complex views today. The query planner does not skip inner joins that,
tomy understanding, can have no impact on the result. Am I missing a situation where these joins could impact the
result?
>
> The following demonstrates the problem without the complex views. It also demonstrates how the planner simplifies a
LEFTJOIN in the same situation. The left and right sides of an inner join could be swapped, obviously, but here I kept
theunique constraint on the right. 
>
>
>
> CREATE TABLE foo (
>        id INTEGER PRIMARY KEY
> );
>
> CREATE TABLE bar (
>        foo_id INTEGER NOT NULL REFERENCES foo
> );
>
> -- This simplifies to SELECT COUNT(*) FROM bar;
> EXPLAIN SELECT COUNT(*)
> FROM bar
> LEFT JOIN foo ON bar.foo_id = foo.id;
>
> -- This should simplify to SELECT COUNT(*) FROM bar WHERE foo_id IS NOT NULL;
> -- The presence of a NOT NULL constraint on foo_id has no effect.
> EXPLAIN SELECT COUNT(*)
> FROM bar
> INNER JOIN foo ON bar.foo_id = foo.id;
>
>
>
>                         QUERY PLAN
> -------------------------------------------------------------
> Aggregate  (cost=38.25..38.26 rows=1 width=8)
>   ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=0)
> (2 rows)
>
>                               QUERY PLAN
> -------------------------------------------------------------------------
> Aggregate  (cost=111.57..111.58 rows=1 width=8)
>   ->  Hash Join  (cost=67.38..105.92 rows=2260 width=0)
>         Hash Cond: (bar.foo_id_not_null = foo.id)
>         ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=4)
>         ->  Hash  (cost=35.50..35.50 rows=2550 width=4)
>               ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
> (6 rows)
>
>                                                      version
> -------------------------------------------------------------------------------------------------------------------
> PostgreSQL 12.2 on x86_64-apple-darwin19.4.0, compiled by Apple clang version 11.0.3 (clang-1103.0.32.59), 64-bit
> (1 row)
>
>




Re: Plan not skipping unnecessary inner join

От
"Matthew Nelson"
Дата:
But foo cannot be empty given the foreign key constraint on bar.foo_id.

I suppose that explains the planner's behavior. While today's planner can take a shortcut when it finds a unique
constraint,it does not check for foreign key constraints matching the join condition? So the LEFT JOIN simplification
ispossible but not the INNER JOIN simplification. 

Matthew

On Wed, May 13, 2020, at 11:45 PM, David Wheeler wrote:
> > Am I missing a situation where these joins could impact the result?
>
> Yes it will impact the number of rows in the result. for example if foo
> is empty then postgres is required to return no results, regardless of
> how many rows are in bar. This is why it can ignore the table in the
> left join
>
> — David
>
> > On 14 May 2020, at 1:44 pm, Matthew Nelson <postgres@binarykeep.com> wrote:
> >
> > I noticed something peculiar while optimizing complex views today. The query planner does not skip inner joins
that,to my understanding, can have no impact on the result. Am I missing a situation where these joins could impact the
result?
> >
> > The following demonstrates the problem without the complex views. It also demonstrates how the planner simplifies a
LEFTJOIN in the same situation. The left and right sides of an inner join could be swapped, obviously, but here I kept
theunique constraint on the right. 
> >
> >
> >
> > CREATE TABLE foo (
> >        id INTEGER PRIMARY KEY
> > );
> >
> > CREATE TABLE bar (
> >        foo_id INTEGER NOT NULL REFERENCES foo
> > );
> >
> > -- This simplifies to SELECT COUNT(*) FROM bar;
> > EXPLAIN SELECT COUNT(*)
> > FROM bar
> > LEFT JOIN foo ON bar.foo_id = foo.id;
> >
> > -- This should simplify to SELECT COUNT(*) FROM bar WHERE foo_id IS NOT NULL;
> > -- The presence of a NOT NULL constraint on foo_id has no effect.
> > EXPLAIN SELECT COUNT(*)
> > FROM bar
> > INNER JOIN foo ON bar.foo_id = foo.id;
> >
> >
> >
> >                         QUERY PLAN
> > -------------------------------------------------------------
> > Aggregate  (cost=38.25..38.26 rows=1 width=8)
> >   ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=0)
> > (2 rows)
> >
> >                               QUERY PLAN
> > -------------------------------------------------------------------------
> > Aggregate  (cost=111.57..111.58 rows=1 width=8)
> >   ->  Hash Join  (cost=67.38..105.92 rows=2260 width=0)
> >         Hash Cond: (bar.foo_id = foo.id)
> >         ->  Seq Scan on bar  (cost=0.00..32.60 rows=2260 width=4)
> >         ->  Hash  (cost=35.50..35.50 rows=2550 width=4)
> >               ->  Seq Scan on foo  (cost=0.00..35.50 rows=2550 width=4)
> > (6 rows)
> >
> >                                                      version
> > -------------------------------------------------------------------------------------------------------------------
> > PostgreSQL 12.2 on x86_64-apple-darwin19.4.0, compiled by Apple clang version 11.0.3 (clang-1103.0.32.59), 64-bit
> > (1 row)
> >
> >
>
>



Re: Plan not skipping unnecessary inner join

От
Ranier Vilela
Дата:
I redid the same tests with vanila postgres and with empty tables.
I'm surprised, why does the plan have 2550 rows in explain?

regards,
Ranier Vilela

Re: Plan not skipping unnecessary inner join

От
Justin Pryzby
Дата:
On Sun, May 17, 2020 at 09:32:47AM -0300, Ranier Vilela wrote:
> I redid the same tests with vanila postgres and with empty tables.
> I'm surprised, why does the plan have 2550 rows in explain?

That's the *estimated* rowcount.

The planner tends to ignore table statistics which say the table is empty,
since that can lead to a terrible plan if it's not true (stats are out of date
or autovacuum threshold not hit).

See also here
https://www.postgresql.org/message-id/20171110204043.GS8563%40telsasoft.com

-- 
Justin



Re: Plan not skipping unnecessary inner join

От
Ranier Vilela
Дата:
Em dom., 17 de mai. de 2020 às 10:31, Justin Pryzby <pryzby@telsasoft.com> escreveu:
On Sun, May 17, 2020 at 09:32:47AM -0300, Ranier Vilela wrote:
> I redid the same tests with vanila postgres and with empty tables.
> I'm surprised, why does the plan have 2550 rows in explain?

That's the *estimated* rowcount.

The planner tends to ignore table statistics which say the table is empty,
since that can lead to a terrible plan if it's not true (stats are out of date
or autovacuum threshold not hit).
Thanks for the explanation.

regards,
Ranier Vilela