BUG #16390: Regression between 12.2 and 11.6 on a recursive query : very slow and overestimation of rows

Поиск
Список
Период
Сортировка
От PG Bug reporting form
Тема BUG #16390: Regression between 12.2 and 11.6 on a recursive query : very slow and overestimation of rows
Дата
Msg-id 16390-e9866af103d5a03a@postgresql.org
обсуждение исходный текст
Ответы Re: BUG #16390: Regression between 12.2 and 11.6 on a recursive query : very slow and overestimation of rows  (Tom Lane <tgl@sss.pgh.pa.us>)
Список pgsql-bugs
The following bug has been logged on the website:

Bug reference:      16390
Logged by:          JC Boggio
Email address:      postgresql@thefreecat.org
PostgreSQL version: 12.2
Operating system:   Linux/Ubuntu
Description:

Hello,

Everything started on a big recursive query working fine with a filled
database (all tables contain at least a few rows). It executes in ~120ms.
But the same query, on a nearly empty DB takes ~10000ms on 12.2 and still
~100ms on 11.6 (same DB, vacuumed full and analyzed).

So I started digging and simplified the query to try to understand.

select version();
PostgreSQL 12.2 (Ubuntu 12.2-2.pgdg19.10+1) on x86_64-pc-linux-gnu, compiled
by gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008, 64-bit

CREATE DATABASE test;
\c test
create table a (id int);
create table b (id int);
create table c (id int);
insert into c values (1);
vacuum full analyze;

WITH RECURSIVE
tmpctr1 AS (
        SELECT 1 AS id
        FROM a
        JOIN b b1 ON a.id=b1.id
        JOIN b b2 ON a.id=b2.id  --(1)
    )
    ,rec1 AS (
        SELECT c.id, 1 AS level
        FROM c 
    
        UNION

        SELECT r.id, r.level+1
        FROM rec1 r
        LEFT JOIN tmpctr1 c ON r.id=c.id
        LEFT JOIN tmpctr1 c2 ON r.id=c2.id  --(2)
        WHERE r.level<20
        and (c.id is not null 
        or c2.id is not null  --(3)
        )
    )
select * from rec1;

Timing: 190ms

It is "only" 84ms on PG 11.6

                                                                QUERY PLAN
                                                               

---------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on rec1  (cost=5480127.49..8057598.11 rows=128873531 width=8)
(actual time=187.368..187.407 rows=1 loops=1)
   CTE tmpctr1
     ->  Merge Join  (cost=539.35..7270.45 rows=414528 width=4) (actual
time=0.007..0.007 rows=0 loops=1)
           Merge Cond: (b2.id = a.id)
           ->  Sort  (cost=179.78..186.16 rows=2550 width=4) (actual
time=0.006..0.006 rows=0 loops=1)
                 Sort Key: b2.id
                 Sort Method: quicksort  Memory: 25kB
                 ->  Seq Scan on b b2  (cost=0.00..35.50 rows=2550 width=4)
(actual time=0.005..0.005 rows=0 loops=1)
           ->  Materialize  (cost=359.57..941.28 rows=32512 width=8) (never
executed)
                 ->  Merge Join  (cost=359.57..860.00 rows=32512 width=8)
(never executed)
                       Merge Cond: (a.id = b1.id)
                       ->  Sort  (cost=179.78..186.16 rows=2550 width=4)
(never executed)
                             Sort Key: a.id
                             ->  Seq Scan on a  (cost=0.00..35.50 rows=2550
width=4) (never executed)
                       ->  Sort  (cost=179.78..186.16 rows=2550 width=4)
(never executed)
                             Sort Key: b1.id
                             ->  Seq Scan on b b1  (cost=0.00..35.50
rows=2550 width=4) (never executed)
   CTE rec1
     ->  Recursive Union  (cost=0.00..5472857.05 rows=128873531 width=8)
(actual time=187.365..187.403 rows=1 loops=1)
           ->  Seq Scan on c  (cost=0.00..1.01 rows=1 width=8) (actual
time=187.360..187.363 rows=1 loops=1)
           ->  Merge Left Join  (cost=62937.61..289538.54 rows=12887353
width=8) (actual time=0.035..0.035 rows=0 loops=1)
                 Merge Cond: (r.id = c2.id)
                 Filter: ((c_1.id IS NOT NULL) OR (c2.id IS NOT NULL))
                 Rows Removed by Filter: 1
                 ->  Sort  (cost=10299.29..10314.83 rows=6218 width=12)
(actual time=0.030..0.031 rows=1 loops=1)
                       Sort Key: r.id
                       Sort Method: quicksort  Memory: 25kB
                       ->  Hash Right Join  (cost=0.26..9907.48 rows=6218
width=12) (actual time=0.022..0.022 rows=1 loops=1)
                             Hash Cond: (c_1.id = r.id)
                             ->  CTE Scan on tmpctr1 c_1
(cost=0.00..8290.56 rows=414528 width=4) (actual time=0.007..0.007 rows=0
loops=1)
                             ->  Hash  (cost=0.22..0.22 rows=3 width=8)
(actual time=0.005..0.005 rows=1 loops=1)
                                   Buckets: 1024  Batches: 1  Memory Usage:
9kB
                                   ->  WorkTable Scan on rec1 r
(cost=0.00..0.22 rows=3 width=8) (actual time=0.003..0.003 rows=1 loops=1)
                                         Filter: (level < 20)
                 ->  Materialize  (cost=52638.32..54710.96 rows=414528
width=4) (actual time=0.002..0.002 rows=0 loops=1)
                       ->  Sort  (cost=52638.32..53674.64 rows=414528
width=4) (actual time=0.001..0.001 rows=0 loops=1)
                             Sort Key: c2.id
                             Sort Method: quicksort  Memory: 25kB
                             ->  CTE Scan on tmpctr1 c2  (cost=0.00..8290.56
rows=414528 width=4) (actual time=0.000..0.000 rows=0 loops=1)
 Planning Time: 0.574 ms
 JIT:
   Functions: 36
   Options: Inlining true, Optimization true, Expressions true, Deforming
true
   Timing: Generation 9.996 ms, Inlining 12.520 ms, Optimization 116.679 ms,
Emission 57.894 ms, Total 197.090 ms
 Execution Time: 198.803 ms
(45 lignes)

Temps : 200,336 ms


You can see that in the query I have marked lines (1) (2) and (3).

If I comment out line (1), exec time comes down to ~2ms and if I comment out
lines (2) and (3) it falls to 42ms.

I'm completely lost on this. Tables "a" and "b" are empty, so "tmpctr1" is
empty also. But the estimated rows, costs and execution time are crazy.
How can the Seq scan on "c" (1 row) take 187ms ? And how does the Seq scan
on "a" and "b" both evaluate to 2550 rows ?
And why commenting out line (1) makes such a difference ?

Thanks for your help.


В списке pgsql-bugs по дате отправления:

Предыдущее
От: Tomas Vondra
Дата:
Сообщение: Re: BUG #16389: OOM on CTE since version 12
Следующее
От: Eudald Valcàrcel Lacasa
Дата:
Сообщение: Re: Queries getting older values (autocommit enabled)