Optimizer's issue
From | Vlad Arkhipov |
---|---|
Subject | Optimizer's issue |
Date | |
Msg-id | 49E80A87.3090300@dc.baikal.ru |
List | pgsql-performance |
I have a problem with a part of a big query because of an incorrect row estimate. It is easy to reproduce the case:

create table a (id bigint, id2 bigint);
create table b (id bigint, id2 bigint);

insert into a (id, id2)
  select random() * 100000, random() * 100 from generate_series(1, 100000);
insert into b (id, id2)
  select id, case when random() < 0.1 then random() * 100 else id2 end from a;

alter table a alter column id set statistics 1000;
alter table a alter column id2 set statistics 1000;
alter table b alter column id set statistics 1000;
alter table b alter column id2 set statistics 1000;

analyze a;
analyze b;

explain analyze select * from a join b on b.id = a.id and b.id2 = a.id2;

"Hash Join  (cost=1161.00..3936.15 rows=1661 width=32) (actual time=424.865..1128.194 rows=91268 loops=1)"
"  Hash Cond: ((a.id = b.id) AND (a.id2 = b.id2))"
"  ->  Seq Scan on a  (cost=0.00..791.00 rows=100000 width=16) (actual time=0.013..197.908 rows=100000 loops=1)"
"  ->  Hash  (cost=791.00..791.00 rows=100000 width=16) (actual time=424.777..424.777 rows=100000 loops=1)"
"        ->  Seq Scan on b  (cost=0.00..791.00 rows=100000 width=16) (actual time=0.010..197.536 rows=100000 loops=1)"
"Total runtime: 1305.121 ms"
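The estimate comes out low because the planner treats the two join clauses as independent and multiplies their selectivities, while in this test data b.id2 is copied from a.id2 for roughly 90% of the rows, so the conditions are strongly correlated. A minimal sketch of how to inspect the per-column statistics the planner works from is to query pg_stats after ANALYZE (the exact n_distinct values will differ from run to run; this only shows per-column numbers, since there is no cross-column statistic covering (id, id2)):

-- Per-column statistics used for join selectivity.
-- n_distinct is the estimated number of distinct values; a negative
-- value means a fraction of the total row count.
select tablename, attname, n_distinct
from pg_stats
where tablename in ('a', 'b')
  and attname in ('id', 'id2');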