Re: Weird seqscan node plan

Поиск
Список
Период
Сортировка
От Игорь Выскорко
Тема Re: Weird seqscan node plan
Дата
Msg-id 60645091574826822@sas1-defdbd0adc93.qloud-c.yandex.net
обсуждение исходный текст
Ответ на Weird seqscan node plan  (Игорь Выскорко <vyskorko.igor@yandex.ru>)
Ответы Re: Weird seqscan node plan
Список pgsql-general

>
> Try increasing the following parameters to 14 (or even 16, if you are not sure about number of tables involved):
>
> geqo_threshold = 14
>
> from_collapse_limit = 14
>
> join_collapse_limit = 14
>
> “about 12” is too close to default limit, 12.
>
> Regards,
>
> Igor Neyman

Hi Igor,

Is "set geqo = off;" not enough to fully disable geqo? 
I know what is geqo and know about limit when it is in action. Moreover, I actually tried to set these parameters to
100and it doesn't help
 

> Hi Игорь
> 
> I suggest running Explain and Analyze to see what the actual query results vs the planner are , 
> 
> Post the SQL code   

Hi Justin,
let me show 2 variants of "explain analyze" which differs only by actual rows returned by inner node (9th row):

1st case:  https://explain.depesz.com/s/lA4f
45358 rows actually returned and postgres decided to join each row of 45358 set with each row in yankee_foxtrot using
seqscan:
 
  Seq Scan on yankee_foxtrot foxtrot_bravo (cost=0.000..267.670 rows=7,467 width=13) (actual time=0.003..1.090
rows=7,467loops=45,358)
 
 and then filter it:
  Rows Removed by Join Filter: 338685224
it was an awful choice =)

2st: case: https://explain.depesz.com/s/zkKY
4130 rows returned and now index only scan in action


Why planner mistakes in determining the number of rows (every time planner expects only 1 row) in this step I can
understand- inner nodes do some joins (inner and outer with filtration) and it's hard to predict result.
 
But what I can't understand is why seq scan when it is always slower than index. Forget to mention that join condition
isby unique key. So, when planner expects only 1 row then it must join only one row from second table!
 

> 
> Also pull this part of the query out and run it by itself to see if the "Planner" changes  how  it joins these two
tables. 
 
> 
> May need to increase the statistics collected 
> https://www.postgresql.org/docs/12/planner-stats.html
> 
> also read this 
> https://www.postgresql.org/docs/12/explicit-joins.html

In this step of planning statistics can't help (let me know if I'm wrong). Using stats, planner knows everything about
secondtable (yankee_foxtrot) but It can only suggest approximate stats of first set (and it actually wrongs about it)
 
And yes I know about the possibility of explicit joining

> forgot to state 
> 
> Generally, it's index scan -> bitmap index scan -> sequential scan, the more rows  expected to be retrieved to number
ofrows in the table the more likely the planner will go to sequential scan 
 

The key word is "expected" here I assume and according to expectation of planner it must be only 1 row...


So, to conclude: I know how to make my query faster (how to exclude this "seq scan") but what I really what to
understand:WHY seq scan is in my plan? Why planner thinks it's the best choice?
 



В списке pgsql-general по дате отправления:

Предыдущее
От: github kran
Дата:
Сообщение: Re: PostGreSQL Replication and question on maintenance
Следующее
От: Andrei Zhidenkov
Дата:
Сообщение: Re: Weird seqscan node plan