Re: BUG #6668: hashjoin cost problem
От | Tom Lane |
---|---|
Тема | Re: BUG #6668: hashjoin cost problem |
Дата | |
Msg-id | 23375.1338504481@sss.pgh.pa.us обсуждение исходный текст |
Ответ на | Re: BUG #6668: hashjoin cost problem (Postgres User <postgresuser@yahoo.com>) |
Список | pgsql-bugs |
Postgres User <postgresuser@yahoo.com> writes: > Why is cost_hashjoin estimating 50 billion tuple comparisons for 10K rows of output though? Well, if it hashes the smaller table, there's 100 million rows on the outside, and each of them will probe one hash chain in the hash table. If you're unlucky, each of those probes will hit a populated hash chain with at least 1000 entries, leading to 100 billion comparisons. I think it might derate that worst-case by a factor of 2. Now if you're lucky, a lot of the outer tuples hit unpopulated hash chains and so the number of comparisons is a lot less --- but in non-artificial examples, that's not a very good bet to make. The conservative assumption is that both sides of the join have similar key distributions, so that the more populated hash chains are also more likely to be probed. The cost estimate is therefore designed to discriminate against using an inner relation with a non-flat distribution. regards, tom lane
В списке pgsql-bugs по дате отправления: