Re: BUG #6668: hashjoin cost problem

Поиск

Список

Период

Сортировка

От	Tom Lane
Тема	Re: BUG #6668: hashjoin cost problem
Дата	31 мая 2012 г. 19:48:18
Msg-id	23375.1338504481@sss.pgh.pa.us обсуждение исходный текст
Ответ на	Re: BUG #6668: hashjoin cost problem (Postgres User <postgresuser@yahoo.com>)
Список	pgsql-bugs

Дерево обсуждения

Postgres User <postgresuser@yahoo.com> writes:
> Why is cost_hashjoin estimating 50 billion tuple comparisons for 10K rows of output though?

Well, if it hashes the smaller table, there's 100 million rows on the
outside, and each of them will probe one hash chain in the hash table.
If you're unlucky, each of those probes will hit a populated hash chain
with at least 1000 entries, leading to 100 billion comparisons.  I think
it might derate that worst-case by a factor of 2.  Now if you're lucky,
a lot of the outer tuples hit unpopulated hash chains and so the number
of comparisons is a lot less --- but in non-artificial examples,
that's not a very good bet to make.  The conservative assumption is that
both sides of the join have similar key distributions, so that the more
populated hash chains are also more likely to be probed.  The cost
estimate is therefore designed to discriminate against using an inner
relation with a non-flat distribution.

            regards, tom lane

В списке pgsql-bugs по дате отправления:

Вход в личный кабинет

Восстановление пароля

Подтверждение аккаунта

Изменение пароля

Re: BUG #6668: hashjoin cost problem