Re: costing of hash join
From:        Tom Lane
Subject:     Re: costing of hash join
Date:        
Msg-id:      27396.1388789455@sss.pgh.pa.us
In reply to: costing of hash join  (Jeff Janes <jeff.janes@gmail.com>)
List:        pgsql-hackers
Jeff Janes <jeff.janes@gmail.com> writes:
> I'm trying to figure out why hash joins seem to be systematically underused
> in my hands.  In the case I am immediately looking at, it prefers a merge
> join with both inputs getting seq scanned and sorted, despite the hash join
> being actually 2 to 3 times faster, where inputs and intermediate working
> sets are all in memory.  I normally wouldn't worry about a factor of 3
> error, but I see this a lot in many different situations.  The row
> estimates are very close to actual; the error is only in the cpu estimates.

Can you produce a test case for other people to look at?

What datatype(s) are the join keys?

> A hash join is charged cpu_tuple_cost for each inner tuple for inserting it
> into the hash table:

Doesn't seem like monkeying with that is going to account for a 3x error.

Have you tried using perf or oprofile or similar to see where the time is
actually, rather than theoretically, going?

			regards, tom lane
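[Editor's note: for readers unfamiliar with the charge Jeff refers to, the
sketch below illustrates the per-tuple CPU accounting as described in this
thread: each inner tuple pays a hash-function evaluation per hash clause plus
cpu_tuple_cost for insertion into the hash table, while outer tuples pay only
the hash-function evaluations. This is a standalone illustration with
placeholder names (sketch_hashjoin_cpu_cost, inner_rows, outer_rows,
num_hashclauses), not the actual costsize.c code.]

    #include <stdio.h>

    /* Default cost parameters (postgresql.conf defaults). */
    static const double cpu_tuple_cost = 0.01;
    static const double cpu_operator_cost = 0.0025;

    /*
     * Illustrative per-tuple CPU charges for a hash join, following the
     * description in the thread: inner tuples are charged hashing plus
     * cpu_tuple_cost for hash-table insertion; outer tuples are charged
     * only the hashing of their join keys.
     */
    static void
    sketch_hashjoin_cpu_cost(double inner_rows, double outer_rows,
                             int num_hashclauses,
                             double *startup_cost, double *run_cost)
    {
        *startup_cost += (cpu_operator_cost * num_hashclauses + cpu_tuple_cost)
            * inner_rows;
        *run_cost += cpu_operator_cost * num_hashclauses * outer_rows;
    }

    int
    main(void)
    {
        double startup = 0.0, run = 0.0;

        /* e.g. one million rows on each side, single-column join key */
        sketch_hashjoin_cpu_cost(1e6, 1e6, 1, &startup, &run);
        printf("startup=%.1f run=%.1f\n", startup, run);
        return 0;
    }

With the default parameters, the inner-side charge works out to roughly
12500 cost units per million tuples, which gives a feel for how small a
lever cpu_tuple_cost is relative to a 3x runtime discrepancy.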