Re: costing of hash join
From:        Tom Lane
Subject:     Re: costing of hash join
Date:        
Msg-id:      27396.1388789455@sss.pgh.pa.us
In reply to: costing of hash join  (Jeff Janes <jeff.janes@gmail.com>)
List:        pgsql-hackers
Jeff Janes <jeff.janes@gmail.com> writes:
> I'm trying to figure out why hash joins seem to be systematically underused
> in my hands.  In the case I am immediately looking at, it prefers a merge
> join with both inputs getting seq scanned and sorted, despite the hash join
> being actually 2 to 3 times faster, where inputs and intermediate working
> sets are all in memory.  I normally wouldn't worry about a factor of 3
> error, but I see this a lot in many different situations.  The row
> estimates are very close to actual; the error is only in the cpu estimates.

Can you produce a test case for other people to look at?

What datatype(s) are the join keys?

> A hash join is charged cpu_tuple_cost for each inner tuple for inserting it
> into the hash table:

Doesn't seem like monkeying with that is going to account for a 3x error.

Have you tried using perf or oprofile or similar to see where the time is
actually, rather than theoretically, going?

			regards, tom lane
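[Editor's note: for readers unfamiliar with the charge Jeff refers to, the
sketch below illustrates the per-tuple CPU accounting as described in this
thread: each inner tuple pays a hash-function evaluation per hash clause plus
cpu_tuple_cost for insertion into the hash table, while outer tuples pay only
the hash-function evaluations. This is a standalone illustration with
placeholder names (sketch_hashjoin_cpu_cost, inner_rows, outer_rows,
num_hashclauses), not the actual costsize.c code.]

    #include <stdio.h>

    /* Default cost parameters (postgresql.conf defaults). */
    static const double cpu_tuple_cost = 0.01;
    static const double cpu_operator_cost = 0.0025;

    /*
     * Illustrative per-tuple CPU charges for a hash join, following the
     * description in the thread: inner tuples are charged hashing plus
     * cpu_tuple_cost for hash-table insertion; outer tuples are charged
     * only the hashing of their join keys.
     */
    static void
    sketch_hashjoin_cpu_cost(double inner_rows, double outer_rows,
                             int num_hashclauses,
                             double *startup_cost, double *run_cost)
    {
        *startup_cost += (cpu_operator_cost * num_hashclauses + cpu_tuple_cost)
            * inner_rows;
        *run_cost += cpu_operator_cost * num_hashclauses * outer_rows;
    }

    int
    main(void)
    {
        double startup = 0.0, run = 0.0;

        /* e.g. one million rows on each side, single-column join key */
        sketch_hashjoin_cpu_cost(1e6, 1e6, 1, &startup, &run);
        printf("startup=%.1f run=%.1f\n", startup, run);
        return 0;
    }

With the default parameters, the inner-side charge works out to roughly
12500 cost units per million tuples, which gives a feel for how small a
lever cpu_tuple_cost is relative to a 3x runtime discrepancy.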